OASIS Topology and Orchestration Specification for Cloud Applications (TOSCA) TC

  • 1.  RE: [tosca] Proposal for requirement "occurrences"

    Posted 02-03-2022 21:33




    Let's put grammar issues aside for a minute, and try to get agreement on the core concept here. I don't think there is any question that requirements can be fulfilled at design time (Day 0). It sounds like we also agree that it is perfectly
    valid in TOSCA to leave requirements unfulfilled with the goal of having those requirements fulfilled at deployment time (Day 1). Assuming we all agree, then we can turn to a discussion of the best syntax for Day 1 requirement fulfillment.
     
    Chris
     
     
     

    From: Tal Liron <tliron@redhat.com>
    Sent: Thursday, February 3, 2022 11:06 AM
    To: Chris Lauwers <lauwers@ubicity.com>
    Cc: tosca@lists.oasis-open.org
    Subject: Re: [tosca] Proposal for requirement "occurrences"

     



    On Thu, Feb 3, 2022 at 11:46 AM Chris Lauwers < lauwers@ubicity.com > wrote:






    OK, perhaps we're getting to the core of the confusion/disagreement. Requirement fulfillment is not at all a question of WHEN. The issue of WHEN is extremely straightforward. It is either done on Day 0 (Design
    Time) or on Day 1 (Deployment Time):






     


    Yes, but here you are exactly saying that there are two moments in which it can happen. So there definitely is a WHEN question. Our disagreement is in that you don't find the answer to WHEN to be problematic (you say it's "extremely straightforward".)
    I say it has very profound and negative implications:


     


    1) A runtime requirement means, by definition, that we do not have a complete topology at design time. That's what I meant when I said "unnecessarily sacrificing Day 0 for Day 1". Again, the cost is tremendous, because Day 0 just happens
    to be one of TOSCA's strong points. So whatever we end up deciding I will continue to insist that this is an anti-pattern and should not be used by anyone who cares about design.


     


    Is it the end of the world? No. There are a lot of awful features in TOSCA that I strongly advise people not to use. For example: workflows. Also notifications, as they stand in 1.3, are quite pointless and I would suggest better ways of
    handling events within the existing grammar. But I'll continue to insist that we need to make runtime requirements (assuming we cannot agree to get rid of them and focus on 2.9.2) 100% explicit with a keyword and not magically postpone a requirement fulfillment
    to runtime without the designer ever knowing that it even happened. That to me is the true disaster, as it entirely breaks TOSCA for designers. A graph can be broken without the designer even knowing.


     


    2) We could very well be dealing with entirely different systems handling requirement fulfillment in these two times. This is indeed my most common use case: I use existing orchestrators which don't have a "runtime requirements" feature.
    In fact, I would say most of them don't have anything like this. (A notable exception is Canonical's Charm ecosystem.) Of course we can always add an extra layer in front of an orchestrator that could create that kind of feature for us. But it's a huge complication,
    and again not worth it for a feature that is ultimately unnecessary. Meanwhile, 2.9.2 is a feature that almost every orchestrator can handle: provision a new resource vs. use an existing one. The devil is in the details of where this "existing one" comes from,
    but that's where every orchestrator and platform is vastly different. The kinds of scopes available in different platforms, and the relationships between these scopes, are diverse. Sometimes there are inventories or multiple inventories, sometimes there are
    pools (that need to be managed), and often there are complex policies regarding all of those and rules on how to decide which scope to use. And again, that's why a "global" scope seems wrong to me. The scope itself might be a runtime decision by the orchestrator.
    It's just not something that a single keyword can help us with.

     

    The example you provided of properties based on input values is useful here.


     

    You are right that the value of get_input is used in Day 1 deployment, but it's not a "runtime" feature in that it is well defined as, well, an explicit input. It absolutely can be used in Day 0, during validation. It is not at all
    like attributes that must be retrieved from the platform. And it's definitely not like an inventory of existing resources. "get_input" relates to our discussion about
    design variability. Essentially it means that this specific topology template represents a set of possible topologies.

     


    This introduces a challenge to design validation. Ideally we would want to test that the entire set is valid, but this challenge is not always easily met. In some cases it might be clear what the variants are and someone could test by just
    using different inputs. But it might not always be possible if, say, you have many different inputs that interact in complex matrix ways. The set of variations might be enormous.
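
    To make the size of that set concrete, here is a minimal Python sketch that enumerates every combination of input values. The input names and their allowed values are hypothetical and exist only for illustration; a real template would declare its own inputs and constraints.

```python
from itertools import product

# Hypothetical input domains; these names and values are assumptions,
# not taken from any real TOSCA template.
input_domains = {
    "size": ["small", "large"],
    "region": ["east", "west", "north"],
    "tier": ["dev", "prod"],
}


def enumerate_variants(domains):
    """Yield one dict of input values per combination (the full input matrix)."""
    names = sorted(domains)
    for values in product(*(domains[name] for name in names)):
        yield dict(zip(names, values))


variants = list(enumerate_variants(input_domains))
print(len(variants))  # 2 * 3 * 2 = 12 variants to validate
```

    Each additional input multiplies the count by the number of values it allows, which is why exhaustive Day 0 testing can become impractical.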


     


    So, to me, none of this has anything to do with "runtime: true". I would still insist that the validity of all these variants is a Day 0 issue that conforms to 2.9.2.


     


    Actually, your example is odd to me. You want node template "a" to require a "large" node template, so why not just provide it in the design? Even you agreed that 2.9.2 is semantically equivalent to "dangling". Isn't it obviously the better
    pattern here? Like so:


     

      a:
        type: A
        requirements:
        - dependency:
            node_filter:
              properties:
              - size: { equal: large }

      b:
        type: B
        directives: [ select ]
        properties:
          size: large


     


    Nothing is "dangling". Design is valid. We exactly fulfilled what "a" needs.


     


    Node template "b" should not be variable because "a" here has a clear requirement, so the topology would always look like this. It's a singular topology, not a set. The only "change" introduced is what runtime node is selected to actually
    implement the "b" template, but it's still within this design. The point is that "b" will always be "large", adhering to the property. It's a single clear design.


     


    How then could variability work here? Well, I'd have to change the intent of your example, but let me try to stay as close to it as possible:


     


      a:
        type: A
        requirements:
        - dependency:
            node_filter:
              properties:
              - size: { equal: { get_input: size_input } }

      b:
        type: B
        directives: [ select ]
        properties:
          size: small

      c:
        type: B
        directives: [ select ]
        properties:
          size: large


     


    Clearly we now have two possible topologies depending on the input. "a" will connect to either "b" or "c". Both variations are valid and also easy to validate. We just need to provide the two possible input values and see that indeed in
    both cases there is no error and both designs are valid. And again, nothing is "dangling".
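
    As a rough illustration of that check, the Python sketch below models the three node templates as plain dicts and resolves the "dependency" filter of "a" for each input value. It is not how any particular TOSCA processor works: the function names are hypothetical, the node_filter is reduced to a single equality test on "size", and the rule that exactly one target must match is an assumption made for this example.

```python
# Node templates from the example above, reduced to plain dicts.
node_templates = {
    "a": {"type": "A"},
    "b": {"type": "B", "directives": ["select"], "properties": {"size": "small"}},
    "c": {"type": "B", "directives": ["select"], "properties": {"size": "large"}},
}


def resolve_dependency(templates, source, size_input):
    """Return names of templates (other than source) whose size equals the input."""
    return [
        name
        for name, template in templates.items()
        if name != source
        and template.get("properties", {}).get("size") == size_input
    ]


def validate(templates, size_input):
    """Assumed rule: a variant is valid if exactly one template matches the filter."""
    targets = resolve_dependency(templates, "a", size_input)
    if len(targets) != 1:
        raise ValueError(f"size_input={size_input!r} matched {targets}")
    return targets[0]


# Validate both variants of the design by supplying each possible input value.
results = {value: validate(node_templates, value) for value in ("small", "large")}
print(results)  # {'small': 'b', 'large': 'c'}
```

    An input value that matches no template, such as "medium", raises an error here, so the broken variant is caught at design time instead of being left dangling.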











  • 2.  Re: [tosca] Proposal for requirement "occurrences"

    Posted 02-03-2022 22:52
    On Thu, Feb 3, 2022 at 3:33 PM Chris Lauwers < lauwers@ubicity.com > wrote:

        Let's put grammar issues aside for a minute, and try to get agreement on the core concept here. I don't think there is any question that requirements can be fulfilled at design time (Day 0). It sounds like we also agree that it is perfectly valid in TOSCA to leave requirements unfulfilled with the goal of having those requirements fulfilled at deployment time (Day 1). Assuming we all agree, then we can turn to a discussion of the best syntax for Day 1 requirement fulfillment.

    That's close but not quite good enough for me. :) I would say this: I'm OK with runtime requirements only if the syntax is explicit and the grammatical and orchestration implications are fully understood. I wouldn't call it "perfectly valid". It's an anti-pattern that I will advise against using. I'm only open to compromising here in order to move us forward. But it's also essential that before we do so we accept the example at 2.9.2 as definitive, which means we would also have to spec it out properly.

    I do agree with you that TOSCA has always hinted at runtime requirements. Years ago, when I made my TOSCA tutorial (for 1.0), it was there in the spec as suggestive little wordings, but I could not see how it could be implemented coherently. For example, in TOSCA 1.0 this is how node_filter for node templates was described:

        "The optional filter definition that TOSCA orchestrators would use to select the correct target node. This keyname is only valid if the directive has the value of selectable set."

    Note that there is no mention of "at runtime" here, and example 2.9.2 is the only place where it is used. And what does "target node" mean here? Target of what? This implies that it relates somehow to the requirements of this node template. Anyway, there is so little here to build upon.

    As for requirement node_filter, it's even less clear:

        "The optional filter definition that TOSCA orchestrators or providers would use to select a type-compatible target node that can fulfill the associated abstract requirement at runtime."

    And that little hint, "at runtime", was pretty much all the explanation we got. But this hint created far more questions than it answered. Nowhere does it say here that this "type-compatible target node" could be somewhere else, other than in the topology template. The word "dangling" does not appear. Could "at runtime" mean at the Day 1 moment of deployment, in reference to variability due to get_input, as we saw in my previous email? The target word used here is "node" rather than "node template", which is ... what exactly? We had no concept then of node representations. Nowhere does it say here that this "node" might live in an external inventory or that a node representation can be created on-the-fly for an existing (pooled?) real-world node instance.

    You've found answers to these questions that made sense to you and have built them into a complete feature. I'm uncomfortable with many of the assumptions behind these answers. From my perspective, 2.9.2 solved the problem of runtime selection completely and this "at runtime" hint was a confusing distraction (there are many such confusing distractions in the spec).

    Actually, I've never even needed to use a "select" directive. (In TOSCA 1.0, section 2.9.2 does not use that directive; it was added in later versions of the spec.) The matter of whether a node template is something to be "created" (provisioned, removed from a pool, instantiated, etc.) or "selected" by the orchestrator is just ... not that interesting for TOSCA. It's a decision that's ultimately made by an orchestrator that takes into account many complicated factors having to do with its inventories, platform specifics, optimization algorithms, etc.

    At best a designer can provide policies (or metadata hints) that can direct or constrain the orchestrator's eventual decision. An example of such a policy is: this server cannot be shared with other services. That means that the orchestrator can either provision a new VM or use an unused one from a pool, but it absolutely cannot "select" one if it's already being used by another service, even if it matches the requirements.

    We've come a long way since TOSCA 1.0. We now have a clear and agreed-upon "mental model" for a TOSCA processor. With that, I think, it's time to expand the model in order to consider what "runtime" could really mean for TOSCA. I would start with these questions:

    1. Can a service access node representations created by other services? (This is a question for 2.9.2, not just for "dangling" requirements.)
    2. What would be the rules for accessing them? (Security considerations.)
    3. Do node representations only originate in TOSCA, or can an orchestrator create node representations from pre-existing runtime instances? When and how should that happen? How would this relate to the access rules of #2?

    Again, the best starting point for me is 2.9.2.