OASIS Topology and Orchestration Specification for Cloud Applications (TOSCA) TC

  • 1.  Re: Differentiating TOSCA from HEAT

    Posted 01-04-2022 18:09
    On Tue, Jan 4, 2022 at 8:33 AM Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com> wrote:

    In the previous meeting you mentioned some bad experience with HEAT and how we want to make sure TOSCA is perceived as superior.

    It's a useful comparison, but importantly TOSCA is not so clearly superior in every aspect. The fully-declarative approach has linguistic disadvantages.

    By the way, "Heat" is the orchestration module and "HOT" is the language, though this distinction isn't always very meaningful. It becomes meaningful in the Tacker project, which intends to be a TOSCA frontend for Heat (behind the scenes it translates to HOT, as far as I remember). Tacker's "problem", as I see it, is not in its use of TOSCA but in its use of the Simple Profile. The Simple Profile means that Tacker can do much less than HOT. But there's no reason why Tacker can't model all OpenStack resources in TOSCA. I have an example of this. Also note that Puccini can compile HOT into Clout, though obviously HOT is missing many features compared to TOSCA.

    1. HEAT has no custom nodetypes and no inheritance. HEAT only supports a closed set of built-in OpenStack service types (Nova, Neutron, etc.).

    Well, you can, it's just not very easy. But it is powerful. TOSCA is designed to be extensible by using TOSCA itself (that's sometimes how I explain TOSCA to people: the same language is used for both the resources and their schema). For HOT you need to use Python, but once you are in Python you can perform all kinds of validations that are impossible to do in TOSCA.

    2. HEAT has nothing like requirements-capabilities or other abstraction mechanisms.

    This is essentially true. TOSCA capabilities allow for a kind of composition (I often call a TOSCA node an "assemblage" of capabilities). In HOT these sub-components are all separate resources. The assemblage is often implied by using the get_resource function.

    3. HEAT has no concept of artifacts since all functionality is hard-coded in OpenStack.

    Well, artifacts are not their own types (and there is no packaging format, like CSAR), but they are supported in a few ways: 1. The get_file function lets you insert artifacts anywhere in HOT. This is actually more flexible, in some ways, than TOSCA's get_artifact function. 2. Some OpenStack components support artifacts, and these are specifically modeled as resource types. Examples: Glance WebImage, Barbican Secret, Heat CloudConfig, etc. (By the way, HOT has more intrinsic functions than TOSCA, so it "wins" in this respect.)

    4. HEAT has no concept of a service graph. There are of course connections between network ports and server interfaces, but these are not formalized as a real relationship-based graph structure.

    I wouldn't say that it has no such concept, just that it has a very weak one. There is only one kind of relationship in HOT and that's "dependency", which is implicitly created via calls to functions such as get_resource and get_attr. It's possible to extract this graph from the HOT parser.

    I find "dependency" relationships to be the least interesting. They are important for one and only one purpose: to automatically create a "deploy" workflow for non-declarative platforms. TOSCA's power is in introducing typed relationships, where each type implies its own semantic graph. One obvious use is networking, but we see graphs everywhere in the cloud: clustering, redundancy hierarchies, data streaming, service meshes, etc.

    5. HEAT assumes everything is managed by OpenStack; the HEAT model does not extend to containers and other management systems such as K8s.

    Well, this is true in the sense that all supported modules are part of the OpenStack project. But the point is that OpenStack is modular, and every OpenStack module represents its own management system. For example, Neutron is very different from Nova. And then you have something like Magnum for containers, which is again different from Nova. And it's definitely possible to create a custom module, which supports HOT but is not part of the OpenStack family of modules.

    6. HEAT has no concept of events and closed-loop management of services.

    Yes, but I would argue that TOSCA doesn't, either. At least not yet. :) You can probably manage event streams as relationship types. But doing so is in no way integrated with TOSCA operations/notifications.

    7. The lifecycle of HEAT objects is a primitive exists/not-exists model; this cannot be extended.

    So, if you recall from above, the dependency graph is used to generate a deployment workflow, which is not entirely unsophisticated LCM. There is, at least, an idea of when in time the object exists or does not exist. But also you can plug in your own custom Mistral Workflow. Without saying anything bad about Mistral, I'll mention that most users probably use an external workflow engine to do their LCM work, something like Ansible.

    8. HEAT has poor error handling. If a request fails, there is no way to clean up or retry a half-created service. (Hmm. How do we do that with TOSCA?)

    Yes, Heat is also slow and painful and does not scale well. But that's not the fault of the HOT language. I mostly blame the traditional phase-by-phase paradigm baked into OpenStack for this. I think it's important for TOSCA to support this paradigm because many systems look like this, but I can't see how TOSCA can fix the problem of "half-created services" (indeterminate states). It's not a language challenge but an architectural challenge. The solution I propose is simply "don't do this", or rather "don't assume that you can succeed in doing LCM in the cloud". In fully-declarative approaches there is no concept of order and time; instead you take for granted that state is emergent (and to an extent ephemeral). TOSCA doesn't have to do anything special to support this paradigm because, again, it's a matter of "don't do this". If you're not modeling any kind of workflow then the feature is just not there. So TOSCA lends itself quite easily to a fully-declarative approach. Then again, so does HOT. The problem is not HOT the language but Heat the orchestration module.

    Please add more

    Hm, here's what I can think of as the main advantages of TOSCA:

    1. TOSCA is a standard with wide industry support. HOT isn't. The standard also includes a packaging format (CSAR).

    2. The fact that TOSCA is models + schema in one language is a major advantage over HOT. Indeed TOSCA's power of validation is way, way beyond anything HOT can do. This extends to data types, which are fairly rich in TOSCA (but can be richer -- see OpenAPIv3). TOSCA's new-ish support for profiles in 2.0 makes all of this even more powerful, as it allows TOSCA to be used for all kinds of orchestration domains. Indeed, a single service can be described at various levels in just one topology template that imports multiple profiles.

    3. TOSCA allows for custom integrations with subsystems via interfaces/operations/notifications.

    4. TOSCA has many implementations rather than just one (well, I guess you can use HOT with Puccini...). Each implementation has different approaches to orchestration and integration. So, TOSCA is just more broadly useful than HOT. HOT for the most part limits you to an OpenStack-based environment (though again you can extend it for your own custom subsystems). The loose specification meant that for a while TOSCA implementations diverged quite a lot, but with TOSCA 2.0 we hope to run a tighter ship and increase interoperability.
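    To make point 1 concrete -- the same language defining both the schema and the resources -- here is a minimal sketch in TOSCA Simple Profile YAML (the example.nodes.WebServer type and the template names are hypothetical, invented for illustration):

    ```yaml
    tosca_definitions_version: tosca_simple_yaml_1_3

    # Schema: a custom node type, written in the same language as the
    # templates that will use it (hypothetical names).
    node_types:
      example.nodes.WebServer:
        derived_from: tosca.nodes.SoftwareComponent
        properties:
          port:
            type: integer
            constraints:
              - in_range: [ 1, 65535 ]

    # Resources: a template instantiating the type above; a TOSCA parser
    # can validate 'port' against the constraint declared in the type.
    topology_template:
      node_templates:
        my_server:
          type: example.nodes.WebServer
          properties:
            port: 8080
    ```

    In HOT the closest equivalent would require writing a custom resource plugin in Python; there is no way to declare a new resource type, with validation, in HOT itself.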


  • 2.  RE: Differentiating TOSCA from HEAT

    Posted 01-05-2022 07:59




    Thanks, Tal
     
    In our previous meeting, you mentioned that you are aware of bad experience and failures using HEAT, and for this discussion I don't think it matters whether those
    failures have been due to the HEAT orchestrator, the HOT language for configuring services with HEAT, or the OpenStack architecture as a whole. That said, you are right that some of my concerns are with the expressiveness of HOT.
     
    I was hoping that you could add, for our positioning of TOSCA, some more concrete details about the mentioned bad experience. What exactly were the nature and reasons
    for those failures? Your 4 bullets are too generic, I think. You mention that HEAT is slow and does not scale, and you ascribe that to the OpenStack architecture, and not so much to the HOT language. Is that, in your opinion, the primary reason for the shortcomings
    of HEAT? If so, to ensure the success of TOSCA we would need to give some attention to scalability in our discussions.
     
    Concerning your views on declarative orchestration: clearly, if a single underlying management system and the components it orchestrates are all fully declarative
    and insensitive to sequencing, then indeed the orchestrator itself does not need to be concerned with sequencing. But at the lowest level, technology is inherently sensitive to sequencing.
     
    Example: installing a VM running a database application. If the management
    system allows you to specify this declaratively, including the required database configuration, then the orchestrator does not need to be concerned with the sequencing. If another VM needs to run an application that uses the database, and the two VMs are created
    and started in arbitrary order, then either that application needs to be insensitive to situations where the database is not yet ready, or the declarative management system must be aware of the dependency.
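    One way a declarative management system could be made aware of such a dependency is via a typed TOSCA relationship; a minimal sketch in TOSCA Simple Profile YAML (node names are hypothetical, invented for illustration):

    ```yaml
    topology_template:
      node_templates:
        database_vm:
          type: tosca.nodes.Compute   # hosts the database

        app_vm:
          type: tosca.nodes.Compute   # hosts the dependent application
          requirements:
            # The typed relationship makes the temporal dependency explicit:
            # an orchestrator may sequence app_vm after database_vm, or a
            # cloud-native application may ignore the ordering and simply
            # retry until the database becomes reachable.
            - dependency:
                node: database_vm
                relationship: tosca.relationships.DependsOn
    ```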
     
    The point is that the temporal dependencies do not go away by themselves. The prerequisite is careful design of applications and management systems to fit into
    such a paradigm, and eventually, in some cases, we are basically just pushing the sequencing problem down to lower-level orchestrators/management-systems. And if the service topology happens to span more than one management system, then not only must each system
    be declarative within itself, but all the systems must be designed to interwork according to the centrifuge model to handle any required sequencing between them.
     
    This is a beautiful vision, but as you also say, we are not there, and so TOSCA will need to be able to support any sequencing requirements that are not yet within
    the capabilities of the systems being orchestrated.
     
    Frankly, just as you doubt the future of the state-sequencing model, I doubt the scalability and practicality of the centrifuge vision. But we don't need to settle
    that: I definitely acknowledge the relevance of the centrifuge model for some scenarios, and you recognize that state-sequencing will still be required for TOSCA for some time. So no need to argue about that.
     
    Best regards
    Peter
     
    From: Tal Liron [mailto:tliron@redhat.com]

    Sent: 4. januar 2022 19:09
    To: Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com>
    Cc: Chris Lauwers <lauwers@ubicity.com>; tosca@lists.oasis-open.org
    Subject: Re: Differentiating TOSCA from HEAT
     





  • 3.  Re: Differentiating TOSCA from HEAT

    Posted 01-06-2022 00:50
    On Wed, Jan 5, 2022 at 1:59 AM Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com> wrote:

    I was hoping that you could add for our positioning of TOSCA some more concrete details about the mentioned bad experience? What exactly was the nature and reasons for those failures? Your 4 bullets are too generic, I think. You mention that HEAT is slow and does not scale, and you ascribe that to the OpenStack architecture, and not so much to the HOT language. Is that, in your opinion, the primary reason for the shortcoming of HEAT? If so, to ensure the success of TOSCA we would need to give some attention to scalability in our discussions.

    There is nothing particularly wrong with HOT the language. It shares some of the same DNA as TOSCA and does a lot of the same things. The same is true for Cloudify DSL. The reason Puccini can parse all three of these languages is due to their core similarity. TOSCA is better than the others for the most part, but that's more about specific grammatical features than some essential qualitative difference. You can think of HOT as a subset of TOSCA.

    There are various reasons why Heat isn't great. The relevant one for our group is: it creates its workflows automatically for you, but there is almost no visibility into them, and definitely no hackability. That makes debugging very painful. It's an extremely anti-devops approach: we'll do the work, you stay away. Above, I linked to a Puccini example where I use TOSCA + an OpenStack profile to generate an Ansible playbook for deployment. The advantage, in my opinion, is that you get an actual debuggable and extensible playbook. There's no real lesson here for TOSCA specifically, but I do think Heat can be a cautionary tale for those of us wanting to implement automatic workflows in an orchestrator.

    Concerning your views on declarative orchestration, clearly if a single underlying management system, and the components it orchestrates are all fully declarative and insensitive to sequencing, then indeed, the orchestrator itself does not need to be concerned with sequencing. But at the lowest level, technology is inherently sensitive to sequencing.

    Absolutely. I just believe it should be solved locally, with specificity for that resource's unique lifecycle challenges, and then locked away as a black box (but with access to the source code, so that devops can fix production bugs). Indeed, the responsibility for implementing this functionality should best be with the component's vendor. They know it best. It's basically the operator pattern: the orchestration work should be a managerial component living in the same space as the resource it's orchestrating. Sometimes I call it "side by side" orchestration.

    It's absurd to me that devops teams at various companies again and again try to solve the same LCM issues for whatever orchestrator they are using. Invariably there are bugs and scalability challenges. Orchestrators should not be doing generic phase-by-phase LCM, especially if they are not running in-cluster. It's a losing battle.

    Example: Installing a VM running a database application. If the management system allows you to specify this declaratively, including the required database configuration, then the orchestrator does not need to be concerned with the sequencing. If another VM needs to run an application that uses the database, and the two VMs are created and started in arbitrary order, then either that application needs to be insensitive to situations where the database is not yet ready or the declarative management system must be aware of the dependency.

    I strongly recommend that the application be able to stand up even if the database is not ready. This is the cornerstone of living in the cloud: it's an ephemeral environment where dependent services may come and go or just change suddenly. An orchestrator's work here is, of course, not to create the database connection. But it can assist in discovery (IP address injection?) and otherwise notifying, collecting notifications, and reconciling issues.

    The point is that the temporal dependencies do not go away by themselves. The prerequisite is careful design of applications and management systems to fit into such a paradigm, and eventually in some cases, we are basically just pushing the sequencing problem down to lower level orchestrators/management-systems, and if the service topology happens to span more than one management system, then not only must each system be declarative within itself, but all the systems must be designed to interwork according to the centrifuge model to handle any required sequencing between them.

    Welcome to the cloud-native world. :) It's best for your components to be designed to run in clouds, but there are also a lot of options for you if they don't. The operator pattern can allow you to create a cloud-native frontend for a component that doesn't play the game well. There are good examples of this in the world of databases. Most of the popular and mature databases we use have not been designed for the cloud. But operators can allow for LCM of db clusters in cloud environments, managing all the difficult aspects of geographical redundancy, auto-scaling, failovers, load-balancing, backups, etc. If this operator is of good quality you end up being able to treat the db cluster declaratively and not worry about low-level sequences. And then all an orchestrator needs to do is work with those declarations. (Again, that's why I prefer to call it a "coordinator".)
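    With such an operator in place, the orchestrator's side of the contract could be a purely declarative description of desired state; a hypothetical sketch in TOSCA-like YAML (the node type and all of its properties are invented for illustration -- this is not an existing database operator's API):

    ```yaml
    topology_template:
      node_templates:
        customer_db:
          # Hypothetical node type fronting a database operator: the
          # orchestrator only declares desired state; the operator handles
          # failovers, scaling, backups, and all other sequencing internally.
          type: example.nodes.ManagedDatabaseCluster
          properties:
            engine: postgresql
            replicas: 3
            geo_redundant: true
            backup_schedule: daily
    ```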
This is a beautiful vision, but as you also say, we are not there, and so TOSCA will need to be able to support any sequencing requirements that are not yet within the capabilities of the systems being orchestrated.

    I agree. But I think TOSCA is already there: 1) By using typed relationships you can derive various kinds of dependency graphs. There can be a graph for installation dependencies, a graph for networking configuration, etc. From these topological graphs a sequenced workflow graph (DAG) can be derived for your workflow engine of choice. (Again, I hope you learn from Heat what not to do, and that you fully expose that DAG to users.) 2) Do you want users to be able to design their own DAGs? TOSCA is well suited for it. A "task" can be a node and these nodes can be connected via typed relationships. I'm working on a TOSCA profile for Argo Workflows that does exactly that. I dislike the workflow grammar in TOSCA 1.3, mostly because it's superfluous. We really don't need two different grammars for creating graphs.
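    The "task as a node" idea in point 2 might look roughly like this in TOSCA-like YAML (the type names, properties, and requirement names are hypothetical -- this is not the actual Argo Workflows profile mentioned above):

    ```yaml
    topology_template:
      node_templates:
        build:
          type: example.workflow.Task        # hypothetical task node type
          properties:
            command: make build

        test:
          type: example.workflow.Task
          properties:
            command: make test
          requirements:
            # A typed relationship expresses one edge of the workflow DAG;
            # a processor can topologically sort these edges into a
            # sequenced plan for any workflow engine.
            - runs_after:
                node: build
                relationship: example.workflow.RunsAfter
    ```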


  • 4.  RE: Differentiating TOSCA from HEAT

    Posted 01-06-2022 07:44




    Hi Tal,
     
    This makes complete sense; we are aligned on understanding now.

     
    There is nothing particularly wrong with HOT the language.
     
    Agree. In fact our orchestrator has a language, DSD, that was originally derived from HOT, so you will recognize some of the syntax, but I fixed all the issues
    with the HOT language that I mentioned below, and quite a few more.
     
    You can think of HOT as a subset of TOSCA.
     
    The way I extended HOT makes TOSCA a subset of the DSD language, but DSD is a lot easier to understand and use.
     
    Basically, we made the language define a compositional graph-grammar instead of just a graph. Also, we went a different direction than TOSCA by making the DSD
    language Turing-complete while also being completely declarative. I know that we deliberately do not want that with TOSCA. But the graph-grammar approach does solve
    all the day-2 expressiveness issues that some uses of TOSCA are facing.
     
    The relevant one for our group is: it creates its workflows automatically for you but there is almost no visibility into them, and definitely no hackability.

    I hope you learn from Heat what not to do, and that you fully expose that DAG to users
     
    Yes. There are two things here.
     
    I fully agree that the visibility into the playbooks is critical. Version 1 of our orchestrator, many years ago, did not have that, and we quickly realized that
    this would not work. So we give full visibility of the playbook graph, including direct insight into why it looks the way it does.
     
    As for hackability, that is nice for small-scale DevOps-type orchestration,
    but when you look at thousands of transactions per minute on behalf of end-customers, you don't
    want users to hack each playbook individually. The paradigms you want for orchestration depend heavily on that kind of concern.
     
    There is always a tradeoff between flexibility and constraint. Too many constraints on what users can do will definitely lead to frustration; I get that a lot
    with systems like Maven, which are really helpful if what you want to do is within their paradigm, and really frustrating if it is not. But too much flexibility also leads to frustration, because it leaves the user without guidance as to which combinations
    of functions and features are meaningful for some intent, and which combinations are not. So flexibility is not always the same as "good".
     
    Basically, your domain for orchestration is DevOps, as you say. Our orchestration domain is quite different from DevOps, and that makes a huge difference in the
    requirements. So nothing is right or wrong; these are just two different worlds.
     
    I just believe it should be solved locally, with specificity for that resource's unique lifecycle challenges, and then locked away as a black box
     
    I completely agree: always solve issues as close to their source as possible. I just work in a domain where that is not feasible today, because most of
    the systems we are orchestrating are nowhere near becoming cloud-native. So in essence our orchestrator can be, when there is nothing else doing it, the system that turns such entities declarative.
     
    Welcome to the cloud-native world. :) It's best for your components to be designed to run in clouds
     
    Cloud-native is for compute nodes running applications with connectivity. We are not in that domain. There are lots of nodes that do not represent
    components in clouds.
     
    Invariably there are bugs and scalability challenges
     
    That is not my experience. Our orchestrator is fully cloud-native and scalable, including full support for zero-downtime rolling upgrades of the cluster. We run
    thousands of service requests per minute and millions of events per second, and have service graphs consisting of hundreds of thousands of nodes. I can tell you from experience that it
    is possible and can be made to scale. I am not saying it is easy. :)
     
    By the way, my middle name should be Dag; I dream in DAGs. :)
     
    As for bugs, automatically generating workflows is a lot less error-prone than hacking up workflows manually. I know of several companies who learned that the
    hard way, ending up with thousands of spaghetti-workflows with overlapping functionality, but different bugs in each. I don't think we really disagree on this.
     
    Do you want users to be able to design their own DAGs? TOSCA is well suited for it. A "task" can be a node and these nodes can be connected via typed relationships.
     
    We have been doing that for years, but as I mentioned, we use a graph grammar as our catalog, which means that users can do what you say, but can be constrained
    to make only meaningful modifications of the DAG that represent a clearly modeled intent.
     
    Peter
     
    From: Tal Liron [mailto:tliron@redhat.com]

    Sent: 6. januar 2022 01:50
    To: Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com>
    Cc: Chris Lauwers <lauwers@ubicity.com>; tosca@lists.oasis-open.org
    Subject: Re: Differentiating TOSCA from HEAT
     



    On Wed, Jan 5, 2022 at 1:59 AM Bruun, Peter Michael (CMS RnD Orchestration) < peter-michael.bruun@hpe.com > wrote:




    I was hoping that you could add for our positioning of TOSCA some more concrete details about the mentioned
    bad experience? What exactly was the nature and reasons for those failures? Your 4 bullets are too generic, I think. You mention that HEAT is slow and does not scale, and you ascribe that to the OpenStack architecture, and not so much to the HOT language.
    Is that, in your opinion, the primary reason for the shortcoming of HEAT? If so, to ensure the success of TOSCA we would need to give some attention to scalability in our discussions.




     


    There is nothing particularly wrong with HOT the language. It shares some of the same DNA as TOSCA and does a lot of the same things. The same is true for Cloudify DSL. Puccini can parse all three of these languages because of
    their core similarity. TOSCA is better than the others for the most part, but that's more about specific grammatical features than some essential qualitative difference. You can think of HOT as a subset of TOSCA.


     


    There are various reasons why Heat isn't great. The relevant one for our group is: it creates its workflows automatically for you, but there is almost no visibility into them, and definitely no hackability. That makes debugging very painful.
    It's an extremely anti-devops approach: we'll do the work, you stay away. Above, I linked to a Puccini example where I use TOSCA + an OpenStack profile to generate an Ansible playbook for deployment. The advantage, in my opinion, is that you get an actual
    debuggable and extensible playbook. There's no real lesson here for TOSCA specifically, but I do think Heat can be a cautionary tale for those of us wanting to implement automatic workflows in an orchestrator.


     




    Concerning your views on declarative orchestration, clearly if a single underlying management system,
    and the components it orchestrates are all fully declarative and insensitive to sequencing, then indeed, the orchestrator itself does not need to be concerned with sequencing. But at the lowest level, technology is inherently sensitive to sequencing.




     


    Absolutely. I just believe it should be solved locally, with specificity for that resource's unique lifecycle challenges, and then locked away as a black box (but with access to the source code, so that devops can fix production bugs).
    Indeed the responsibility for implementing this functionality should best be with the component's vendor. They know it best. It's basically the operator pattern: the orchestration work should be a managerial component living in the same space as the resource
    it's orchestrating. Sometimes I call it "side by side" orchestration.
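    The "side by side" idea above can be sketched as a reconcile loop in the spirit of the operator pattern: observe the resource's actual state, compare it to the desired state, and act locally. A minimal toy sketch in Python; the resource dictionary, field names, and actions are all hypothetical illustrations, not any real operator API:

    ```python
    def observe(resource):
        # Hypothetical probe of the resource's actual state; a real
        # operator would query the managed component directly.
        return resource["actual"]

    def reconcile(resource):
        """Compare desired vs. observed state and act locally to converge."""
        desired, actual = resource["desired"], observe(resource)
        actions = []
        if actual.get("replicas", 0) < desired["replicas"]:
            actions.append("scale-up")
        elif actual.get("replicas", 0) > desired["replicas"]:
            actions.append("scale-down")
        for _ in actions:
            # Apply the change next to the resource ("side by side"),
            # not from a distant central orchestrator.
            actual["replicas"] = desired["replicas"]
        return actions

    db_cluster = {"desired": {"replicas": 3}, "actual": {"replicas": 1}}
    print(reconcile(db_cluster))  # → ['scale-up']
    print(reconcile(db_cluster))  # → [] (already converged)
    ```

    The orchestrator above it then only needs to set the desired state; the sequencing knowledge stays locked inside the reconciler.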


     


    It's absurd to me that devops teams for various companies again and again try to solve the same LCM issues for whatever orchestrator they are using. Invariably there are bugs and scalability challenges. Orchestrators should not be doing
    generic phase-by-phase LCM, especially if they are not running in-cluster. It's a losing battle.

     



    Example:
    Installing a VM running a database application. If the management system allows you to specify this declaratively, including the required database configuration, then the orchestrator does not need to be concerned with the sequencing. If another VM needs to
    run an application that uses the database, and the two VMs are created and started in arbitrary order, then either that application needs to be insensitive to situations where the database is not yet ready or the declarative management system must be aware
    of the dependency.




     


    I strongly recommend that the application be able to stand up even if the database is not ready. This is the cornerstone of living in the cloud: it's an ephemeral environment where dependent services may come and go or just change suddenly.
    An orchestrator's work here is, of course, not to create the database connection. But it can assist in discovery (IP address injection?) and otherwise in notifying, collecting notifications, and reconciling issues.
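    The resilience point above can be sketched as a retry-with-backoff wrapper around the dependency connection. A toy Python sketch; `flaky_db` and the delay values are made-up stand-ins for a database that isn't ready yet:

    ```python
    import time

    def connect_with_backoff(connect, max_attempts=5, base_delay=0.1):
        """Keep trying a dependent service; tolerate it not being ready yet."""
        for attempt in range(max_attempts):
            try:
                return connect()
            except ConnectionError:
                # Exponential backoff before the next attempt.
                time.sleep(base_delay * 2 ** attempt)
        raise ConnectionError("dependency still unavailable")

    # Simulated database that only becomes ready on the third attempt.
    state = {"calls": 0}
    def flaky_db():
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("not ready")
        return "connection"

    print(connect_with_backoff(flaky_db))  # → connection
    ```

    With this pattern the application stands up regardless of start order, and the orchestrator only needs to inject the discovery information (e.g. the database address).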


     




    The point is that the temporal dependencies do not go away by themselves. The prerequisite is careful
    design of applications and management systems to fit into such a paradigm. Eventually, in some cases, we are basically just pushing the sequencing problem down to lower-level orchestrators/management systems; and if the service topology happens to span
    more than one management system, then not only must each system be declarative within itself, but all the systems must be designed to interwork according to the centrifuge model to handle any required sequencing between them.




     


    Welcome to the cloud-native world. :) It's best for your components to be designed to run in clouds, but there are also a lot of options for you if they don't. The operator pattern can allow you to create a cloud-native frontend for a component
    that doesn't play the game well.


     


    There are good examples of this in the world of databases. Most of the popular and mature databases we use have not been designed for cloud. But operators can allow for LCM of db clusters in cloud environments, managing all the difficult
    aspects of geographical redundancy, auto-scaling, failovers, load-balancing, backups, etc. If this operator is of good quality you end up being able to treat the db cluster declaratively and not worry about low-level sequences. And then all an orchestrator
    needs to do is work with those declarations. (Again, that's why I prefer to call it a "coordinator".)


     




    This is a beautiful vision, but as you also say, we are not there, and so TOSCA will need to be able
    to support any sequencing requirements that are not yet within the capabilities of the systems being orchestrated.




     


    I agree. But I think TOSCA is already there:


     


    1) By using typed relationships you can derive various kinds of dependency graphs. There can be a graph for installation dependencies, a graph for networking configuration, etc. From these topological graphs a sequenced workflow graph (DAG)
    can be derived for your workflow engine of choice. (Again, I hope you learn from Heat what not to do, and that you fully expose that DAG to users.)
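    Deriving a sequenced DAG from typed relationships can be sketched with a topological sort. The node names and the `depends_on`/`connects_to` relationship types below are hypothetical; the point is that only one relationship type contributes edges to the installation graph, while other types would feed other graphs:

    ```python
    from graphlib import TopologicalSorter  # Python 3.9+

    # Hypothetical topology: (source, relationship_type, target) triples.
    relationships = [
        ("app", "depends_on", "database"),
        ("app", "connects_to", "load_balancer"),
        ("database", "depends_on", "volume"),
        ("load_balancer", "depends_on", "network"),
    ]

    def install_order(rels):
        """Derive an installation sequence from 'depends_on' edges only."""
        ts = TopologicalSorter()
        for source, kind, target in rels:
            if kind == "depends_on":
                ts.add(source, target)  # target must be processed first
        return list(ts.static_order())

    print(install_order(relationships))
    ```

    The resulting ordering (e.g. `volume` before `database` before `app`) is exactly the workflow skeleton a workflow engine would execute, and being plain data, it can be fully exposed to users.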


     


    2) Do you want users to be able to design their own DAGs? TOSCA is well suited for it. A "task" can be a node and these nodes can be connected via typed relationships. I'm working on a TOSCA profile for
    Argo Workflows that does exactly that. I dislike the workflow grammar in TOSCA 1.3 mostly because it's superfluous. We really don't need two different grammars for creating graphs.









  • 5.  Re: Differentiating TOSCA from HEAT

    Posted 01-06-2022 17:11
    On Thu, Jan 6, 2022 at 1:44 AM Bruun, Peter Michael (CMS RnD Orchestration) < peter-michael.bruun@hpe.com > wrote:
     
    Agree. In fact our orchestrator has a language, DSD, that was originally derived from HOT, so you will recognize some of the syntax, but I fixed all the issues with the HOT language that I mentioned below, and quite a few more.
     
    Could you provide us with a comparison of DSD and TOSCA? How about the whole DSD specification?
     
    As for hackability, that is nice for small-scale DevOps-type orchestration, but when you look at thousands of transactions per minute on behalf of end-customers, you don't want users to hack each playbook individually. The paradigms you want for orchestration depend heavily on that kind of concern.
     
    There is always a tradeoff between flexibility and constraint. Too many constraints on what users can do will definitely lead to frustration; I get that a lot with systems like Maven, where they are really helpful if what you want to do is within their paradigm, and really frustrating if it is not. But too much flexibility also leads to frustration, because that leaves the user without guidance as to what combinations of functions and features are meaningful for some intent, and what combinations are not. So flexibility is not always the same as "good".
     
    Basically, your domain for orchestration is DevOps, as you say. Our orchestration domain is quite different from DevOps, and that makes a huge difference in the requirements. So nothing is right or wrong; just two different worlds.
     
    So, let me unpack this a bit, because I'm not "on the side of devops". I'm on the side of "it depends". In my work I have two kinds of customers:
     
    1) There are companies that are not in the business of developing software. They would prefer to offload this work, and even more preferably to off-the-shelf solutions (cheaper in the short and the long term). Their main struggle is in integrating these solutions. For that, sometimes they go to yet another vendor.
     
    2) There are companies that prefer some ownership over solutions. They invest in building up a devops team. The main struggle here is in balancing the trade-offs between existing solutions and their own special needs.
     
    The thing is, for both kinds of companies hackability is useful, if in different ways. For the first, it allows for choice in vendors and avoidance of lock-in. For the second, it allows for customization of off-the-shelf solutions without having to fork them.
     
    I think it's absolutely possible to make something hackable without much cost to its built-in, opinionated automation. The keyword is "extensibility". A good example is Cloudify's plugins: the Cloudify platform generates a default DAG with a set of built-in tasks, but via plugins you can modify that DAG and inject your own tasks. There is a cost to this, and the platform has to give you the tools to develop such customizations, but I think we both agree that we would want those tools to exist even without customizations so that we don't end up with something like Heat. (Sorry, Heat, you've become the punching bag in this conversation.)
     
    From the perspective of a from-the-trenches engineer: there's nothing more frustrating than a big automatic system that does a zillion things for you, but then can't do the simplest thing that you can do in a command terminal in one minute. And of course it's baffling and infuriating for managers to hear about this disparity. We have to stop building systems that end up shooting ourselves in the foot.
     
    Welcome to the cloud-native world. :) It's best for your components to be designed to run in clouds
     
    Cloud native is for compute nodes running applications with connectivity. We are not in that domain. There are lots of nodes that do not represent components in "clouds".
     
    Well, the "C" in "TOSCA" stands for cloud. Clouds always have peripheral systems that are not so cloud-like, which we want to support, but TOSCA absolutely has to nail clouds.


  • 6.  RE: Differentiating TOSCA from HEAT

    Posted 01-06-2022 17:49




    Great! I think we have converged to a common understanding.
     
    For more information about DSD, take a look at the presentation I gave to the TOSCA TC back in October 2020:

    https://www.oasis-open.org/committees/download.php/67868/HPE_ServiceDirector.pdf
     
    Best regards
    Peter
     
     
    From: Tal Liron [mailto:tliron@redhat.com]

    Sent: 6. januar 2022 18:11
    To: Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com>
    Cc: tosca@lists.oasis-open.org
    Subject: Re: Differentiating TOSCA from HEAT
     



    On Thu, Jan 6, 2022 at 1:44 AM Bruun, Peter Michael (CMS RnD Orchestration) < peter-michael.bruun@hpe.com > wrote:




    Agree. In fact our orchestrator has a language, DSD, that was originally derived from HOT, so you will recognize some of the syntax, but I fixed all the issues
    with the HOT language that I mentioned below, and quite a few more.





     


    Could you provide us with a comparison of DSD and TOSCA? How about the whole DSD specification?


     




    As for hackability,
    that is nice for small-scale DevOps-type orchestration, but when you look at thousands of transactions per minute on behalf of end-customers, you don't
    want users to hack each playbook individually. The paradigms you want for orchestration depend heavily on that kind of concern.
     
    There is always a tradeoff between flexibility and constraint. Too many constraints on what users can
    do will definitely lead to frustration; I get that a lot with systems like Maven, where they are really helpful if what you want to do is within their paradigm, and really frustrating if it is not. But too much flexibility also leads to frustration, because
    that leaves the user without guidance as to what combinations of functions and features are meaningful for some intent, and what combinations are not. So flexibility is not always the same as "good".
     
    Basically, your domain for orchestration is DevOps, as you say. Our orchestration domain is quite different
    from DevOps, and that makes a huge difference in the requirements. So nothing is right or wrong; just two different worlds.




     



    So, let me unpack this a bit, because I'm not "on the side of devops". I'm on the side of "it depends". In my work I have two kinds of customers:


     


    1) There are companies that are not in the business of developing software. They would prefer to offload this work, and even more preferably to off-the-shelf solutions (cheaper in the short- and the long-term). Their main struggle is in
    integrating these solutions. For that, sometimes they go to yet another vendor.


    2) There are companies that prefer some ownership over solutions. They invest in building up a devops team. The main struggle here is in balancing the trade-offs between existing solutions and their own special needs.


     


    The thing is, for both kinds of companies hackability is useful, if in different ways. For the first, it allows for choice in vendors and avoidance of lock-in. For the second, it allows for customization of off-the-shelf solutions without
    having to fork them.


     


    I think it's absolutely possible to make something hackable without much cost to its built-in, opinionated automation. The keyword is "extensibility". A good example is Cloudify's plugins: the Cloudify platform generates a default DAG with
    a set of built-in tasks, but via plugins you can modify that DAG and inject your own tasks. There is a cost to this, and the platform has to give you the tools to develop such customizations, but I think we both agree that we would want those tools to exist
    even without customizations so that we don't end up with something like Heat. (Sorry, Heat, you've become the punching bag in this conversation.)
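    The extensibility idea above can be illustrated abstractly. This is not the real Cloudify plugin API, just a toy hook mechanism in Python: the platform builds a default task list, and registered plugins may rewrite it to inject their own tasks:

    ```python
    # Default DAG the platform would generate on its own (here simplified
    # to a linear task list).
    def default_workflow():
        return ["create", "configure", "start"]

    plugins = []

    def register(plugin):
        """Decorator that records a plugin without modifying the platform."""
        plugins.append(plugin)
        return plugin

    def build_workflow():
        tasks = default_workflow()
        for plugin in plugins:
            tasks = plugin(tasks)  # each plugin may inject or reorder tasks
        return tasks

    @register
    def inject_healthcheck(tasks):
        # Custom task inserted after "start" without forking the platform.
        i = tasks.index("start")
        return tasks[: i + 1] + ["healthcheck"]

    print(build_workflow())  # → ['create', 'configure', 'start', 'healthcheck']
    ```

    The built-in automation stays opinionated and untouched; customization happens only at well-defined extension points.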


     


    From the perspective of a from-the-trenches engineer: there's nothing more frustrating than a big automatic system that does a zillion things for you, but then can't do the simplest thing that you can do in a command terminal in one minute.
    And of course it's baffling and infuriating for managers to hear about this disparity. We have to stop building systems that end up shooting ourselves in the foot.



     





      Welcome to the cloud-native world. :) It's best for your components to be designed to run in clouds
     
    Cloud native is for compute nodes running applications with connectivity. We are not in that domain.
    There are lots of nodes that do not represent components in "clouds".




     


    Well, the "C" in "TOSCA" stands for cloud. Clouds always have peripheral systems that are not so cloud-like, which we want to support, but TOSCA absolutely has to nail clouds.