Thanks Tal. Comments in-line
From: tosca@lists.oasis-open.org <tosca@lists.oasis-open.org> On Behalf Of Tal Liron
Sent: Tuesday, November 17, 2020 10:10 AM
To: tosca@lists.oasis-open.org
Subject: [tosca] More on dangling requirements
This came up at the end of the ad-hoc today, so I thought to elaborate more on what I mean here and what the implications are for the instance model.
Here's what I am assuming a dangling requirement looks like:
topology_template:
  node_templates:
    web-server:
      type: Application
      requirements:
        - data: Database # node type
Yes, and presumably the Application node type already defines the data requirement to require a node of type Database, in which case you don't even have to specify the Database type here. In addition, you will likely also want to specify a node filter to narrow the set of Database nodes that will be considered to fulfill this particular requirement. The node filter (together with the node and capability types specified in the requirement definitions) defines the query that you will run against your inventory to find the set of suitable nodes for fulfilling this requirement.
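As a rough sketch of such a dangling requirement with a node filter, using the TOSCA 1.3 requirement assignment grammar: the `engine` and `version` properties are made-up names for illustration and would have to be defined on the Database node type.

```yaml
topology_template:
  node_templates:
    web-server:
      type: Application
      requirements:
        - data:
            node: Database          # node type, so the requirement dangles
            node_filter:
              properties:
                # Hypothetical properties of the Database node type
                - engine: { equal: postgresql }
                - version: { greater_or_equal: 12 }
```

Only inventory nodes of type Database whose properties satisfy both constraints would be candidates for fulfilling the requirement.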
From the perspective of a graph, this semantic model explicitly has a single vertex in it, with no edges.
Before the requirement fulfillment phase, this is correct.
But how would the instance model look? Is it a single vertex? If it is, then there are no edges to the graph (edges are between vertexes, not floating in space). Or, do you understand this design to imply that there
indeed are two vertexes with an edge between them? If that's the case, the instance model must have some kind of implicit "placeholder" vertex.
If the requirement is mandatory (i.e. the occurrences keyword in the requirement definition has a lower bound that is greater than zero), then the orchestrator will find a suitable node during the requirement fulfillment
phase, and create an edge between the node that has the dangling requirement and the node from inventory that was used to fulfill the requirement.
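For illustration, a requirement definition along these lines might look as follows. This is only a sketch: the capability and relationship types are assumptions borrowed from the normative types, not something stated in this thread.

```yaml
node_types:
  Application:
    derived_from: tosca.nodes.Root
    requirements:
      - data:
          # Assumed capability and relationship types
          capability: tosca.capabilities.Endpoint.Database
          node: Database
          relationship: tosca.relationships.ConnectsTo
          # Lower bound of 1 makes the requirement mandatory;
          # [ 0, 1 ] would make it optional.
          occurrences: [ 1, 1 ]
```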
Unfortunately, this vertex is not named in the TOSCA design and indeed is invisible to it.
The target node is not named in the service template that has the dangling requirement, but it surely was named in its own service template (i.e. the template from which that node was originally created).
You can't do a "get_property" and refer to the database used by the web server (note that you could use the TARGET keyword, but only within the relationship values).
You can absolutely do a get_property to refer to the database. Since your template specifies that the target node for the data requirement is of type Database, your parser can validate that valid property values
are being retrieved.
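A minimal sketch of this, assuming the target exposes a `port` property (on the Database node or on the capability that fulfills the requirement); both the property name and the `db_port` input name are hypothetical:

```yaml
topology_template:
  node_templates:
    web-server:
      type: Application
      requirements:
        - data: Database
      interfaces:
        Standard:
          configure:
            inputs:
              # SELF -> "data" requirement -> target's "port" property
              db_port: { get_property: [ SELF, data, port ] }
```

The second element of the `get_property` argument list names the requirement, so the value is pulled from whichever node ends up fulfilling it.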
In my opinion, this is one of the most useful features of TOSCA: it allows for the pull mechanism for getting data into node types without needing additional functions. I use this all the time in my orchestrator.
Here's the alternative grammar that I vastly prefer, in which we
require explicit vertexes:
topology_template:
  node_templates:
    web-server:
      type: Application
      requirements:
        - data: database # node template
    database:
      type: Database
      directives:
        - substitutable
Yes, the TOSCA specification shows this as an alternative grammar for dangling requirements, although your example shows the wrong directive: if the database node is supposed to be found using requirement fulfillment, you use the select directive instead of the substitute directive. The substitute directive is only intended to be used for substitution mapping. If you use the select directive with a node template, then that node template can also define a node filter (similar to the way dangling requirements can define node filters).
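With the select directive, the same example might be sketched like this (the `engine` property in the node filter is again a made-up name):

```yaml
topology_template:
  node_templates:
    web-server:
      type: Application
      requirements:
        - data: database            # node template
    database:
      type: Database
      directives:
        - select                    # find a matching node in inventory
      node_filter:
        properties:
          # Hypothetical property of the Database node type
          - engine: { equal: postgresql }
```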
The feature is exactly the same as with dangling requirements: it's the orchestrator's responsibility to provide some kind of database resource that complies with the Database node type.
According to the spec, this mechanism for specifying dangling requirements is useful when you want to indicate that multiple dangling requirements need to be fulfilled by the same node instance.
However, there is no confusion here regarding the instance model (two vertexes with one edge) because it indeed follows through from the semantic model. The design phase is complete and indeed it is a full graph
with no "dangling", no need to consider edges floating in space.
I don't think there is confusion either way: if a dangling requirement is not mandatory, it will not result in an edge in the instance model. If it is mandatory, it will result in an edge in the instance model (or the orchestration will fail if a suitable target node cannot be found).
Moreover, I think this design is much more flexible in allowing for many different ways of specifying exactly
how the orchestrator will provide the database:
TOSCA doesn't specify how the orchestrator is supposed to provide the (inventory) database, so I'm not sure what additional flexibility is needed.
1) Directives. I'm not a fan of this simplistic grammatical feature, but we could use it [did we remove it in 2.0?]:
database:
  type: Database
  directives:
    - substitutable
    - provisionable
    - allocatable
On the contrary, we made directives mandatory in v1.3 to make orchestration actions explicit. The designer uses the substitute directive to indicate that a node is abstract and needs to be substituted, or the select directive to indicate that a node must be retrieved from inventory. We need to expand directive support to allow for multiple directives in the same template, which would support create-if-not-exists or substitute-if-not-exists scenarios.
2) Policies. Much better because these are typed and complex and can thus model the actual mechanism used by the orchestrator (and be part of a profile):
policies:
  database:
    type: Provisioning
    targets: [ database ]
    properties:
      optional: true
      machine-types: [ virtual, baremetal ]
      compatibility: [ mariadb, postgresql ]
      cluster: true
      redundancy: 2
The problem with policies as you use them is that they have absolutely no (language) semantics associated with them. All the semantics are encoded in the properties, which means that they can only be processed by
an external domain-specific entity that knows what these properties mean.
Note that the dangling requirement grammar, as it stands, doesn't have a clear way to specify whether the requirement is optional or not, or whether it is
conditionally optional, e.g. "nice to have" in certain situations, a "hard" requirement in others. A policy can allow us to have "optional: true" or something more complex, as needed.
As I stated earlier, the occurrences keyword in the requirement definition specifies whether the requirement is mandatory or optional. Do you have an example where conditionally-optional should be used?
This is why I keep emphasizing that a node instance (in the instance model!) can have
zero corresponding resources in the real world. If the provisioning policy indeed allows for fulfilment to be optional then there might not be any database at all. It's still part of the
template, it's just not part of what ended up being deployed. Likewise, the relationship is there in the design (complete graph), but there are zero actual database connections in the real world.
Fulfilling dangling requirements is strictly an instance model action: it connects two nodes (vertices) in the instance model graph using an edge, if necessary. Whether anything happens in the real world because
of this depends on the (domain) specific artifacts (or whatever else you use to reflect the instance model into the real world).
Also note the "redundancy: 2". This could mean that two database resources in the real world are provisioned. So, the application would need to configure two separate connections, one to each database resource.
The instance model is two vertexes with a single edge between them, but that single edge represents two connections. (Whether it's "one to many" is very implementation specific, and indeed the connections might be of different types: perhaps there can be only
one "read-write" primary database while the others are "read-only" secondaries).
Yes, that is completely fine and again independent of any instance model considerations. If your implementation creates two physical instances from one node instance in the instance model, it should feel free to
do that. If that means you need lists of property values (one value for each physical instance) then of course you need to create your node types accordingly.
And it can get more complex: perhaps later on in the runtime lifecycle (day 2) a database becomes available, and because it's a "nice to have" suddenly there would be a non-zero correspondence between the instance
model and the world.
I'm not sure I understand what you mean here.
Nevertheless, despite all this runtime complexity the instance model is always the same: two vertexes with a single edge between them.
Yes, I think we're in agreement on this. But again, this is completely orthogonal to the requirement fulfillment discussion.
(Final note on this feature: there are policy frameworks out there with much richer grammar than TOSCA can ever hope to provide. In those cases, we would probably want to include those specifications as artifacts.
But artifacts are currently only attached to nodes, not policies. And generally to my knowledge we have never discussed how external policy frameworks would interact with TOSCA.)
We have not yet had any solid discussions about TOSCA policies, so clearly this is an area that needs to be explored more. However, most external policy frameworks are really just imperative programming languages disguised as policy frameworks. We clearly don't want to add all kinds of imperative support to TOSCA. Instead, we should add rich support for declarative policies.
3) Rich node and capability types that model ranges of possibilities rather than or in addition to specific hard values by using lists and maps instead of single values:
database:
  type: Database
  directives:
    - substitutable
  capabilities:
    machine:
      properties:
        types: [ virtual, baremetal ]
In this case the "machine" capability might have a "type" attribute, which is not a list, and which specifies the exact type of the node instance.
I'm not sure I understand what is intended here. How does a capability specify a node type (or a list of node types)?
It's worth discussing a possible limitation to all the above approaches (including the "dangling requirements" approach). How do we express more complex selection logic? For example: "provision a baremetal machine
with at least 2 GB of RAM, but if it's a virtual machine require at least 4 GB of RAM".
If you're talking about selection (i.e. requirement fulfillment, i.e. finding a node from inventory), then this is exactly what the expanded node filter syntax that we proposed several months ago is for. However, you use the word "provision" in your example, which is different from "selection". Assuming you provision using substitution, then a substituting template for a virtual machine presumably will be different from a substituting template for a bare-metal server, and each substituting template will specify how much memory it needs.
It's definitely possible to create data types that can express and/or, but that only covers
branching logic. The decision flow might not be tree-shaped at all, and indeed can itself be a graph. Moreover, it might not be a finite algorithm at all: there might be a ML-based system that makes the provisioning decision on its own, and at best we
can provide certain hints to our preferences as well as hard (regulatory) requirements.
I think TOSCA currently supports all of this: a TOSCA orchestrator performs requirement fulfillment, or substitution. Both of these functions require decision logic to find the best node to fulfill a dangling requirement,
or the best template to substitute an abstract node. If a TOSCA orchestrator wants to use AI/ML to help with this decision logic, it should feel free to do so.
My point is that this is not a limitation but a feature. :) We want TOSCA to be able to model all kinds of systems for requirement fulfilment, provisioning, allocation, and placement, including innovative approaches
that have not been invented yet. In my view, doing so means improving our data type grammar for allowing more robust logic, e.g. support for anyOf or allOf for lists and maps, etc.
I think we all want the same thing. However, based on your examples in this email, I'm not sure we need to make any changes to the current model for how a TOSCA orchestrator is expected to work. That said, we do need all the enhancements to the instance model that we have proposed over the last several months (including support for cardinality, richer TOSCAPath syntax, and streamlined condition/constraint clauses to be used in filters and policies). I'm hoping we can make progress on those over the next several months.
Thanks,
Chris