tosca-comment

  • 1.  RE: [tosca-comment] Feedback on TOSCA 1.3 spec

    Posted 02-18-2020 23:40
    Hi Adam

    Thanks again for the very helpful feedback. Some comments inline.

    > Here are some comments and questions that came up as I was implementing the specification; I hope you find them useful:
    >
    > # Normative questions and comments:
    >
    > * Unless I missed something, it appears that TOSCA identifiers (e.g. template names, property names, etc.) are only defined as YAML strings. I assume this is under-specified? Given TOSCA's XML heritage and that namespace prefixes are still used, should they be constrained to XML's NCName production (XML names without colons)?
    I believe that in most (if not all) cases, the context makes it clear where the names are valid. For example, template names are only valid within the context of a single topology template (and by implication a single service template). Property names are only valid within a specific type definition, etc. The only place where namespace prefixes are supported (and useful) is for the type definitions themselves, since type definitions can be imported from other templates, which means that name collisions are possible. Do you have examples where the current naming scheme can create problems?

    > * The list of node states in section 3.4.1 (p74) doesn't include "available", but it is used elsewhere in the spec, for example in the diagram on p.226. (That diagram also shows the "delete" operation moving the node state from "configured" to "configured" -- I assume that is an error and it should be "stopped" and "deleted" respectively?) The example in section 7.3.3.1 (p260) also references an "available" state.
    Yes, good catch. Clearly the text got out of sync with the diagrams. This needs to be corrected.
    That said, this observation points to an issue we’ve been working on in Version 2.0: the list of valid node states is defined as part of “TOSCA the language”, but the specific states are modeled after the “Standard” lifecycle management interface, which is part of “the TOSCA type system” (i.e. the Simple Profile types). There are a number of instances in the specification where assumptions about the Simple Profile types bleed over into the language itself. This makes it difficult to use TOSCA (the language) without also using the Simple Profile types. We’re trying to remove such “bleed-over” and create a cleaner separation between the language and the types. This could mean that the “state” concept might need to be removed from the language, and instead be moved to an attribute of an Interface, for example (which would be defined in the type system, not in the language).

    > * metadata is allowed on property and attribute definitions but not on parameters (inputs and outputs) or on artifacts. Is this inconsistency an oversight? I have found it necessary to add metadata to input and artifact definitions.
    Yes, metadata should be allowed pretty much everywhere (except possibly in property and attribute assignments). We’ll try to eliminate such inconsistencies.

    > * requirement assignments can specify the capability and relationship types but capability assignments cannot. Is this an oversight? I found declaring the type in a capability assignment necessary, for example, to specify that the "endpoint" capability of a node template is a subtype of the generic "endpoint" capability.
    Would you mind sharing a specific example? If I understand correctly, what you’re trying to do sounds like an “on-the-fly” creation of a node sub-type within a topology template. I’d like to understand your use case better before commenting on this.

    > * I have had to make other extensions to the spec, some of which you may want to consider for a future version of TOSCA -- the list of changes is documented here: https://www.onecommons.org/unfurl/tosca.html
    Thanks, very helpful. By the way, I’d like to repeat my invitation for you to join the TOSCA TC so we can discuss these during our regular meetings.

    > # Editorial comments:
    >
    > * The discussion in 13.4.1 Shell Scripts (p.360) about implementation outputs is confusing because it seems to imply that exporting an environment variable in a shell script makes it available to the parent process, when in fact it doesn't (instead it makes it available to child processes). To implement something like this the orchestrator would have to wrap or rewrite the shell script (that's what Yorc does) or place other requirements on the script -- for example, Cloudify requires shell scripts to call a helper child process to pass outputs back to the orchestrator.
    Yes, this is actually more than an “editorial” error. What’s currently described won’t work.
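
The point about `export` can be checked with a small Python sketch, assuming a POSIX `sh` is available on the PATH. The first function shows that a variable exported by a child shell never reaches the parent process; the second shows one possible wrapping approach, where the orchestrator appends an `env` dump to the script and parses it. The function names and the marker string are hypothetical, and this is not a description of how Yorc or any other orchestrator actually does it.

```python
import os
import subprocess
from typing import Dict, List

# Hypothetical marker separating the script's own output from the env dump.
MARKER = "__TOSCA_OUTPUTS_BEGIN__"


def export_leaks_to_parent(name: str, value: str) -> bool:
    """Run `export NAME=VALUE` in a child shell and report whether the
    variable became visible in this (parent) process."""
    subprocess.run(["sh", "-c", f"export {name}={value}"], check=True)
    return name in os.environ


def run_script_collect_outputs(script: str, output_names: List[str]) -> Dict[str, str]:
    """Run a shell script and return the requested variables it exported.

    The script is wrapped so that the same shell prints its environment
    after the script body has run. Limitations of this sketch: values
    containing newlines are not handled, and a script that calls `exit`
    never reaches the appended dump.
    """
    wrapped = f"{script}\necho {MARKER}\nenv\n"
    result = subprocess.run(
        ["sh", "-c", wrapped], capture_output=True, text=True, check=True
    )
    # Everything after the marker line is the environment dump.
    _, _, env_dump = result.stdout.partition(MARKER + "\n")
    outputs: Dict[str, str] = {}
    for line in env_dump.splitlines():
        name, sep, value = line.partition("=")
        if sep and name in output_names:
            outputs[name] = value
    return outputs
```

Calling `export_leaks_to_parent("TOSCA_DEMO_OUTPUT", "42")` returns `False`, while `run_script_collect_outputs("export TOSCA_DEMO_OUTPUT=42", ["TOSCA_DEMO_OUTPUT"])` returns `{"TOSCA_DEMO_OUTPUT": "42"}`.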


  • 2.  Re: [tosca-comment] Feedback on TOSCA 1.3 spec

    Posted 02-19-2020 14:05
    Hi Chris,

    Responses inline below...

    thanks,
    Adam

    On Tue, Feb 18, 2020 at 3:40 PM Chris Lauwers <lauwers@ubicity.com> wrote:

    > Hi Adam
    >
    > Thanks again for the very helpful feedback. Some comments inline.
    >
    > Here are some comments and questions that came up as I was implementing
    > the specification, I hope you find them useful:
    >
    > # Normative questions and comments:
    >
    > * Unless I missed something, it appears that TOSCA identifiers (e.g.
    > template names, property names, etc.) are only defined as YAML strings. I
    > assume this is under-specified? Given TOSCA's XML heritage and that
    > namespace prefixes are still used, should they be constrained to XML's
    > NCName production (XML names without colons)?
    >
    > I believe that in most (if not all) cases, the context makes it clear
    > where the names are valid. For example, template names are only valid
    > within the context of a single topology template (and by implication a
    > single service template). Property names are only valid within a specific
    > type definition, etc. The only place where namespace prefixes are supported
    > (and useful) is for the type definitions themselves, since type definitions
    > can be imported from other templates, which means that name collisions are
    > possible. Do you have examples where the current naming scheme can create
    > problems?
    >
    >
    >

    I was referring to the syntax of TOSCA names and identifiers in general --
    unless I'm missing something, it appears that a name can be any valid YAML
    string, including ones containing whitespace, line feeds, non-normalized
    Unicode sequences, etc. This makes it harder to build interoperable and
    secure implementations (think SQL injection attacks). For
    compatibility with XML you might want to consider constraining as I suggest
    above. Or maybe they should be limited to be more like identifiers in a
    programming language or to the syntax of DNS hostnames, as Kubernetes does.
    On the other hand, maybe it would be useful to allow a more liberal syntax
    so that any valid URI could be used as an identifier.
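
As a rough illustration of two of the candidate restrictions mentioned above, here is a minimal Python sketch. Both patterns are assumptions made for the example: the first is an ASCII-only approximation of XML's NCName production (the real production also admits many non-ASCII letters), and the second is the RFC 1123 DNS label rule that Kubernetes applies to some object names. Neither is something the TOSCA spec defines.

```python
import re

# ASCII-only approximation of NCName: starts with a letter or underscore,
# continues with letters, digits, '.', '-' or '_', and never contains ':'.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

# RFC 1123 DNS label: lowercase alphanumerics and '-', starting and ending
# with an alphanumeric, at most 63 characters in total.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?")


def is_ncname_ascii(name: str) -> bool:
    """True if the whole name matches the simplified NCName pattern."""
    return NCNAME_ASCII.fullmatch(name) is not None


def is_dns_label(name: str) -> bool:
    """True if the whole name is a valid RFC 1123 DNS label."""
    return DNS_LABEL.fullmatch(name) is not None
```

For instance, `my_server` passes the NCName-style check but not the DNS label check, `my-server` passes both, and a name containing a space or a line feed passes neither.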

    > * The list of node states in section 3.4.1 (p74) doesn't include
    > "available" but it is used elsewhere in the spec. For example in the
    > diagram on p.226. (That diagram also shows the "delete" operation moving the
    > node state from "configured" to "configured" -- I assume that is an error
    > and it should be "stopped" and "deleted" respectively?). The example in
    > section 7.3.3.1 (p260) also references an "available" state.
    >
    > Yes, good catch. Clearly the text got out of sync with the diagrams. This
    > needs to be corrected.
    >
    > That said, this observation points to an issue we’ve been working on in
    > Version 2.0: the list of valid node states is defined as part of “TOSCA the
    > language”, but the specific states are modeled after the “Standard”
    > lifecycle management interface, which is part of “the TOSCA type system”
    > (i.e. the Simple Profile types). There are a number of instances in the
    > specification where assumptions about the Simple Profile types bleed over
    > into the language itself. This makes it difficult to use TOSCA (the
    > language) without also using the Simple Profile types. We’re trying to
    > remove such “bleed-over” and create a cleaner separation between the
    > language and the types. This could mean that the “state” concept might need
    > to be removed from the language, and instead be moved to an attribute of an
    > Interface, for example (which would be defined in the type system, not in
    > the language).
    >
    >
    >
    > * metadata is allowed on property and attribute definitions but not on
    > parameters (inputs and outputs) or on artifacts. Is this inconsistency an
    > oversight? I have found it necessary to add metadata to input and artifact
    > definitions.
    >
    > Yes, metadata should be allowed pretty much everywhere (except possibly in
    > property and attribute assignments). We’ll try to eliminate such
    > inconsistencies.
    >
    >
    >
    > * requirement assignments can specify the capability and relationship
    > types but capability assignments cannot. Is this an oversight? I found
    > declaring the type in capability assignment necessary, for example, to
    > specify that the "endpoint" capability of a node template is a subtype of
    > the generic "endpoint" capability.
    >
    > Would you mind sharing a specific example? If I understand correctly, what
    > you’re trying to do sounds like an “on-the-fly” creation of a node sub-type
    > within a topology template. I’d like to understand your use case better
    > before commenting on this.
    >
    >
    Sure... consider:

    capability_types:
      mytypes.capabilities.Endpoint.SSH:
        derived_from: tosca.capabilities.Endpoint.Admin
        description: Capability to connect to the host via SSH
        properties:
          protocol:
            type: string
            value: ssh
          port:
            type: tosca:PortDef
            default: 22

    node_templates:
      myServer:
        type: tosca.nodes.Compute
        capabilities:
          # "endpoint" is declared in tosca.nodes.Compute as a
          # tosca.capabilities.Endpoint.Admin
          endpoint:
            type: mytypes.capabilities.Endpoint.SSH

    If I can't declare the capability's type in the capability assignment, I'd
    be forced to declare a subtype of tosca.nodes.Compute that overrides
    tosca.nodes.Compute's "endpoint" and have the template use this new
    subtype. This is particularly cumbersome because TOSCA doesn't support
    multiple inheritance or mixins, so this could lead to a combinatorial
    explosion of subtypes. More abstractly, TOSCA has a nominal type system and
    "requirements" and "capabilities" are essentially references to other nodes
    -- and I can't think of a practical example of a nominal type system that
    doesn't allow covariant object references like this.
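
The covariance rule argued for here could be checked roughly as follows. This is a minimal Python sketch, assuming an orchestrator that allows a `type` keyname in capability assignments (an extension, not something TOSCA 1.3 permits). The `DERIVED_FROM` table and both function names are hypothetical, and the table hard-codes only the few types needed for the example.

```python
from typing import Dict, Optional

# Hypothetical, hand-written excerpt of a capability type hierarchy:
# each type maps to the type it is derived from (None for the root).
DERIVED_FROM: Dict[str, Optional[str]] = {
    "tosca.capabilities.Root": None,
    "tosca.capabilities.Endpoint": "tosca.capabilities.Root",
    "tosca.capabilities.Endpoint.Admin": "tosca.capabilities.Endpoint",
    "tosca.capabilities.Scalable": "tosca.capabilities.Root",
    "mytypes.capabilities.Endpoint.SSH": "tosca.capabilities.Endpoint.Admin",
}


def is_subtype(candidate: str, ancestor: str) -> bool:
    """Walk the derived_from chain upward from candidate looking for ancestor."""
    current: Optional[str] = candidate
    while current is not None:
        if current == ancestor:
            return True
        current = DERIVED_FROM.get(current)
    return False


def check_capability_type(declared: str, assigned: str) -> None:
    """Accept an assignment-level type only if it is the declared capability
    type or one of its subtypes (a covariant override)."""
    if not is_subtype(assigned, declared):
        raise ValueError(
            f"{assigned} is not derived from {declared}; override rejected"
        )
```

With this check, assigning `mytypes.capabilities.Endpoint.SSH` to a capability declared as `tosca.capabilities.Endpoint.Admin` is accepted, while assigning `tosca.capabilities.Scalable` raises `ValueError`.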


    >
    >
    > * I have had to make other extensions to the spec, some of which you may want
    > to consider for a future version of TOSCA -- the list of changes is
    > documented here: https://www.onecommons.org/unfurl/tosca.html
    >
    > Thanks, very helpful. By the way, I’d like to repeat my invitation for you
    > to join the TOSCA TC so we can discuss these during our regular meetings.
    >

    Thank you, I plan to.


    >
    >
    > # Editorial comments:
    >
    >
    >
    > * The discussion in 13.4.1 Shell Scripts (p.360) about
    > implementation outputs is confusing because it seems to imply that
    > exporting an environment variable in a shell script makes it available to the
    > parent process when in fact it doesn't (instead it makes it available to child
    > processes). To implement something like this the orchestrator would have to
    > wrap or rewrite the shell script (that's what Yorc does) or place other
    > requirements on the script -- for example, Cloudify requires shell
    > scripts to call a helper child process to pass outputs back to the
    > orchestrator.
    >
    > Yes, this is actually more than an “editorial” error. What’s currently
    > described won’t work.


  • 3.  Re: [tosca-comment] Feedback on TOSCA 1.3 spec

    Posted 02-19-2020 22:21
    On Wed, Feb 19, 2020 at 8:11 AM adam souzis <adam@onecommons.org> wrote:

    > I was referring to the syntax of TOSCA names and identifiers in general --
    > unless I'm missing something, it appears that a name can be any valid YAML
    > string, including ones containing whitespace, line feeds, non-normalized
    > Unicode sequences, etc. This makes it harder to build interoperable and
    > secure implementations (think SQL injection attacks). For
    > compatibility with XML you might want to consider constraining as I suggest
    > above. Or maybe they should be limited to be more like identifiers in a
    > programming language or to the syntax of DNS hostnames, as Kubernetes does.
    > On the other hand, maybe it would be useful to allow a more liberal syntax
    > so that any valid URI could be used as an identifier.
    >

    I don't think we should or need to restrict names. (Actually, even YAML
    doesn't have any real restrictions, as you can escape special characters
    and use anything.) Whatever issues might arise from storing these names
    should be handled by specific implementations. We definitely cannot guess
    all the current and future environments in which TOSCA is used. SQL column
    keys? MongoDB IDs? Neo4j? OpenStack VM names? Kubernetes deployment names?

    Generally I would say that it's not such a good idea to use the node
    template names for anything other than the design phase. If you need to
    give a node, for example, a DNS name, then it may be better to have a
    required property for the node type for that, for which you can then add
    constraints to ensure that it fits the usage. Node templates are a design
    concept, and they may or may not map to runtime concepts.

    If you insist on using node template names, for example as the basis for
    "node instance" names -- imagine a node template becoming a Kubernetes
    Deployment name -- then it would be up to you to generate an acceptable
    unique name for the target. Maybe replace spaces with underscores, remove
    punctuation, etc. Consider, also, that the uniqueness of the name might be
    a runtime issue -- checking first that such a Deployment does not already
    exist so that it won't be overridden, in which case an underscore or a
    number is added, or similar, to ensure no overlap.
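
The kind of name generation described above might look like the following Python sketch. It assumes the target wants an RFC 1123 DNS label (lowercase alphanumerics and hyphens, at most 63 characters) and that the set of names already in use is known up front; a real orchestrator would query the target system instead. The function name, the hyphen separator, and the `node` fallback are hypothetical choices for the example.

```python
import re
import unicodedata
from typing import Set


def to_dns_label(template_name: str, taken: Set[str], max_len: int = 63) -> str:
    """Derive a unique DNS-label-style name from an arbitrary template name."""
    # Decompose accented characters and drop whatever has no ASCII form.
    ascii_name = (
        unicodedata.normalize("NFKD", template_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of disallowed characters into a single hyphen.
    base = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-") or "node"
    base = base[:max_len].rstrip("-")

    # Append a numeric suffix until the name no longer collides.
    candidate = base
    counter = 1
    while candidate in taken:
        suffix = f"-{counter}"
        candidate = base[: max_len - len(suffix)].rstrip("-") + suffix
        counter += 1
    taken.add(candidate)
    return candidate
```

For example, `to_dns_label("My Web Server!", set())` returns `"my-web-server"`, and a second template with the same name would get `"my-web-server-1"`. Note that this mapping is lossy: a name written entirely in a non-Latin script falls back to `"node"`, which is one reason to prefer a dedicated, constrained property for runtime names.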

    One size does not fit all and we can't guess it all. So let's not and leave
    it as an implementation detail.

    Another important part of YAML is Unicode support. We do want to allow it,
    as it's just common international courtesy for the design phase. See this
    example <https://web.puccini.cloud/?load=examples/tosca/unicode.yaml>.




  • 4.  Re: [tosca-comment] Feedback on TOSCA 1.3 spec

    Posted 02-19-2020 23:49
    If this is an implementation detail then you are almost guaranteed to have
    interoperability issues between implementations. As it stands, right now it
    is valid for a template name to contain a line break or carriage return.
    Maybe your TOSCA implementation handles that but quite likely that template
    would not work properly in some other implementation. I cannot think of a
    single vocabulary or programming language that allows any possible string
    to be treated as an identifier without some constraints. And your examples
    of where these identifiers could be used (e.g. "sql key columns, kubernetes
    deployment names") are actually an argument for restricting their syntax so
    they could easily work in all those contexts.
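
A middle ground between "any string" and a strict grammar is to flag only the name features most likely to trip up another implementation. The Python sketch below is one such lint, using only the standard `unicodedata` module. The function name and the particular set of checks are assumptions for illustration, not rules taken from the TOSCA spec.

```python
import unicodedata
from typing import List


def name_problems(name: str) -> List[str]:
    """Return human-readable reasons why a name may not be portable."""
    problems: List[str] = []
    if not name:
        problems.append("empty name")
    if name != name.strip():
        problems.append("leading or trailing whitespace")
    # Unicode general categories starting with 'C' cover control characters
    # (line feed, carriage return, form feed), format characters, and
    # unassigned or private-use code points.
    if any(unicodedata.category(ch).startswith("C") for ch in name):
        problems.append("control or format characters")
    # Two visually identical names can differ in code points unless both
    # are in the same normalization form.
    if unicodedata.normalize("NFC", name) != name:
        problems.append("not in Unicode NFC form")
    return problems
```

A plain name such as `web_server` yields no problems, while a name containing a line break, or an `e` followed by a combining accent, is reported. Non-ASCII names that are already in NFC form pass untouched, so this kind of check does not conflict with allowing Unicode.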

    On Wed, Feb 19, 2020 at 2:21 PM Tal Liron <tliron@redhat.com> wrote:

    > I don't think we should or need to restrict names. (Actually, even YAML
    > doesn't have any real restrictions, as you can escape special characters
    > and use anything.) Whatever issues might arise from storing these names
    > should be handled by specific implementations. We definitely cannot guess
    > all the current and future environments in which TOSCA is used. SQL column
    > keys? MongoDB IDs? Neo4j? OpenStack VM names? Kubernetes deployment names?
    >
    > Generally I would say that it's not such a good idea to use the node
    > template names for anything other than the design phase. If you need to
    > give a node, for example, a DNS name, then it may be better to have a
    > required property for the node type for that, for which you can then add
    > constraints to ensure that it fits the usage. Node templates are a design
    > concept, and they may or may not map to runtime concepts.
    >
    > If you insist on using node template names, for example as the basis for
    > "node instance" names -- imagine a node template becoming a Kubernetes
    > Deployment name -- then it would be up to you to generate an acceptable
    > unique name for the target. Maybe replace spaces with underscores, remove
    > punctuation, etc. Consider, also, that the uniqueness of the name might be
    > a runtime issue -- checking first that such a Deployment does not already
    > exist so that it won't be overridden, in which case an underscore or a
    > number is added, or similar, to ensure no overlap.
    >
    > One size does not fit all and we can't guess it all. So let's not and
    > leave it as an implementation detail.
    >
    > Another important part of YAML is Unicode support. We do want to allow it,
    > as it's just common international courtesy for the design phase. See this
    > example <https://web.puccini.cloud/?load=examples/tosca/unicode.yaml>.
    >



  • 5.  Re: [tosca-comment] Feedback on TOSCA 1.3 spec

    Posted 02-20-2020 00:13
    On Wed, Feb 19, 2020 at 5:49 PM adam souzis <adam@onecommons.org> wrote:

    > If this is an implementation detail then you are almost guaranteed to have
    > interoperability issues between implementations. As it stands, right now it
    > is valid for a template name to contain a line break or carriage return.
    > Maybe your TOSCA implementation handles that but quite likely that template
    > would not work properly in some other implementation. I cannot think of a
    > single vocabulary or programming language that allows any possible string
    > to be treated as an identifier without some constraints. And your examples
    > of where these identifiers could be used (e.g. "sql key columns, kubernetes
    > deployment names") are actually an argument for restricting their syntax so
    > they could easily work in all those contexts.
    Are you saying that an arbitrary string can't be used as a VARCHAR or TEXT
    column key in SQL? I'm pretty sure you can put anything in there. As for
    programming languages, I don't know of a single one that can't use an
    arbitrary string as a key in a hashmap.

    Perhaps you can give us a concrete example in which you tried to use a node
    template name in a context that could not accept an arbitrary string? What
    exactly are you trying to do?



  • 6.  Re: [tosca-comment] Feedback on TOSCA 1.3 spec

    Posted 02-20-2020 04:42
    I'm saying that if something like this:

    node_templates:
      'a \U00000041 @ \f b\n" &* ':
        type: foo

    is considered a valid template you are asking for trouble with little gain.





    On Wed, Feb 19, 2020 at 4:13 PM Tal Liron <tliron@redhat.com> wrote:

    > Are you saying that an arbitrary string can't be used as a VARCHAR or TEXT
    > column key in SQL? I'm pretty sure you can put anything in there. As for
    > programming languages, I don't know of a single one that can't use an
    > arbitrary string as a key in a hashmap.
    >
    > Perhaps you can give us a concrete example in which you tried to use a
    > node template name in a context that could not accept an arbitrary string?
    > What exactly are you trying to do?
    >



  • 7.  Re: [tosca-comment] Feedback on TOSCA 1.3 spec

    Posted 02-20-2020 04:47
    or
    node_templates:
    "


  • 8.  Re: [tosca-comment] Feedback on TOSCA 1.3 spec

    Posted 02-20-2020 13:15
    On Wed, Feb 19, 2020 at 10:42 PM adam souzis <adam@onecommons.org> wrote:

    > I'm saying that if something like this:
    >
    > node_templates:
    >   'a \U00000041 @ \f b\n" &* ':
    >     type: foo
    >
    > is considered a valid template you are asking for trouble with little gain.
    >

    Again, I don't see any technical trouble with implementations dealing with
    it, there is no security risk (any more than us allowing properties of type
    "string" to have arbitrary values), and for international reasons we do not
    want to limit use of Unicode (see the example I linked). I definitely don't
    see it in our scope to delve into which Unicode we want to allow
    (ideographic alphabets) and which not (emojis). Aesthetically some
    designers' choices of node template names might not seem productive or
    appealing, but that is their choice, and software can deal with it.

    Even with very restricted limits (ASCII only, lowercase only, no spaces or
    punctuation) you can make "bad" choices:

    node_templates:
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa:
    type: Foo

    The only technical requirement I think we need is the one we have: that
    node template names be unique within that topology template. That allows
    implementations to place them in hashmaps by using the name as key, or to
    use that name as a UNIQUE column in a database.
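
To make the UNIQUE-column case concrete, here is a minimal Python sketch using the standard `sqlite3` module with an in-memory database. It assumes names are always passed as bound parameters rather than spliced into the SQL text, which is what keeps arbitrary names (quotes, line breaks, Unicode) from being an injection risk. The table and function names are hypothetical.

```python
import sqlite3
from typing import Iterable


def store_template_names(names: Iterable[str]) -> sqlite3.Connection:
    """Store node template names in a UNIQUE column of an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node_templates (name TEXT UNIQUE NOT NULL)")
    for name in names:
        # The '?' placeholder binds the value; it is never parsed as SQL.
        conn.execute("INSERT INTO node_templates (name) VALUES (?)", (name,))
    conn.commit()
    return conn
```

Names round-trip unchanged whatever they contain, and inserting the same name twice raises `sqlite3.IntegrityError`, mirroring the uniqueness rule within a topology template. The sketch says nothing about contexts that do not offer parameter binding, such as names interpolated into shell commands or URLs.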

    By the way, though the YAML 1.1 spec does require map keys to be unique,
    some YAML parser implementations (example
    <https://github.com/yaml/pyyaml/issues/165>) do not enforce it. It's worth
    making sure.