UBL Naming and Design Rules SC

Agenda UBL NDR SC 16 July 2003

    Posted 07-16-2003 11:11
    The NDR SC Weekly Telephone Conference call is 16 July 2003 at 10:30 am
    Central (8:30am PDT).
    
    #############################################
    STANDING INFORMATION FOR UBL CONFERENCE CALLS
    U.S. domestic toll-free number: (866)839-8145
    Int. access/caller paid number: (865)524-6352
    Access code: 5705229
    #############################################
    
    1.  Roll Call and Welcome from the chair (Mavis)
    Regrets Eduardo Gutentag
    
    2.  Adoption of agenda/schedule planning
    
    Upcoming Call Schedule:
    
    July 23 - Mark, chair  (Clist & NDR 1.0 OASIS review)
    July 28 - Aug 1 - Face to Face Montreal (Process OASIS NDR/Clist review)
    
    
    4.  Review of rules issued to list for discussion
    
    - Rule 105 [ID/IDREF MUST NOT be used.]
    
    
    See Dan's comments [This seems a little overboard. I guess the corresponding
    rule is that key/keyref will be used to implement simple ID/IDREF
    relationships. Has anyone verified that this is something that the tools
    support? I know they will manage ID/IDREF for me, but I have doubts about
    any other mechanism for making relationships.]
    
    - Rule 103 Substitution Groups [Substitution groups MUST NOT be used.]
    
    See Eve's comments [Does this mean that the normative UBL schemas must block
    element substitution so that customizations can't use it, or that UBL must
    not itself define substitution groups while letting customizations do it,
    or...?]
    
    
    
    - Rule 102 [The absence of a construct or data MUST NOT carry meaning.]
    
    See Tony's comments [I object to this rule on the basis that it seems
    unprovable.  How does one guarantee that the absence of something can never
    imply something meaningful?  I suspect that the idea behind this was
    something like "absence of an element should not be used to imply that the
    element has a null value", or something like that.  Can someone who was
    around for the creation of this rule comment on what the idea behind it
    was?]
    
    See Jim's comments [Although I've contributed next to nothing to UBL, I do
    seem to recall a discussion around this that is easily understood in the
    form of an example (totally contrived) that would violate the rule:
    
    "The CustomerID element may have the optional attribute customerIDType.
    If that optional attribute is missing, then the customer ID is a DUNS
    number."
    
    (Yes, I'm aware of W3C XML Schema support for defaults, but that's
    beside the point and probably not relevant at all given that the rule is
    not stated in the context of XML Schemas.)]
    
    
    - Rule 100 [[R 100] Union technique MAY be used to merge data types. UBL:
    Not applicable. Therefore, SHOULD NOT be used. (Code lists are excluded
    from this rule.)]
    
    See Dan's comments [How about:
    
    [R 100] Union MAY be used to create new simple datatypes as long as those
    types are not enumerated types (code lists).]
    
    
    - Rule 99 Simple Type Restriction [Simple Type restriction MAY be used.]
    
    See Dan's comments [This rule seems a little strange to me. The implication
    is that we have gone through all the schema capabilities and have indicated
    yes/no as to what can be used - is this true? If it is true, then I would
    recommend one rule that has a table of all the "features" with an
    indication of their use. It will be easier to find and maintain.]
    
    - Rule 98 [Built-in Simple Types SHOULD be used wherever possible.]
    
    
    See Dan's comment [Should we be masking the datatypes to protect against any
    future changes in definitions or other technologies? For instance,
    xsd:boolean allows 0/1 and true/false - do all languages implement boolean
    this way? Will another XML schema language define boolean this way, or will
    it add yes/no?
    
    I would suggest adopting and specifying the UBL definition of the datatypes
    (for now they might be exactly what is in schema), but we don't use the
    schema types directly. So I would define a ubl:Boolean that restricts
    xsd:boolean and then use ubl:Boolean everywhere instead of xsd:boolean.
    
    I believe in the long run this will be a useful "protection" from technology
    change.]
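
    As a rough illustration of the masking idea in the comment above, the
    sketch below declares a ubl:Boolean simple type that restricts xsd:boolean
    and applies its pattern facet to a few lexical values. The namespace
    urn:example:ubl:datatypes, the true|false pattern, and the helper names are
    assumptions made for this example, not agreed UBL definitions. Only the
    pattern facet is checked; this is not full schema validation.

```python
import re
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical UBL datatype schema: ubl:Boolean narrows xsd:boolean to the
# literals "true" and "false" (xsd:boolean itself also allows 1 and 0).
UBL_DATATYPES_XSD = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ubl="urn:example:ubl:datatypes"
            targetNamespace="urn:example:ubl:datatypes">
  <xsd:simpleType name="Boolean">
    <xsd:restriction base="xsd:boolean">
      <xsd:pattern value="true|false"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
"""

# Lexical forms permitted by xsd:boolean.
XSD_BOOLEAN_LITERALS = {"true", "false", "1", "0"}


def pattern_for(schema_text, type_name):
    """Return the pattern facet of the named global simple type, or None."""
    root = ET.fromstring(schema_text)
    for simple_type in root.findall(f"{{{XSD_NS}}}simpleType"):
        if simple_type.get("name") == type_name:
            facet = simple_type.find(
                f"{{{XSD_NS}}}restriction/{{{XSD_NS}}}pattern"
            )
            return facet.get("value") if facet is not None else None
    return None


def is_ubl_boolean(value):
    """Check a value against xsd:boolean and the ubl:Boolean pattern."""
    pattern = pattern_for(UBL_DATATYPES_XSD, "Boolean")
    if pattern is None or value not in XSD_BOOLEAN_LITERALS:
        return False
    return re.fullmatch(pattern, value) is not None
```

    With this definition, "true" and "false" are accepted while "1", "0", and
    "yes" are rejected. Relaxing or tightening the UBL definition later would
    only touch the one type declaration.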
    
    
    - Rule 97 [Mixed content MUST NOT be used (excluding documentation).
    
    Duplicate: See Rule 25]
    
    See Dan's comment [Hopefully mixed content is defined somewhere in a
    glossary; if not, please add a description here:
    
    An element has mixed content when it allows both data and additional
    element content.]
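
    To make the definition above concrete, here is a minimal sketch that flags
    mixed content in an instance document. It inspects instances rather than
    the mixed attribute in a schema, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(element):
    """True if the element holds child elements and non-whitespace text."""
    children = list(element)
    if not children:
        return False
    if element.text and element.text.strip():
        return True
    # Text that follows a child element is stored on that child's .tail.
    return any(child.tail and child.tail.strip() for child in children)


def find_mixed_elements(xml_text):
    """Return the tags of all elements in the instance with mixed content."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if has_mixed_content(el)]
```

    For example, <Note>See <Ref>PO-1</Ref> for details</Note> is reported,
    while an element that contains only child elements and indentation
    whitespace is not.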
    
    
    
    - Rule 96 [Two schemas shall be developed for each standard.  One schema
    shall be a run-time schema devoid of documentation. One schema shall be a
    fully annotated schema that employs XHTML for the annotations.]
    
    See Chee Kai's comment [Objection: Stating two versions of schemas that are
    otherwise the same except for documentation is a user-level
    optimisation/preference issue that shouldn't become a hard rule from
    UBL/NDR.
    
    This is not to say that we cannot publish as a non-normative
    add-on, or reading aid or optimized version (assuming
    documented version is normative).
    
    Suggest removing it.]
    
    See Dan's comment [Removing it allows one to do or not do this. I believe
    the general request is to provide these two forms of the schema and there
    should be a rule/requirement that states this. I'm fine with the
    requirement to send both, but I think it should still be a stated
    requirement/rule.]
    
    See Tony's comment [I am *completely* opposed to the suggestion that users
    can filter annotations out of their Schemas if required.  My experience is
    that only the < 1% of people who take part in XML committees have the
    confidence and experience for that.  Worse, it just introduces an
    unnecessary, globally-distributed quality control issue.  If it is so easy
    to filter out the documentation, then let us do it just once ourselves for
    everyone on the planet, and issue two equivalent normative Schemas,
    full-fat and low-fat.  I'm happy to contribute an XSLT script or Java class
    to do the job, if it is too hard to do from the Schema generation tools.]
    
    See Chee Kai's comment on Tony [Sorry, Tony, I am VERY STRONGLY against
    having more than ONE normative schema.
    
    Taking a step further based on proposed "normative optimized
    schema without documentation", why can't people ask for
    "normative optimized schema without documentation AND
    without whitespaces" (ie, one long line of "optimized"
    non-fat milk?), and why can't people ask for "normative
    optimized schema in compressed WBXML binary format",
    and all sorts of other "normative" versions?
    
    Where does that stop??
    
    So, no, please don't open Pandora's box.]
    
    See Bill's comment [I think we should remove this rule.  A user can filter
    out annotations if they are unwanted.]
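
    For reference in this discussion, filtering annotations out of a schema
    can be sketched in a few lines. This is an illustrative sketch, not the
    XSLT script or Java class offered above, and the function name is
    hypothetical. One caveat: if the stripped tree is serialized with
    ElementTree, namespace prefixes that are used only inside attribute values
    (for example in type="...") are not preserved, so a production tool would
    need a serializer that keeps the original namespace declarations.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
ANNOTATION = f"{{{XSD_NS}}}annotation"


def strip_annotations(root):
    """Remove every xsd:annotation under root in place.

    Returns the number of annotation elements removed.
    """
    removed = 0
    # Snapshot the tree first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == ANNOTATION:
                parent.remove(child)
                removed += 1
    return removed
```

    A caller would parse the annotated schema with ET.fromstring, call
    strip_annotations on the root, and write the result out as the run-time
    form.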
    
    
    - Rule 95 [[R 95]  Wildcards MUST NOT be used.
    
    
    ATG Decision: ACCEPTED. Wildcards MUST NOT be used. Editor's Note - does
    this include Any and AnyAttribute?]
    
    See Dan's comment [The wildcard for elements is any, the wildcard for
    attributes is anyAttribute, and there are wildcards for datatypes as well:
    anyType and anySimpleType. Because the question was asked, maybe these
    should all be listed as examples.]
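
    A minimal sketch of how the four wildcard constructs named above could be
    detected in a schema. The function name is hypothetical, and the check
    assumes that any prefix on anyType or anySimpleType maps to the XML Schema
    namespace. It does not catch element declarations that omit a type
    altogether, which default to anyType.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
WILDCARD_ELEMENTS = {f"{{{XSD_NS}}}any", f"{{{XSD_NS}}}anyAttribute"}
WILDCARD_TYPES = {"anyType", "anySimpleType"}


def find_wildcards(schema_text):
    """List wildcard constructs found in a schema, in document order."""
    root = ET.fromstring(schema_text)
    found = []
    for el in root.iter():
        if el.tag in WILDCARD_ELEMENTS:
            # Report the local name, e.g. "any" or "anyAttribute".
            found.append(el.tag.split("}")[1])
        for attr in ("type", "base"):
            # Compare the local part only (assumes the prefix is the XSD one).
            local = el.get(attr, "").split(":")[-1]
            if local in WILDCARD_TYPES:
                found.append(f"{attr}={local}")
    return found
```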
    
    
    - Rule 94 [The nillable attribute MUST NOT be used]
    
    See Chee Kai's comments [Amendment required, because prohibiting xsd:nil
    does NOT equate with prohibiting empty content elements.
    
    Suggested change:
    
    [R 94]: The nillable attribute MUST NOT be used.
            Empty content element MUST NOT be instantiated
            UNLESS it is expressly a user-intended indication
            to instantiate empty content for a given element.]
    
    See Dan's comment on Chee Kai [This isn't quite right. In schemas only
    elements of type string can be presented as empty in a data stream without
    using the NIL attribute. The user indication that nil is the appropriate
    interpretation is to use this schema attribute (or we have to create our
    own method).
    
    We need a statement more like this:
    
    Any element declared to have data must not appear in a data stream as an
    empty element. Elements declared as EMPTY may only appear in the data
    stream as an empty element. This rule then prevents the use of the nillable
    attribute in the schema definition and the corresponding xsi:nil attribute
    in the data stream.]
    
    
    
    
    - Rule 91 Types [All type declarations MUST be global.]
    
    
    See Dan's comments [I would modify this rule to be:
    
    [R 91]   For reuse and extension, all types MUST be named, which then
    requires their declarations be globally defined.]
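
    As a sketch of how this rule could be checked mechanically, the function
    below reports element and attribute declarations that carry an inline
    (anonymous) type instead of referring to a named global one. The function
    name is hypothetical and the check covers only this one pattern.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
DECLARATIONS = {f"{{{XSD_NS}}}element", f"{{{XSD_NS}}}attribute"}
TYPE_TAGS = {f"{{{XSD_NS}}}complexType", f"{{{XSD_NS}}}simpleType"}


def find_anonymous_types(schema_text):
    """Return names of declarations that define their type inline."""
    root = ET.fromstring(schema_text)
    offenders = []
    for decl in root.iter():
        if decl.tag not in DECLARATIONS:
            continue
        # A type element nested directly in a declaration is anonymous.
        if any(child.tag in TYPE_TAGS for child in decl):
            offenders.append(decl.get("name"))
    return offenders
```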
    
    
    - Rule 90 [The Abbreviations and Acronyms listed in Section XX must always
    be used.]
    
    See Dan's comments [This should be combined so there is only one rule, #87:
    
    [R 87]   The only Abbreviations and Acronyms allowed for names used in the
    schemas or datastreams (i.e., Element, Attribute, and Simple and Complex
    Type names) are those listed in Section XX. Code list values are not
    controlled by this rule. [Editor's note: Section XX to be a section in the
    NDR document.  Currently this section only includes ID for Identifier,
    DUNS, and URI.]]
    
    See Mike's comments [That doesn't really capture Rule 90. I think two rules
    are necessary:
    
    [R 87]   The only Abbreviations and Acronyms allowed in the naming of
    Element, Attribute, and Simple and Complex Types are those contained in the
    list published in Section XX.  [Editor's note: Section XX to be a section
    in the NDR document.  Currently this section only includes ID for
    Identifier, DUNS, and URI.]
    
    [R 90]   The Abbreviations and Acronyms listed in Section XX MUST be used
    when referring to (or 'in the place of'?) their corresponding references.]
    
    
    
    - Rule 89 [Acronyms and abbreviations must only be taken from the latest
    version of the Pocket Oxford English Dictionary. The first occurrence listed
    for a word will be the preferred item to be used.]
    
    See Dan's comments [I would change the text:
    
    [R 89]   When the use of a new acronym or abbreviation is approved for use
    in UBL documents, the acronym or abbreviation MUST be taken from the latest
    version of the Pocket Oxford English Dictionary. If more than one value is
    provided, the first occurrence listed for a word will be the preferred item
    to be used.
    
    
    Do we need something here to handle possible collisions with existing
    abbreviations? What about abbreviations that make words that may/may not be
    used elsewhere? Seems like this rule or the previous one that said they can
    be added should state some additional requirements (assuming you agree we
    should avoid these problems). Also, what happens if we agree that we want an
    abbreviation and one is not in the dictionary? Should we state a method for
    creating one, or should it be that we don't use the abbreviation?
    
    I have seen a general rule for creation (when they don't exist otherwise)
    to be: "Drop all the vowels from the word."]
    
    See Tony's comment on Dan [> I have seen a general rule for creation (when
    > they don't exist otherwise) to be: "Drop all the vowels from the word."
    
    Yes, that is the standard rule one uses to make acronyms accessible to
    native English speakers, and inaccessible to everyone else.]
    
    See Dan's comment on Tony [I didn't say anything about clarity ;-) it just
    gives you an abbreviation. I don't know that any abbreviation or acronym is
    going to be useful for a non-native speaker. By the way I'm having enough
    trouble with BIE and ACC and BCT in trying to figure out what these really
    mean. Even when expanded they don't mean much.]
    
    - Rule 86 [In the context of a schema, information that expresses
    correspondences between data elements in different classification schemes
    ("mappings") may be regarded as metadata. This information should be
    accessible in the same manner as the rest of the information in the schema.]
    
    
    See Bill's comment [What do the terms "classification schemes" and
    "mappings" mean in this rule?]
    
    
    
    
    - Rule 85 [Instances conforming to schemas should be readable and
    understandable, and should enable reasonably intuitive interactions.]
    
    
    See Dan's comments [I don't think the Instance has anything to do with the
    readability and understandability. Seems to me this should be a requirement
    for the schema design/generation process. This seems more appropriate:
    
    [R 85]   Schemas should be designed (or generated) such that instances
    conforming to them are readable and understandable, and should enable
    reasonably intuitive interactions.]
    
    
    See Mike G's comments on Dan [[R 85]   Schemas should be designed (or
    generated) such that instances conforming to them are readable and
    understandable, and should enable reasonably intuitive interactions.
    
    Better, but I'm still wondering if the judgment of such things is too
    subjective for a rule ...]
    
    
    - Rule 84 [UBL messages must express semantics fully in schemas and not rely
    merely on well-formedness.]
    
    See Dan's comments [What are we really trying to say with this rule?
    Is it that any UBL document should have a matching Schema? Should there also
    be a statement about validity based upon that schema? I would think
    something like this is more appropriate:
    
    [R 84]   Any UBL message MUST have a corresponding schema and the data
    stream must be valid based upon that schema. You should not rely
    merely on well-formedness when defining and building a message.]
    
    
    See Tony's comments [I don't think Schemas could ever be said to capture
    semantics.  They can
    capture some of the business rules, but I rarely can they capture all of the
    business rules.  I'm not convinced that trying to modify or adjust real
    business rules just to fit in with what Schemas is a good idea.
    
    Let me reword this in English, or a close approximation thereof:
    
    I don't think Schemas could ever be said to capture semantics.  They can
    capture some of the business rules, but rarely can they capture all of the
    business rules.  I'm not convinced that trying to modify or adjust real
    business rules just to fit in with what Schemas can do is a good idea.]
    
    See Mike G's comments [Or:
    
    [R 84]   All UBL messages MUST validate to a corresponding schema. You
    should not rely merely on well-formedness when defining and building a
    message.]
    
    
    
    - Rule 83    [Trading partners may agree on other character encodings to use
    among themselves. It is recommended in all cases that encoding declarations
    be provided in the XML declarations of UBL documents.]
    
    
    See Dan's comments [This seems too weak. I would change it to read:
    
    [R 83]   UBL documents MUST always identify their character encoding with
    the XML declaration. It is also recommended that, for portability, UTF-8
    or UTF-16 be used, although trading partners may agree on other character
    encodings to use among themselves.]
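
    A minimal sketch of how the proposed requirement could be checked: read
    the XML declaration from the start of a document and report the declared
    encoding, if any. The function name is hypothetical. The sketch assumes
    the document starts with an ASCII-compatible encoding and no byte order
    mark, so UTF-16 input would need to be detected and decoded first.

```python
import re

# Matches an XML declaration at the start of the document and captures the
# encoding pseudo-attribute when it is present.
XML_DECL = re.compile(
    rb"""^<\?xml\s+version\s*=\s*["'][^"']+["']"""
    rb"""(?:\s+encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["'])?"""
)


def declared_encoding(document_bytes):
    """Return the declared encoding name, or None if it is not declared."""
    match = XML_DECL.match(document_bytes)
    if match is None or match.group(1) is None:
        return None
    return match.group(1).decode("ascii")
```

    A document with no XML declaration, or with a declaration that omits the
    encoding, yields None and would fail the MUST in the proposed wording.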
    
    
    - Rule 82 [UBL documents must use the same legal characters in XML
    character data that are listed in the XML Recommendation, including tab,
    carriage return, line feed, and the legal characters of Unicode and
    ISO/IEC 10646.]
    
    See Dan's comments [I would change the rule text:
    
    [R 82]   UBL documents must conform to all the requirements of the most
    current XML Recommendation.
    
    Why are we worried just about these characters? I don't know that this is
    even required, but if you want to keep something for this rule, then I
    would make the more general and inclusive rule as above. Otherwise it's
    like, "Oh, all I need are these characters, I can forget about all the other
    XML requirements".]
    
    - Rule 81 extending [[R 81]  A UBL message set may be extended where
    desirable if the business function of the UBL original is retained, but the
    message exists within its own business context.]
    
    See Mike's comments [Recommend: A UBL message set may only be extended if
    the business function of the UBL original is retained, and the message
    exists within its own business context.]
    
    
    - Rule 77 Duration [Duration may be expressed by the BCC Duration.]
    
    
    See Dan's comment [Also, is this wording a little weak? If I have a
    duration, I have a duration, and it should always be represented with the
    proper BCC. When or why would you allow something different?]
    
    - Rule 76 Point in time [The intervals in a point in time should be
    represented by a single BCC indicated by the choice operator, i.e.
    FrequencyDuration, FrequencyYear, etc.]
    
    See Tony's comment [I don't see how a point in time can have intervals.  I
    think this needs rewording, at the least.]
    
    - Rule 72 Representation Term [For each representation term the equivalent
    data type must be used, i.e. if the representation term Date is used, then
    the corresponding built-in datatype xsd:date must be used.]
    
    See Dan's comment [Wouldn't it be better to have a single rule and table
    that lays out the relation between the representation term and its data
    type (maybe even the BCC involved)? If we did that, there would be one place
    to look this information up, and we could remove the need for a new rule
    with each addition of a new type and remove rules like this and #68, which
    seems to be a generalization of this without the specific mapping.
    
    If you need something specific to report against, it could be Rule #72-Date
    or Rule #72-Decimal.]
    
    - Rule 71 Period Details [A period can be expressed using the Aggregate Core
    Component (ACC) PeriodDetails. The ACC is divided into 3 representation
    types: Date, Time and DateTime. One of these must be selected. Each option
    has a start and end date, start and end time, or start DateTime and end
    DateTime.]
    
    
    
    See Dan's comment [This is even weaker than the BCC Duration requirements.
    Also, there is an explanation of the design that I don't think should be in
    the rule.
    
    Here is how I would change this based upon the current design; if you agree
    we should be more specific in the requirements, then MAY would change to
    MUST.
    Rule 71 Period Details [A period MAY be expressed using the ACC
    PeriodDetails.]
    Also, we are not consistent in how acronyms are spelled out. BCC is used
    without an explanation, but here we spell out ACC. There should be a
    standard glossary that has all these terms with maybe a pointer to control
    documentation for the definition.]
    
    
    5.  Action Items and Issues:
    
        A.  Modeling changes and schedule for schema fixes for XML instances and
        FPSC,
    
        B.  Other issues or comments per the 0p80 Review?
    
    
    6. AOB
    
    
    8.  Adjourn
    Next NDRSC teleconference call is 23 July 2003, Mark Chairing.