UBL Naming and Design Rules SC

Re: [ubl-lcsc] Re: [ubl-ndrsc] UBL: question on CCT language component

    Posted 03-04-2004 03:44
    ubl-ndrsc message


    Subject: Re: [ubl-lcsc] Re: [ubl-ndrsc] UBL: question on CCT language component


    I agree with Stephen on this.

    In reviewing the CCT/SDT and UDT schemas we have been using up to draft 4 of 1.0, it appears we have dropped all use of normalizedString.
    This is despite the resolution in San Francisco that token is too restrictive.  From reading the XML Schema Primer (part two), I gather that:
    string = a set of finite-length sequences of characters
    normalizedString = strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters
    token = strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces
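
To illustrate why token is the more restrictive choice, here is a minimal Python sketch of the three whitespace behaviours (preserve, replace, collapse) that XML Schema associates with string, normalizedString and token. The function names and the sample value are hypothetical, and the sketch only mimics the whitespace processing, not a full schema validator.

```python
import re

def ws_preserve(value: str) -> str:
    """xsd:string: the value is left exactly as written."""
    return value

def ws_replace(value: str) -> str:
    """xsd:normalizedString: each tab, line feed and carriage return becomes a space."""
    return re.sub(r"[\t\n\r]", " ", value)

def ws_collapse(value: str) -> str:
    """xsd:token: as 'replace', then runs of spaces are squeezed to one
    and leading/trailing spaces are removed."""
    return re.sub(r" +", " ", ws_replace(value)).strip(" ")

# Hypothetical identifier in which the double space carries business meaning.
raw = "  AB  12\tX "
print(repr(ws_preserve(raw)))
print(repr(ws_replace(raw)))
print(repr(ws_collapse(raw)))
```

Only the collapse behaviour loses the internal double space, which is the case the normalizedString resolution was meant to protect.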

    What I understood we agreed is that the XSD representation of the content of core component types should be:

    Amount. Content = xsd:decimal
    Binary Object. Content = xsd:base64Binary
    Code. Content = xsd:normalizedString (currently this still says xsd:token)
    Date Time. Content = xsd:dateTime
    Identifier. Content = xsd:normalizedString (currently this still says xsd:token)
    Indicator. Content = xsd:boolean
    Measure. Content = xsd:decimal
    Numeric. Content = xsd:decimal
    Quantity. Content = xsd:decimal
    Text. Content = xsd:string
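
One way to keep this list from drifting is to hold it as data and compare each schema draft against it. The following is only a sketch: the dictionary repeats the list above, while the `mismatches` helper and the `draft` sample (which reflects the two xsd:token entries noted above) are hypothetical names introduced for illustration.

```python
# Agreed XSD type for the ". Content" of each core component type (from the list above).
AGREED = {
    "Amount": "xsd:decimal",
    "Binary Object": "xsd:base64Binary",
    "Code": "xsd:normalizedString",
    "Date Time": "xsd:dateTime",
    "Identifier": "xsd:normalizedString",
    "Indicator": "xsd:boolean",
    "Measure": "xsd:decimal",
    "Numeric": "xsd:decimal",
    "Quantity": "xsd:decimal",
    "Text": "xsd:string",
}

def mismatches(current: dict) -> dict:
    """Return {type name: (found, agreed)} for every entry that differs from AGREED."""
    return {
        name: (current.get(name), agreed)
        for name, agreed in AGREED.items()
        if current.get(name) != agreed
    }

# Hypothetical snapshot of a draft in which Code and Identifier still use xsd:token.
draft = {**AGREED, "Code": "xsd:token", "Identifier": "xsd:token"}

for name, (found, agreed) in sorted(mismatches(draft).items()):
    print(f"{name}. Content: found {found}, agreed {agreed}")
```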

    I too am concerned that we seem to revert every time.  The solution is to drive these schemas from the Library rather than have them added on later.
    I suggest we correct these for draft 6 and propose that we do this by making the mapping part of the CCT/SDT/UDT models.

    As I mentioned in the discussion we had in Washington on the Friday morning, all supplementary components resolve to a ". Content" for one of these core component types.  So a supplementary component called "Numeric. Format. Text" is a "Text. Content" and a supplementary component called "Measure Unit. Code List Version. Identifier" is an "Identifier. Content", and so forth.
    This means that theoretically we need only define datatypes as listed above and the supplementary components will take their xsd:datatypes from these.
    However, it appears we have chosen in some cases to shortcut this principle by using built-in XSD datatypes.  So we find that if the representation is "Identifier" and the property term is "Uniform. Resource" we use xsd:anyURI, and if the property term is "Language" we use xsd:language.
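
The resolution rule and its two shortcuts can be sketched as a small lookup. This assumes the content types listed earlier; the function names, the `use_shortcuts` flag and the way the representation term is taken from the end of the name are illustrative assumptions, not part of any UBL schema or tool.

```python
# Content types from the agreed list (only the representation terms used here).
CONTENT_TYPES = {
    "Code": "xsd:normalizedString",
    "Identifier": "xsd:normalizedString",
    "Text": "xsd:string",
}

# Shortcuts described above, keyed by (property term, representation term).
SHORTCUTS = {
    ("Uniform. Resource", "Identifier"): "xsd:anyURI",
    ("Language", "Identifier"): "xsd:language",
}

def representation_term(entry_name: str) -> str:
    """Assumption: the representation term is the last '. '-separated part of the name."""
    return entry_name.rsplit(". ", 1)[-1]

def xsd_type(property_term: str, rep_term: str, use_shortcuts: bool = True) -> str:
    """Resolve a supplementary component to an XSD type via its representation term."""
    if use_shortcuts and (property_term, rep_term) in SHORTCUTS:
        return SHORTCUTS[(property_term, rep_term)]
    return CONTENT_TYPES[rep_term]

print(xsd_type("Format", representation_term("Numeric. Format. Text")))
print(xsd_type("Code List Version", "Identifier"))
print(xsd_type("Uniform. Resource", "Identifier"))
print(xsd_type("Uniform. Resource", "Identifier", use_shortcuts=False))
```

With the shortcuts switched off, every supplementary component falls back to the plain ". Content" type, which is the strict form of the principle.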

    The nett result of this will be that we do not have xsd:token anywhere - which is fine by me.

    Does this agree with your findings, Anne?


    Stephen Green wrote:
    Anne
    
    I'm worried that in all this we might inadvertently drop the LC/NDR
    resolution to use xsd:normalizedString wherever there is the possibility
    that more than one space need be preserved (having business meaning).
    
    I think it was decided for Identifier but there was less certainty about
    using it for Code.
    
    *I note that most Supp Comps are identifiers.*
    Perhaps this got neglected as changes were made to the Schemas and
    then the omissions reinforced as we discussed xsd:token versus xsd:string.
    
    Steve