ubl-ndrsc message
Subject: Re: [ubl-lcsc] Re: [ubl-ndrsc] UBL: question on CCT language component
I agree with Stephen on this.
In reviewing the CCT/SDT and UDT schemas we have been using up to draft 4
of 1.0, it appears we have dropped all use of normalizedString.
This is despite the resolution in San Francisco that token is too restrictive.
From reading the XML Schema Primer (part two), I gather that:
string = a set of finite-length sequences of characters
normalizedString = strings that do not contain the carriage return (#xD),
line feed (#xA) nor tab (#x9) characters
token = strings that do not contain the carriage return (#xD), line feed (#xA)
nor tab (#x9) characters, that have no leading or trailing spaces (#x20)
and that have no internal sequences of two or more spaces
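To make the practical difference concrete, here is a minimal Python sketch of the whitespace handling a schema processor applies to each type (the "replace" and "collapse" behaviours). The function names are my own, purely for illustration; this is not taken from any UBL tooling.

```python
import re

def normalize_string(value: str) -> str:
    """xsd:normalizedString: each tab, line feed and carriage return
    is replaced by a single space; runs of spaces are preserved."""
    return re.sub(r"[\t\n\r]", " ", value)

def to_token(value: str) -> str:
    """xsd:token: as above, then runs of spaces are collapsed to one
    space and leading/trailing spaces are removed."""
    return re.sub(r" +", " ", normalize_string(value)).strip(" ")

raw = "  AB  12\t34\n"
print(repr(normalize_string(raw)))  # internal double space survives
print(repr(to_token(raw)))          # internal double space is lost
```

An identifier in which two consecutive spaces carry business meaning survives as a normalizedString but not as a token.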
What I understood we agreed is that the XSD representation of the content
of core component types should be:
Amount. Content = xsd:decimal
Binary Object. Content = xsd:base64Binary
Code. Content = xsd:normalizedString (currently this still says xsd:token)
Date Time. Content = xsd:dateTime
Identifier. Content = xsd:normalizedString (currently this still says xsd:token)
Indicator. Content = xsd:boolean
Measure. Content = xsd:decimal
Numeric. Content = xsd:decimal
Quantity. Content = xsd:decimal
Text. Content = xsd:string
I too am concerned that we seem to revert every time. The solution
is to drive these schemas from the Library rather than have them added on
later.
I suggest we correct these for draft 6 and propose that we do this by making
the mapping part of the CCT/SDT/UDT models.
As I mentioned in the discussion we had in Washington on the Friday morning,
all supplementary components resolve to be a ". Content" for one of these
core component types. So a supplementary component called "Numeric. Format.
Text" is a "Text. Content" and a supplementary component called "Measure
Unit. Code List Version. Identifier" is of "Identifier. Content", and so
forth.
This means that theoretically we need only define datatypes as listed above
and the supplementary components will take their xsd:datatypes from these.
However, it appears we have chosen in some cases to shortcut this principle
by using inbuilt XSD datatypes. So we find that if the representation is
"Identifier" and the property term is "Uniform. Resource" we use xsd:anyURI
and if the property term is "Language" we use xsd:language.
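As a rough illustration of the principle and its shortcuts, the lookup could be sketched in Python as below. The table reproduces the content types listed earlier in this message; the names CCT_CONTENT_TYPE, SHORTCUTS and resolve are hypothetical, and the rule of treating the last term of the name as the representation term is an assumption made for the sketch, not something taken from the schemas.

```python
# Content types as listed in this message (Code and Identifier per the
# agreed resolution, not the xsd:token currently in the drafts).
CCT_CONTENT_TYPE = {
    "Amount": "xsd:decimal",
    "Binary Object": "xsd:base64Binary",
    "Code": "xsd:normalizedString",
    "Date Time": "xsd:dateTime",
    "Identifier": "xsd:normalizedString",
    "Indicator": "xsd:boolean",
    "Measure": "xsd:decimal",
    "Numeric": "xsd:decimal",
    "Quantity": "xsd:decimal",
    "Text": "xsd:string",
}

# The shortcuts described above:
# (property term, representation term) -> built-in XSD datatype.
SHORTCUTS = {
    ("Uniform. Resource", "Identifier"): "xsd:anyURI",
    ("Language", "Identifier"): "xsd:language",
}

def resolve(supplementary_component: str) -> str:
    """Return the XSD datatype for a supplementary component name.

    Assumption: the first term is the object class, the last term is the
    representation term, and everything in between is the property term.
    """
    parts = [p.strip() for p in supplementary_component.rstrip(".").split(".")]
    representation = parts[-1]
    property_term = ". ".join(parts[1:-1])
    shortcut = SHORTCUTS.get((property_term, representation))
    return shortcut or CCT_CONTENT_TYPE[representation]

print(resolve("Numeric. Format. Text"))
print(resolve("Measure Unit. Code List Version. Identifier"))
# Illustrative name only, to exercise the anyURI shortcut.
print(resolve("Code List. Uniform. Resource. Identifier"))
```

Without the SHORTCUTS table, every supplementary component would take its datatype from the ten content types alone.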
The nett result of this will be that we do not have xsd:token anywhere -
which is fine by me.
Does this agree with your findings, Anne?
Stephen Green wrote:
Anne
I'm worried that in all this we might inadvertently drop the LC/NDR
resolution to use xsd:normalizedString wherever there is the possibility
that more than one space need be preserved (having business meaning).
I think it was decided for Identifier but there was less certainty about
using it for Code.
*I note that most Supp Comps are identifiers.*
Perhaps this got neglected as changes were made to the Schemas and
then the omissions reinforced as we discussed xsd:token versus xsd:string.
Steve