Subject: Code list discussion so far
Hello UBL TC,
We held the first of two discussions to resolve the code list
issue this morning; the second of the two will take place Thursday
afternoon (1 p.m. in Ottawa). Preliminary outcomes are as
follows.
- There are serious use cases that require modifications to code
lists in the interval between official revisions of code lists.
This is especially true in the case of industry-specific code
lists.
- Solutions that require the namespaces in the UBL schemas to be
changed when a code list is modified are very expensive.
- There appear to be three ways to accomplish modifications to
UBL schemas without changing the namespaces:
1. Users simply modify the file containing the code list while
leaving everything else alone. This method is being used
successfully in Denmark. Obviously we cannot prevent users
from doing this, and given a proper notification procedure,
it seems to work pretty well.
2. We explicitly enable modifications to the code lists by
embedding a "substitution group hook" in the UBL schemas as
described by Tony Coates and Marty Burns. While cleaner
from a conceptual point of view, we're finding it difficult
to see any big advantage of this approach over simply
swapping out one code list module and replacing it with
another one. The basic notification and management issues
appear to be about the same.
3. We take a radically different view of the problem by
distinguishing between two kinds of code lists:
a. Code lists that define codes used only in UBL (status
codes, for example). Such lists are typically
well-defined, are completely under our control, and are
not (or should not be) extensible.
b. Code lists that are defined by outside agencies and
referenced in UBL. These are conceptually distinct from
the first category even if some happen to be bundled into
the UBL package.
Making this distinction would allow us to take two different
approaches to code list definition. Code lists of the first
kind could be defined in schema modules using enumerations
just as we do in 1.0. Code lists of the second kind could
be defined in XML instances of a standard code list schema,
with the codes of this kind declared as unrestricted
strings. Ordinary XSD validation would be used for the
first kind of code list, just as in 1.0, whereas validation
of the second kind would typically take place in a second
validation phase using something like Schematron.
Participants in the discussion noted the following points
regarding this third alternative:
- Publishing standard code lists as instances of a standard
code list schema is much closer to the basic XML concept
than publishing code lists as schema fragments. In fact,
the whole namespace problem we've been wrestling with
here can be seen as an artifact of the attempt to use
things that should change very rarely (schemas) to
publish things that people often want to modify (code
lists). One result of this has been that instead of
recommending a standard for code lists using a standard
formalism (such as an XSD schema) we have been
recommending a template for code list schemas for which
there is no standard formalism, just a complex set of
prose descriptions supplemented by examples. The code
list paper published in UBL 1.0 admits it to be
"desirable that the [code list] data model be expressed
in a machine readable form" but can do no more than to
place this desirable development in some distant future
where a formalism exists for doing so. The definition of
a standard XML schema for code lists would solve this
simply by putting such a definition at the appropriate
conceptual level.
- Defining codes as unrestricted strings would obviously
make it trivially easy to meet all the requirements for
ad hoc code list modification. The tradeoff would be
that the code lists themselves could no longer be used to
directly drive XSD validation. It is unlikely, however,
that any major user of the UBL schemas would be satisfied
with just a simple check against an enumeration before
entering the document into an accounting application; it
is much more likely that something like a Schematron
check would be performed following simple XSD validation.
This is in fact what is done in the Danish
implementations, and it more closely reflects an initial
premise of the UBL code list effort that most code list
validation would take place at the application level
(report of the NDRSC, 18 March 2002).
- Post-schema validation appears to cause fewer problems
than those reported by initiatives that are attempting
to use substitution groups. We believe it to
be significant that this is the approach adopted for ISO
20022 (banking).
- We could provide a mechanism (an XSLT transformation, for
example) that would take *any* code list published using
the standard code list schema and generate code list
schema modules just like the ones we've included in UBL
1.0. (The XSLT would, in effect, provide the missing
formalism needed to specify the construction of the
schema modules in a machine-readable way.) In fact, we
could provide the modules so generated as part of the
release package together with instructions for validating
instances against these generated modules in a second XSD
pass, thus providing all of the advantages of validation
against enumerations while still allowing easy
modification of code lists.
- In a separate decision, the TC decided this morning to
accept the UDT and CCT schema modules defined by
UN/CEFACT ATG2 rather than defining and maintaining our
own. Those schema modules reference a few standard code
lists (currencies, language codes, units of measure, MIME
media types) that would retain the old enumeration form.
As most real-world situations requiring code list
modification are encountered not with these very basic
standard lists but with industry-defined code lists, this
is not considered a problem.
- Mark Crawford wished to be put on record as having
reservations about this approach for two reasons:
1. The desirability of maintaining XSD validation of
codes, and
2. The wish to maintain alignment with ATG2, which
intends to specify all code lists as schema modules.
It is recognized, however, that ATG2 does not have
customization as a goal, whereas we do (though to what
extent still remains to be determined).
- To make this approach practical for users, it will be
necessary to provide documentation showing users how to
implement a post-schema code list validation phase using
Schematron. Bryan Rasmussen has volunteered to create
this if his management will approve the work.
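As an illustration only, the two mechanisms discussed under the third
alternative (a post-schema validation phase, and generation of
enumeration schema modules from a code list instance) might be sketched
as follows. This is a minimal sketch in Python, standing in for the
Schematron and XSLT tools named above. The code list format
(CodeList/Code elements, the id and value attributes), the sample
document, the "ContentType" naming suffix, and the urn:example namespace
are all hypothetical; no standard code list schema has been agreed.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical code list instance; element and attribute names are
# invented for illustration and are not part of any UBL release.
CODE_LIST_XML = """
<CodeList id="PackagingType" agency="ExampleIndustryGroup" version="1.1">
  <Code value="BX">Box</Code>
  <Code value="CR">Crate</Code>
  <Code value="PL">Pallet</Code>
</CodeList>
"""

# Hypothetical business document in which the code is an unrestricted
# string as far as XSD validation is concerned.
DOCUMENT_XML = """
<Order>
  <Line><PackagingTypeCode>BX</PackagingTypeCode></Line>
  <Line><PackagingTypeCode>ZZ</PackagingTypeCode></Line>
</Order>
"""


def load_codes(code_list_xml):
    """Return the set of code values declared in a code list instance."""
    root = ET.fromstring(code_list_xml)
    return {code.get("value") for code in root.iter("Code")}


def find_invalid_codes(document_xml, element_name, allowed):
    """Second validation phase: list code values not in the code list."""
    root = ET.fromstring(document_xml)
    invalid = []
    for element in root.iter(element_name):
        value = (element.text or "").strip()
        if value not in allowed:
            invalid.append(value)
    return invalid


def generate_enumeration_schema(code_list_xml, target_namespace):
    """Build an XSD module with one enumerated simple type per code list."""
    ET.register_namespace("xs", XS)
    root = ET.fromstring(code_list_xml)
    schema = ET.Element(f"{{{XS}}}schema",
                        {"targetNamespace": target_namespace})
    simple_type = ET.SubElement(schema, f"{{{XS}}}simpleType",
                                {"name": root.get("id") + "ContentType"})
    restriction = ET.SubElement(simple_type, f"{{{XS}}}restriction",
                                {"base": "xs:normalizedString"})
    for code in root.iter("Code"):
        ET.SubElement(restriction, f"{{{XS}}}enumeration",
                      {"value": code.get("value")})
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    allowed_codes = load_codes(CODE_LIST_XML)
    print(find_invalid_codes(DOCUMENT_XML, "PackagingTypeCode",
                             allowed_codes))
    print(generate_enumeration_schema(CODE_LIST_XML,
                                      "urn:example:codelist"))
```

In this sketch, modifying a code list means editing only the code list
instance; the document schema and its namespace are untouched, and the
enumeration module can be regenerated whenever a second XSD pass is
wanted.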
Everyone interested in this subject should be prepared to
participate in tomorrow's follow-up discussion (1 p.m. Ottawa time,
Thursday 11 August, at the usual UBL conference number).
Jon