Phil:
Let me try to answer your points in a general way:
First, when we talk about "code lists" I am assuming that we are restricting
ourselves (as we did in xCBL) to those lists of commonly used, well-defined,
externally-maintained "codes" that come from places like X12, ISO, and the
UN/CEFACT Codes Working Group. For xCBL, we harmonized these codes in some
cases, and in others, we chose to subset them for our own uses, but we have
clear maps back to the definitions commonly understood in business today.
We *cannot* use any controlled construction in UBL - be it an element or
attribute name, or a value in an enumerated list - that we do not in some
way completely and unambiguously define. Otherwise, we have failed in
creating a useful language for e-business.
In general, I agree with you - we *must* be unambiguous, using formal
references to the work of other bodies on which we base ours, if indeed we
choose to do this.
As for alphabetic constraints, XSD does give us the ability to constrain
values with a pattern facet, which takes a regular expression, so I think we
could do what you suggest, but a simple enumeration datatype will get us to
the same place. I don't think parsers support the pattern facet yet,
although they might.
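
To make the difference between the two kinds of constraint concrete, here is
a minimal sketch in Python rather than XSD. The four currency codes are only
a hypothetical subset chosen for illustration, and the function names are
mine, not anything from UBL or xCBL. XSD patterns match the whole value, so
the sketch uses a full match to imitate that.

```python
import re

# Hypothetical subset of ISO 4217 currency codes that a schema might enumerate.
ENUMERATED_CODES = {"USD", "EUR", "GBP", "JPY"}

# Pattern-style constraint: any three upper-case ASCII letters.
CODE_PATTERN = re.compile(r"[A-Z]{3}")


def valid_by_pattern(value: str) -> bool:
    """Check only the shape of the value, as a pattern constraint would."""
    return CODE_PATTERN.fullmatch(value) is not None


def valid_by_enumeration(value: str) -> bool:
    """Check that the value is one of the explicitly listed codes."""
    return value in ENUMERATED_CODES


for candidate in ("USD", "XQZ", "usd"):
    print(candidate, valid_by_pattern(candidate), valid_by_enumeration(candidate))
```

A pattern accepts any value of the right shape, including ones nobody has
defined, while an enumeration accepts only the values we have actually
listed, which is what we need if every value must be unambiguously defined.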
As for validation, you do have a good point - few users ever support all of
the codes in a long code list. But the validation issue depends on
something else. My mental picture of how SMEs will use this stuff has a
lower bound, which is that they view business documents in a browser, based
on a hosted application that can do only two simple things: (1) parse the
document against the schemas; and (2) run it through an XSL or CSS
stylesheet to produce a display form compatible with today's web-browser
technology.
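
As a rough illustration of that lower bound, the two steps can be sketched
in a few lines of Python. This is only a sketch under stated assumptions:
the Order, ID and Currency element names and the list of allowed codes are
hypothetical, the hand-written check stands in for real schema validation
(the Python standard library has no XSD validator), and the string-building
function stands in for an XSL stylesheet.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical enumerated code list; a real schema would define this.
ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}


def check_document(xml_text):
    """Step 1 stand-in: parse, then check the coded value against the list."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    code = root.findtext("Currency")
    if code not in ALLOWED_CURRENCIES:
        raise ValueError(f"unknown currency code: {code!r}")
    return root


def render_html(root):
    """Step 2 stand-in: produce a simple display form for a browser."""
    rows = "".join(
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text or '')}</td></tr>"
        for child in root
    )
    return f"<table>{rows}</table>"


# Hypothetical business document, not a real UBL or xCBL instance.
SAMPLE = "<Order><ID>42</ID><Currency>EUR</Currency></Order>"
print(render_html(check_document(SAMPLE)))
```

The point is that nothing in this pipeline knows anything about the codes
beyond what the list itself says, so whatever checking happens has to come
from the schema.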
There are several companies - mine among them - that offer this type of
low-level, hosted functionality, and it is generally seen as the basic
replacement for FAX-based processes used by the EDI VANs, called "Rip &
Read". These applications - because they are generic, XML-based applications
- typically do not offer the detailed mapping between sets of codes that is
common in more fully automated EDI implementations.
Because of this, I feel that being able to validate code lists with generic
XSD parsers is very important. So is limiting, to the extent practical, the
sets of enumerated values that people use to express semantics within UBL.
Cheers,
Arofan