Folks:
I've thought a lot about this issue, and I believe the trade-off is this:
(1) Using elements to represent codes is one possibility. It gives us the
advantage of being able to validate a code against a controlled list. Also, if
we wrap these in a parent type, the list can be extended. (Ugly, but it
works.) Companies that have expensive validation software to handle
code-lists can already do this, but the little guys cannot. With this
approach everyone gets code-list standardization and validation for free,
which I think is good. The down-side is that designing and maintaining these
code-lists is a bitch. (Many, many versions of our schemas that do nothing
but update code-lists). Perhaps we could have special namespaces for
codelists, and have special rules so that versioning is not done by
namespace but with an attribute? Just a thought.
(2) Using the "string" approach will absolutely defeat any hope of
interoperability without benefit of expensive translation software. The EDI
experience has shown that people will happily invent their own
non-interoperable codes. In xCBL we allowed for this with the "CodedOther"
approach: all code lists have an enumeration of choices, and then a sister
element that holds a non-standard code. If you choose the "Other" code, then
you have to fill in the string. This approach is not, in my opinion, the
best solution, but it may be the best we can do with XML Schema. Using just
a string makes it unnecessary to maintain codelists at all, but sacrifices
much of the benefit of having UBL, in my opinion.
(3) Codelists as enumerated data types. This is my preferred approach - a
codelist is, in fact, an enumeration of specific semantics, and this format
makes that clear and the list easier to manage. What we need is an ability to
extend these enumerations (the lack of one is a major failing of XML Schema).
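To make the "CodedOther" pattern from (2) concrete, here is a minimal sketch in Python. The element names (Currency, CurrencyCoded, CurrencyCodedOther) and the code values are made up for illustration and are not taken from xCBL or UBL. It checks two things: that the code is in the enumeration, and that choosing "Other" comes with a filled-in string. The second rule is checked here in application code; the sketch makes no claim about how xCBL itself enforced it.

```python
import xml.etree.ElementTree as ET

# Hypothetical code list; the values are illustrative only.
CURRENCY_CODES = {"USD", "EUR", "JPY", "Other"}


def check_coded_other(xml_text):
    """Return a list of problems found in a <Currency> fragment (empty if none)."""
    root = ET.fromstring(xml_text)
    code = (root.findtext("CurrencyCoded") or "").strip()
    other = (root.findtext("CurrencyCodedOther") or "").strip()

    problems = []
    if code not in CURRENCY_CODES:
        # The code is not one of the enumerated choices.
        problems.append(f"unknown code: {code!r}")
    elif code == "Other" and not other:
        # "Other" was chosen, so the sister element must carry the real code.
        problems.append("code is 'Other' but CurrencyCodedOther is empty")
    return problems


standard = "<Currency><CurrencyCoded>EUR</CurrencyCoded></Currency>"
custom = (
    "<Currency><CurrencyCoded>Other</CurrencyCoded>"
    "<CurrencyCodedOther>XBT</CurrencyCodedOther></Currency>"
)
incomplete = "<Currency><CurrencyCoded>Other</CurrencyCoded></Currency>"

print(check_coded_other(standard))    # []
print(check_coded_other(custom))      # []
print(check_coded_other(incomplete))  # one problem reported
```

Note that the "custom" document passes even though nobody else knows what XBT means, which is exactly the interoperability hole the pattern leaves open.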
Let me suggest:
(1) Dedicated namespaces for codelists (one per codelist, or related group
of codelists)
(2) Allow these namespaces to be static - that is, not versioned.
(3) Have a "version" associated with the codelist in a way that does not
change the name of the namespace. (Could we use XSD "version" for this?)
This way, we could version our structures and our codelists separately.
This models the best part of EDI, where it is common practice to update
codelist versions within an older version of the message structures. And all
this, while not throwing away the ability to validate codelists with a
parser.
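As a rough sketch of what the suggestion above could look like, here are two releases of one codelist schema that share a single static targetNamespace and differ only in the "version" attribute on xsd:schema and in the enumerated values. The namespace URI, the type name, and the codes are all hypothetical. The Python around it only reads the schemas back to show what changed and what did not. As far as I know, a validating parser does not act on the "version" attribute itself, so any rule about which codelist version a trading partner accepts would still have to live outside the parser.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

# Hypothetical codelist schema; namespace URI, type name, and codes are made up.
CODELIST_V1 = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="urn:example:codelist:currency" version="1.0">
  <xsd:simpleType name="CurrencyCode">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="USD"/>
      <xsd:enumeration value="EUR"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>"""

# A later release: same namespace, new version attribute, one extra code.
CODELIST_V2 = CODELIST_V1.replace('version="1.0"', 'version="1.1"').replace(
    '<xsd:enumeration value="EUR"/>',
    '<xsd:enumeration value="EUR"/>\n      <xsd:enumeration value="JPY"/>',
)


def describe(schema_text):
    """Return (targetNamespace, version, set of enumerated codes) for a schema."""
    root = ET.fromstring(schema_text)
    codes = {e.get("value") for e in root.iter(f"{{{XSD}}}enumeration")}
    return root.get("targetNamespace"), root.get("version"), codes


ns1, v1, codes1 = describe(CODELIST_V1)
ns2, v2, codes2 = describe(CODELIST_V2)

print(ns1 == ns2)               # True: the namespace did not change
print(v1, v2)                   # 1.0 1.1
print(sorted(codes2 - codes1))  # ['JPY']
```

Because the namespace is unchanged, a message structure schema that imports urn:example:codelist:currency would not need a new version of its own just to pick up the added code.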
The down-side, of course, is that codelists are in a special class in terms
of how they are versioned and use namespaces, but I don't think it will be
that confusing - if they weren't special, we wouldn't be having this
discussion. And this approach is, after all, very much a part of the
existing EDI standards culture.
Cheers,
Arofan