UBL Naming and Design Rules SC

Re: [ubl-ndrsc] Code lists: discussion kickoff


    Posted 01-31-2002 20:09
    Gregory, Arofan wrote:
    >
    > Folks:
    >
    > I've thought a lot about this issue, and I believe the trade-off is this:
    >
    > (1) Using elements to represent codes is one possibility, that gives us the
    > advantage of being able to validate a code from a controlled list. Also, if
    > we wrap these in a parent type, the list can be extended. (Ugly, but it
    > works.) For companies that have expensive validation software to handle
    > code-lists, this isn't a problem, but it is a problem for the little guys.
    > We can get free code-list standardization and validation from this approach,
    > which I think is good. The down-side is that designing and maintaining these
    > code-lists is a bitch. (Many, many versions of our schemas that do nothing
    > but update code-lists). Perhaps we could have special namespaces for
    > codelists, and have special rules so that versioning is not done by
    > namespace but with an attribute? Just a thought.

    Just a point here. Code lists in themselves do not always guarantee
    interworking applications. Unless each code list item is bound to an
    unambiguous textual definition there can still be problems.

    Case in point, the characters AML. When the notion of using ASN.1 as an
    XML schema was first proposed, I used these characters to describe our
    work. But when we did a google search we found so many other uses of these
    same characters, we switched to XER.

    So code lists can help in validation, but they may not provide a 100%
    solution even when the list of codes is fixed. And my guess is that the
    longer the list of codes, and the greater the number of list users from
    different disciplines, the more likely such problems will arise. The
    result: you and I will both use AML, each of us with a totally different
    meaning.

    > (2) Using the string approach will absolutely defeat any hope of
    > interoperability without benefit of expensive translation software. The EDI
    > experience has shown that people will happily invent their own
    > non-interoperable codes.
    > In xCBL we allowed for this with the CodedOther
    > approach: all code lists have an enumeration of choices, and then a sister
    > element that holds a non-standard code. If you choose the Other code, then
    > you have to fill in the string. This approach is not, in my opinion, the
    > best solution, but it may be the best we can do with XML Schema. Using just

    I agree. This approach, while not perfect as you say, is at least far
    simpler than the one you describe below. Can we go this way for version
    one (for speed of work) and change our minds in a later version to a more
    complex solution such as you describe below without causing significant
    problems?

    > a string makes it not necessary to maintain codelists at all, but sacrifices
    > much of the benefit of having a UBL, in my opinion.

    It does push the actual validation off to the application. But given the
    length of the code list examples I've seen, I wonder whether, for a given
    user, all of the codes listed would REALLY be valid for that user's
    application. Seems to me, as an example, if I only ship to the US and
    Canada, that for my document only USA and CAN might be valid out of the
    list of all country codes. What benefit would I get from JPN and FRA being
    valid? When an actual instance document is created for a UBL user, will we
    provide support for specifying further granularity of code list
    constraints?

    > (3) Codelists as enumerated data types. This is my preferred approach - a
    > codelist is, in fact, an enumeration of specific semantics, and this format
    > makes it clear and easier to manage. What we need is an ability to extend
    > these (a major failing of XML schema).

    I have an enumerated type in my favorite schema language, but essentially
    its named values are treated as integers. But I can also view code lists
    differently using what is termed a permitted alphabet constraint, a set of
    the sets of characters that determine what is valid for an instance of a
    given user defined type.
    This allows me to express the valid sets of characters that can be used in
    a given field of some type, say as

        MyCodeList ::= UTF8String ("ABC" | "BAX", ...)

    The extension marker ( ... ) instructs tools to also expect other values
    not in the list, so I do not need to code up an Other choice alternative.
    But I am almost certain that such permitted alphabet constraints do not
    exist in XSD.

    > Let me suggest:
    >
    > (1) Dedicated namespaces for codelists (one per codelist, or related group
    > of codelists)
    > (2) Allow these namespaces to be static - that is, not versioned.
    > (3) Have a version associated with the codelist in a way that does not
    > change the name of the namespace. (Could we use XSD version for this?)
    >
    > This way, we could version our structures and our codelists separately.
    > This models the best part of EDI, where it is common practice to update
    > codelists versions within an older version of message structures. And all
    > this, while not throwing away the ability to validate codelists with a
    > parser.

    This seems a reasonable approach. But how is interoperability maintained
    when a code list item is removed? Are we affected if an item with one
    meaning in code list version A is given another meaning in code list
    version B? My question here is what happens in terms of interoperability
    if you are using A and I am using B?

    Phil

    > The down-side, of course, is that codelists are in a special class in terms
    > of how they are versioned and use namespaces, but I don't think it will be
    > that confusing - if they weren't special, we wouldn't be having this
    > discussion. And this approach is, after all, very much a part of the
    > existing EDI standards culture.
    >
    > Cheers,
    >
    > Arofan
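
Editorial illustration: the CodedOther approach discussed in this thread (an enumeration of choices plus a sister element that must be filled in when Other is chosen) can be sketched in a few lines of Python. This is only a sketch of the rule as the message describes it, not xCBL's actual schema; the names `CountryCode`, `CodedCountry`, `coded_other`, and `parse_country` are hypothetical, and the three country codes are placeholders for a real list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CountryCode(Enum):
    """Hypothetical enumerated code list with an explicit Other escape value."""
    USA = "USA"
    CAN = "CAN"
    FRA = "FRA"
    OTHER = "Other"


@dataclass
class CodedCountry:
    """A coded value plus the sister field that carries a non-standard code."""
    coded: CountryCode
    coded_other: Optional[str] = None

    def validate(self) -> None:
        # Choosing Other obliges the sender to fill in the free-text string.
        if self.coded is CountryCode.OTHER and not self.coded_other:
            raise ValueError("coded_other is required when coded is Other")
        # Assumption: a standard code must not also carry a non-standard one.
        if self.coded is not CountryCode.OTHER and self.coded_other is not None:
            raise ValueError("coded_other is only allowed when coded is Other")


def parse_country(code: str, other: Optional[str] = None) -> CodedCountry:
    """Map a raw string onto the enumeration and apply the CodedOther rule."""
    # Enum lookup by value raises ValueError for a code that is not enumerated.
    value = CodedCountry(CountryCode(code), other)
    value.validate()
    return value
```

Calling `parse_country("Other", "ZZ-LOCAL")` is accepted, while `parse_country("Other")` and `parse_country("XYZ")` both raise `ValueError`. The enumerated part is checked mechanically; the free-text part still has to be interpreted by the receiving application.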
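
Editorial illustration: Phil's question about "further granularity of code list constraints" (a shipper for whom only USA and CAN are meaningful) amounts to validating against a user-specific subset of the standard list. The sketch below assumes the subset is applied by the application rather than by the schema; `ALL_COUNTRY_CODES`, `restrict`, `is_valid`, and `SHIP_TO` are hypothetical names, and the four-entry list stands in for a full country code list.

```python
from typing import FrozenSet, Iterable

# Hypothetical stand-in for a full standard code list (only a few entries).
ALL_COUNTRY_CODES: FrozenSet[str] = frozenset({"USA", "CAN", "FRA", "JPN"})


def restrict(full_list: FrozenSet[str], allowed: Iterable[str]) -> FrozenSet[str]:
    """Return a user-specific subset, refusing codes the standard list lacks."""
    subset = frozenset(allowed)
    unknown = subset - full_list
    if unknown:
        raise ValueError(f"not in the standard list: {sorted(unknown)}")
    return subset


def is_valid(code: str, profile: FrozenSet[str]) -> bool:
    """A code is valid for this user only if it is in the user's profile."""
    return code in profile


# A shipper that only delivers to the US and Canada.
SHIP_TO = restrict(ALL_COUNTRY_CODES, ["USA", "CAN"])
```

With this profile, `is_valid("CAN", SHIP_TO)` is true and `is_valid("FRA", SHIP_TO)` is false, even though FRA is a perfectly good code in the standard list. Because the profile can only narrow the standard list, a document that passes the profile also passes the standard check.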
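
Editorial illustration: the closing question (what happens if you are using code list version A and I am using version B) can be made concrete by comparing the two versions. The sketch below assumes each version is available as a mapping from code to textual definition, which the thread argues for but does not specify; `CodeListDiff`, `diff_code_lists`, `a_can_send_to_b`, and the sample unit codes are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class CodeListDiff(NamedTuple):
    removed: List[str]    # codes in A that B no longer defines
    redefined: List[str]  # codes in both versions with different definitions
    added: List[str]      # codes that only B defines


def diff_code_lists(a: Dict[str, str], b: Dict[str, str]) -> CodeListDiff:
    """Compare two versions of a code list, each mapping code -> definition."""
    removed = sorted(set(a) - set(b))
    added = sorted(set(b) - set(a))
    redefined = sorted(c for c in set(a) & set(b) if a[c] != b[c])
    return CodeListDiff(removed, redefined, added)


def a_can_send_to_b(a: Dict[str, str], b: Dict[str, str]) -> bool:
    """Assumed policy: a sender on A is safe with a receiver on B only if
    no code it might use was removed or given a different meaning in B."""
    d = diff_code_lists(a, b)
    return not d.removed and not d.redefined


# Hypothetical sample data: BX changes meaning between versions.
VERSION_A = {"EA": "each", "BX": "box"}
VERSION_B = {"EA": "each", "BX": "bundle", "PK": "pack"}
```

Here `diff_code_lists(VERSION_A, VERSION_B)` reports `BX` as redefined and `PK` as added, so `a_can_send_to_b(VERSION_A, VERSION_B)` is false. A redefinition is the dangerous case: both parties validate the same code successfully and silently disagree about what it means.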