UBL Naming and Design Rules SC

[ubl-ndrsc] Eve's writeup on list containers

    Posted 09-06-2002 15:16

    Subject: [ubl-ndrsc] Eve's writeup on list containers


    In Wednesday's meeting, I agreed to provide a writeup on the issues with 
    list containers.  Here it is.  I'm hoping that some of this might be 
    useful in the eventual creation/modification of a position paper on the 
    wider topic of containers in general.
    
    *List Containers in UBL
    
    *TOC
       Executive summary
       Introduction and definitions
       Motivating the presence of containers in the physical model
       The particular value of list containers in the physical model
       The value of list-container ABIEs in the logical model
       Conclusion
    
    
    *Executive summary
    
    This writeup considers whether list containers should appear in our 
    physical model and, separately, in our logical model.  I conclude by 
    recommending that they should appear in both, but I take far too many 
    words and steps to get there. :-)
    
    
    *Introduction and definitions
    
    When we say "container" in this discussion, we mean an XML element, 
    plain and simple.  XML is a hierarchical technology, leading to the 
    possibility -- indeed the likelihood -- of significantly nested element 
    structures in nearly all XML instance documents.
    
    A BIE is a model for a piece of business information to which has been 
    applied a semantically unique and useful definition (and of course also 
    an identified business context, because it's a BIE and not a CC, but 
    that's not important right now).  The containership discussion revolves 
    pretty much entirely around ABIEs, which are collections of other BIEs 
    and thus have a kind of hierarchy themselves.
    
    Note that our process of turning our logical model (spreadsheet) into a 
    physical model (XSD) takes ABIEs and turns them into complex datatypes, 
    which each govern one or more XML elements.  So there is more than just 
    a vague similarity here -- ABIE hierarchy pretty much turns into XML 
    hierarchy, to a first approximation.
    
    One question we've been considering is whether to allow for any kinds of 
    containers that are *not* properly ABIEs.  Do we ever need XML elements 
    in UBL that are not connected to a semantically unique and useful 
    definition but are rather some kind of artificial construct?  Is it 
    useful to have a construct that adds nothing of value to the logical model, 
    but does add value to the physical model?  We'll consider the logical 
    and the physical value propositions separately for list containers.
    
    This writeup focuses on a specific kind of container, a "list 
    container".  This is a container whose contents consist solely of a 
    series of like elements.  An example is the series of line items that 
    appears in an order.  Currently all multiple-cardinality constructs in 
    our logical model are of the form 0..n or 1..n; we don't (yet) have any 
    cardinalities like 3..35.  But this discussion applies to them as well.
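
    To make the definition concrete, here is a minimal Python sketch
    (standard-library ElementTree only) of a check for "contents consist
    solely of a series of like elements".  The unprefixed element names
    and the LineItemList wrapper are assumptions made for illustration;
    they are not taken from the UBL schemas.

```python
import xml.etree.ElementTree as ET


def is_list_container(element):
    """Return True if the element has children and they all share one tag.

    Only child elements are inspected; text content is ignored.
    """
    children = list(element)
    if not children:
        return False
    return len({child.tag for child in children}) == 1


# Hypothetical order instance with a LineItemList wrapper (assumed names).
order = ET.fromstring(
    "<Order>"
    "<Header/>"
    "<LineItemList><LineItem/><LineItem/><LineItem/></LineItemList>"
    "<Summary/>"
    "</Order>"
)

wrapper_is_list = is_list_container(order.find("LineItemList"))
order_is_list = is_list_container(order)
```

    Under these assumptions the wrapper qualifies as a list container,
    while Order itself does not, because its children are unlike.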
    
    
    *Motivating the presence of containers in the physical model
    
    It's pretty easy to see the value added when containers are present in 
    (appropriate) abundance in the physical model.  All XML tools adapt to 
    XML's hierarchical nature and tend to hook up nearly all processing to 
    the presence of elements (or element boundaries, for streaming processors).
    
    Thus, containers allow processing to "factor out" into the right place.
    In OO-speak, if you turn containers into objects to which you attach 
    methods, it's desirable to hide certain data "lower down" from methods 
    operating on other data "higher up".  (There's no doubt an analogy here 
    to relational processing of functional dependencies as well, but as my 
    grip on OO technology is somewhat tenuous and my grip on relational 
    technology is more so, I'll stop while I'm behind.)  You wouldn't want 
    your Address information associated directly with a Party; it's useful 
    to have the intervening Address ABIE/container because you do things 
    with Party data that you don't do with Address data, and vice versa.
    
    The benefit of factoring-out applies to XSD customization as well.  If 
    the data is appropriately grouped, then each trading community needing 
    to add more data will be able to find the conceptually right place to 
    add those pieces.  It would be weird to have to add new address-related 
    fields to PartyType, particularly since XSD would require that they get 
    stuck at the end of the content model, possibly far away from the other 
    address stuff.
    
    As an aside, note that it's typically easier to flatten a deeply nested 
    XML document (removing information) than to "tree-ify" a relatively flat 
    XML document (adding new information, which may require human invention).
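
    The asymmetry can be sketched in a few lines of Python.  Flattening
    is mechanical: splice a container's children into its parent.  The
    Street and City child names below are hypothetical, and whitespace
    between elements is ignored for simplicity.

```python
import xml.etree.ElementTree as ET


def unwrap(parent, container_tag):
    """Replace each direct child named container_tag with that child's
    own children, in place, preserving document order."""
    # Walk backwards so earlier indexes stay valid while we splice.
    for index, child in reversed(list(enumerate(parent))):
        if child.tag == container_tag:
            parent.remove(child)
            for offset, grandchild in enumerate(child):
                parent.insert(index + offset, grandchild)


# Hypothetical Party with a nested Address container.
party = ET.fromstring(
    "<Party><Name/><Address><Street/><City/></Address></Party>"
)
unwrap(party, "Address")
flat_tags = [child.tag for child in party]
```

    Going the other way has no equally generic function: the flat
    document no longer records that Street and City belonged together,
    so someone has to supply that grouping rule.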
    
    
    *The particular value of list containers in the physical model
    
    I have been arguing that XML processing benefits from list containers as 
    much as from other kinds of containers.  However, intellectual honesty 
    (darn it :-) compels me to report some new findings that make this 
    position slightly harder to hold.
    
    Our current modeling formalism allows for only relatively simple content 
    models.  For example, the content model of an order is (in DTDspeak):
    
       (Header, LineItem+, Summary?)
    
    I don't believe it's possible to encode something more like the 
    following in our spreadsheet without inventing another layer of ABIE; 
    the nested parenthesized group can't be done and would probably 
    *deserve* its own ABIE and definition if we really wanted this structure:
    
       (Header, (LineItem+, Summary?)+)
    
    We do have some cases of multiple series of like elements within a 
    single parent element, but each set has a different element name.  For 
    example, an order header looks like this:
    
       (IssueDateTime, Identifier*, BuyerIdentifier?, SellerIdentifier?,
       BuyerAccountId?, Quote*, ...)
    
    Here, both Identifier and Quote are series of like elements, but they 
    have different names so the series are easily distinguishable in processing.
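
    As a small illustration of that point, the following Python sketch
    picks each series out of a flat header by element name alone.  The
    instance is a hypothetical, namespace-free fragment built from the
    element names in the content model above.

```python
import xml.etree.ElementTree as ET

# Hypothetical order header with two repeated series and no wrappers.
HEADER_XML = (
    "<Header>"
    "<IssueDateTime/>"
    "<Identifier/><Identifier/>"
    "<BuyerIdentifier/>"
    "<Quote/><Quote/><Quote/>"
    "</Header>"
)


def series(parent, tag):
    """Return the direct children of parent whose tag matches exactly."""
    return parent.findall(tag)


header = ET.fromstring(HEADER_XML)
identifiers = series(header, "Identifier")
quotes = series(header, "Quote")
```

    Because the match is on the exact element name, BuyerIdentifier is
    not swept into the Identifier series.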
    
    Because of this situation, it turns out that XSLT processing (for 
    example) is not all that hard when there is no list container present. 
    All processing that needs to be done on the list as a whole could simply 
    be done in the XSLT template for the parent element.  For example, a 
    skeleton template for handling the Order element could look like this:
    
       <xsl:template match="ubl-procurement:Order">
         <!-- ...order setup stuff goes here... -->
    
         <xsl:apply-templates select="Header"/>
         <!-- shunt off to Header template -->
    
         <!-- ...line item setup stuff goes here... -->
         <xsl:apply-templates select="LineItem"/>
         <!-- shunt off to one LineItem template that works for all
              of them equally -->
         <!-- ...line item wrapup stuff goes here... -->
    
         <xsl:apply-templates select="Summary"/>
         <!-- shunt off to Summary template -->
    
         <!-- ...order wrapup stuff goes here... -->
       </xsl:template>
    
    I have been arguing that it would be more appropriate to put the line 
    item setup and wrapup processing in its own dedicated LineItemList 
    template, because these things are specific to the line items, not the 
    order.  I asked Norm Walsh about this (he supplied an XSLT stylesheet 
    from which I derived the above example); while he agreed that my 
    position is conceptually sound, he felt that it may not be all that 
    compelling in practical terms for XSLT designers.  Those of you who are 
    planning to compile the UBL schemas into an object representation, or 
    use procedural code to operate on UBL, might want to weigh in on this as 
    well.
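
    For those weighing this from the procedural side, here is a rough
    Python sketch of what a dedicated list handler could look like.  It
    assumes a LineItemList wrapper element exists in the instance and
    uses unprefixed element names; the handler names and the dispatch
    table are invented for illustration.

```python
import xml.etree.ElementTree as ET


def handle_order(order, out):
    out.append("order setup")
    for child in order:
        HANDLERS[child.tag](child, out)
    out.append("order wrapup")


def handle_line_item_list(line_items, out):
    # Setup and wrapup live with the list, not with the order.
    out.append("line item setup")
    for item in line_items:
        HANDLERS[item.tag](item, out)
    out.append("line item wrapup")


def handle_leaf(element, out):
    # Stand-in for real per-element processing.
    out.append(element.tag)


# Tag-to-handler dispatch, playing the role of XSLT template matching.
HANDLERS = {
    "Order": handle_order,
    "Header": handle_leaf,
    "LineItemList": handle_line_item_list,
    "LineItem": handle_leaf,
    "Summary": handle_leaf,
}


def process(xml_text):
    """Parse the document and return the ordered trace of handler steps."""
    root = ET.fromstring(xml_text)
    out = []
    HANDLERS[root.tag](root, out)
    return out
```

    Without the wrapper, the two "line item" steps would have to move
    into handle_order, which would then need to know where the run of
    LineItem children starts and stops.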
    
    Of course, the argument made above for easier customization still holds 
    true for list containers just as much as for other containers.  If a 
    trading community ever wanted to add some *un*like elements to a 
    collection of like elements (such as metadata that applies across a 
    whole line item list), the only honest way to do this in XSD is to hang 
    the additions off a LineItemList structure.  (It would be extremely 
    difficult and ugly to add a list container in the customization that 
    wasn't present in the original.)
    
    The only reason I can think of *not* to have list container elements is 
    that it creates a bunch of new elements.  The question then becomes one 
    of whether it's a true economy or a false economy not to have them, at 
    which point the above arguments need to be referred to.
    
    On balance, I maintain that the presence of list containers in the 
    physical model is a good thing.  My general policy is to be generous 
    with containers because they are the common currency of XML; I hope I've 
    shown that the presence of list containers makes processing more natural 
    and object-oriented, and allows for customization where the lack of a 
    container would effectively disallow it.
    
    
    *The value of list-container ABIEs in the logical model
    
    If you accept my reasoning on why list containers are good in the 
    physical model, it's time to consider whether we also need them in the 
    logical model.
    
    List containers have an interesting characteristic not shared by other 
    kinds of containers: You can inspect the logical model and know exactly 
    where all the list containers would go -- and therefore it would be easy 
    to have our perl script generate them in the physical model while having 
    no annoying traces of them appear in the logical model.  They would 
    never have to be ABIEs, and we would never incur the expense of having to 
    write and maintain spreadsheet rows for them.  So why might it be 
    desirable for a list container to be a true first-class ABIE?
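
    To make the "generate them mechanically" option concrete, here is a
    minimal Python sketch.  It is not our perl script: the row layout
    (parent, child, cardinality as "lower..upper") and the rule of
    naming the wrapper <child>List are assumptions for illustration.

```python
def list_containers(rows):
    """Return (parent, container_name, child) for every repeatable child.

    Each row is (parent, child, cardinality), where cardinality is a
    string of the form "lower..upper" and upper is a number or "n".
    """
    result = []
    for parent, child, cardinality in rows:
        upper = cardinality.split("..")[1]
        # "n" means unbounded; any numeric upper bound above 1 repeats.
        if upper == "n" or int(upper) > 1:
            result.append((parent, child + "List", child))
    return result


# The order content model (Header, LineItem+, Summary?) as rows.
ORDER_ROWS = [
    ("Order", "Header", "1..1"),
    ("Order", "LineItem", "1..n"),
    ("Order", "Summary", "0..1"),
]

generated = list_containers(ORDER_ROWS)
```

    Under these assumptions the only wrapper generated for an order is
    LineItemList, and nothing about it needs to be recorded in the
    spreadsheet.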
    
    One reason is that there is overhead in having to manage an obvious 
    mismatch like this between the logical and physical models.  Right now, 
    every XSD complex type has a connection to a BIE.  If we generate 
    complex types with no connection back to a BIE, we have to explain a new 
    kind of "thing" in the UBL universe.  (However, note that the advance 
    word on CCTS V1.85 is that they are inventing a non-BIE structure for 
    cases like this, so perhaps the mismatch would go away.  Then again, 
    that means that a non-BIE list container would *still* appear in the 
    logical model, just as a new kind of second-class thing.)
    
    Another reason is that even if the list container doesn't start out as 
    an ABIE logically, the moment that a trading community wants to use the 
    power of XSD customization to contextualize a UBL list, they'll need to 
    turn it into an ABIE to do so.  If it can be an ABIE as a result of that 
    process, why not in its original state?  It's not so hard to define a 
    list ABIE initially as "A collection of information about <insert name 
    of repeatable things here>", and I don't really see any impurity in it 
    when the first little contextualization could also use the same 
    definition for a real BIE.  Also, I have no idea how a customizer would 
    go about turning a non-BIE structure into a true BIE, particularly if 
    the non-BIE really didn't appear anywhere in the logical model.
    
    In fact, what this boils down to is that there's not so much difference, 
    after all, between an ABIE and a complex datatype in XSD.  The reason 
    that complex datatypes exist in XSD is *not* to manage constraints on 
    the physical expression of the XML (DTDs did that just fine without the 
    notion of a type hierarchy); it's to encapsulate a particular chunk of 
    data that is likely to be associated with its own special functionality.
    In this sense it's truly object-oriented, exactly as the BIE system is.
    
    So if we agree that list containers are desirable in the physical model, 
    it seems like more trouble than it's worth to *avoid* them in the 
    logical model.
    
    
    *Conclusion
    
    List containers good as logical things!
    List containers good as physical things!
    :-)
    
    
    
    -- 
    Eve Maler                                        +1 781 442 3190
    Sun Microsystems                            cell +1 781 883 5917
    XML Web Services / Industry Initiatives      eve.maler @ sun.com
    
    

