UBL Naming and Design Rules SC

RE: [ubl-ndrsc] Rule: 115 and 116 Containers

    Posted 07-17-2003 20:53
    I'm with Chee-Kai -- I think [R 116] is wrong.  (I know it's probably too
    late -- but I'm gonna say my piece anyway :-)
    The two cases I've heard made in favor of it are:
    
    1. container elements foster more readable stylesheets
    2. container elements significantly improve document processing performance
    
    Argument 1 is weak.  Forgive me for posting working code, but here is an
    instance document with superfluous containers:
    
    <?xml version="1.0" encoding="UTF-8"?>
    <doc>
    	<SuperfluousContainer>
    		<Fruit>Apple</Fruit>
    		<Fruit>Orange</Fruit>
    		<Fruit>Banana</Fruit>
    	</SuperfluousContainer>
    </doc>
    
    And here is a stylesheet to process it:
    
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    	<xsl:output method="xml" version="1.0" encoding="UTF-8"
    indent="yes"/>
    	<xsl:template match="doc">
    		<xsl:element name="NewDoc">
    			<xsl:apply-templates select="current()/*"/>
    		</xsl:element>
    	</xsl:template>
    	<xsl:template match="SuperfluousContainer">
    		<BeforeFruit/>
    		<xsl:apply-templates select="current()/*"/>
    		<AfterFruit/>
    	</xsl:template>
    	<xsl:template match="Fruit">
    		<AFruit>
    			<xsl:value-of select="text()"/>
    		</AFruit>
    	</xsl:template>
    </xsl:transform>
    
    And here is the output:
    
    <?xml version="1.0" encoding="UTF-8"?>
    <NewDoc>
    	<BeforeFruit/>
    	<AFruit>Apple</AFruit>
    	<AFruit>Orange</AFruit>
    	<AFruit>Banana</AFruit>
    	<AfterFruit/>
    </NewDoc>
    
    The example injects an element before the first fruit and after the last
    one.  That's the example we've been discussing for a couple years as being
    the bugaboo here.
    
    And here is an analogous source instance doc -- this time with no
    superfluous containers:
    
    <?xml version="1.0" encoding="UTF-8"?>
    <doc>
    	<Fruit>Apple</Fruit>
    	<Fruit>Orange</Fruit>
    	<Fruit>Banana</Fruit>
    </doc>
    
    And here is a different stylesheet to process this one:
    
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    	<xsl:output method="xml" version="1.0" encoding="UTF-8"
    indent="yes"/>
    	<xsl:template match="doc">
    		<xsl:element name="NewDoc">
    			<xsl:apply-templates select="current()/*"/>
    		</xsl:element>
    	</xsl:template>
    	<xsl:template match="Fruit">
    		<xsl:if test="position() = 1">
    			<BeforeFruit/>
    		</xsl:if>
    		<AFruit>
    			<xsl:value-of select="text()"/>
    		</AFruit>
    		<xsl:if test="position() = last()">
    			<AfterFruit/>
    		</xsl:if>
    	</xsl:template>
    </xsl:transform>
    
    Comparing the two stylesheets I note that the one for superfluous containers
    is 19 lines and the one for repeating elements (with no superfluous
    containers) is 20 lines.  That's only one line of code difference.  And I
    don't think the second stylesheet is any less readable than the first.
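    
    One caveat on the second stylesheet: position() and last() are evaluated
    against the node list selected by the apply-templates in the doc template,
    so the tests only behave as intended while Fruit elements are the only
    element children of doc.  A minimal sketch of a variant that avoids that
    dependence, assuming the Fruit children are the only ones that need
    processing, moves the before/after elements into the parent template.  It
    is an illustration, not a replacement for the stylesheet above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="doc">
    <NewDoc>
      <!-- Emit the wrapper elements only when at least one Fruit exists -->
      <xsl:if test="Fruit">
        <BeforeFruit/>
        <!-- Select Fruit explicitly; other children of doc are ignored -->
        <xsl:apply-templates select="Fruit"/>
        <AfterFruit/>
      </xsl:if>
    </NewDoc>
  </xsl:template>
  <xsl:template match="Fruit">
    <AFruit>
      <xsl:value-of select="."/>
    </AFruit>
  </xsl:template>
</xsl:transform>
```

    Run against the container-free instance document, this should yield the
    same NewDoc output shown earlier, and it is comparable in length to both
    stylesheets above.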
    
    If I look at the two source documents, and extrapolate to larger documents
    with more nesting, I can say with certainty that superfluous containers make
    for larger documents and IMHO are a bit harder for humans to read -- due to
    the increase in indentation necessitated by the deeper hierarchy.
    
    As for point 2 (processing performance), that's just Voodoo Computer
    Science.  So, which XML processing tools are we using for comparison?  Which
    versions of those tools?  What is the use-case/scenario/algorithm?  How big
    is the document?  Worst-case, if you tell me that the document is HUGE then
    I'll tell you a) the Bolivian rug-weaver using Perl as the processing tool
    isn't gonna see the HUGE document and b) the company (Wal*Mart) that sees
    the HUGE document can darn-well write a transform on the incoming document
    (or four or five transforms) that make it more amenable to efficient
    processing.
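    
    As a rough sketch of what such a transform on the incoming document might
    look like, assuming the recipient really does want a wrapper around the
    repeated elements, the following XSLT 1.0 stylesheet adds one to the
    container-free instance.  The wrapper name SuperfluousContainer is simply
    borrowed from the first example; any name would do:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="doc">
    <doc>
      <!-- Add the wrapper only when there is something to wrap -->
      <xsl:if test="Fruit">
        <SuperfluousContainer>
          <!-- Copy the repeated elements unchanged into the wrapper -->
          <xsl:copy-of select="Fruit"/>
        </SuperfluousContainer>
      </xsl:if>
    </doc>
  </xsl:template>
</xsl:transform>
```

    Applied to the container-free document, this should produce a document
    equivalent to the first instance, so a recipient who wants containers can
    add them without the sender having to ship them.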
    
    But you know what -- I still haven't seen any real _evidence_ that
    superfluous containers provide any processing performance advantage in the
    first place.  It's more likely they hurt performance since they _definitely_
    make documents larger!
    
    So by my count, it's:
    
    Superfluous containers:  they make documents bigger (inflicting a processing
    burden) and harder for humans to read
    Repeated elements (no superfluous containers): they make documents smaller
    and easier for humans to read, and necessitate a tiny bit more XSLT code in
    some situations.
    
    Down with [R 116]!
    
    
    Bill Burcham
    Sr. Software Architect, Integration Software Development
    Sterling Commerce, Inc.
    469.524.2164
    bill_burcham@stercomm.com