UBL Naming and Design Rules SC

Re: [ubl-ndrsc] Rule: 115 and 116 Containers

    Posted 07-17-2003 21:53
    Bill, I think your argument is bogus.
    
    The alternative to
    
    <?xml version="1.0" encoding="UTF-8"?>
    <doc>
    	<SuperfluousContainer>
    		<Fruit>Apple</Fruit>
    		<Fruit>Orange</Fruit>
    		<Fruit>Banana</Fruit>
    	</SuperfluousContainer>
    </doc>
    
    is not, in real life,
    
    <?xml version="1.0" encoding="UTF-8"?>
    <doc>
    	<Fruit>Apple</Fruit>
    	<Fruit>Orange</Fruit>
    	<Fruit>Banana</Fruit>
    </doc>
    
    but more probably
    
    <?xml version="1.0" encoding="UTF-8"?>
    <doc>
    	<someelement>foo</someelement>
    	<Fruit>Apple</Fruit>
    	<anotherone>bar</anotherone>
    	<Fruit>Orange</Fruit>
    	<alongcontainerlikeaddress>
    		<a>
    			<b>
    				<c>foo</c>
    			</b>
    		</a>
    	</alongcontainerlikeaddress>
    	<Fruit>Banana</Fruit>
    </doc>
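    
    As a rough sketch of what this interleaved case asks of a stylesheet:
    when templates are applied to every child of doc, position() counts all
    of those children, so "position() = 1" no longer identifies the first
    Fruit. One option, assuming XSLT 1.0, is to test the sibling axes instead.
    The catch-all template that copies the other elements through unchanged
    is an assumption made for illustration, not something specified in this
    thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="doc">
    <NewDoc>
      <!-- Visit every child element in document order -->
      <xsl:apply-templates select="*"/>
    </NewDoc>
  </xsl:template>

  <xsl:template match="Fruit">
    <!-- First Fruit: no earlier sibling is a Fruit -->
    <xsl:if test="not(preceding-sibling::Fruit)">
      <BeforeFruit/>
    </xsl:if>
    <AFruit>
      <xsl:value-of select="."/>
    </AFruit>
    <!-- Last Fruit: no later sibling is a Fruit -->
    <xsl:if test="not(following-sibling::Fruit)">
      <AfterFruit/>
    </xsl:if>
  </xsl:template>

  <!-- Assumption: any other element is copied through as-is.
       match="*" has a lower default priority than match="Fruit". -->
  <xsl:template match="*">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:transform>
```

    Note that the markers then bracket only the first and last Fruit; the
    non-Fruit elements still sit between them, which a container would
    rule out by construction.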
    
    Also, although I have neither the time nor the inclination to check this
    (I am on vacation, after all), I believe your first stylesheet is far more
    complicated than the container case requires. I suspect it could be cut
    in half, but again, I have not checked; this is based only on previous
    experience with stylesheets.
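    
    One possible shape for a shorter container stylesheet, offered only as a
    sketch: a single template with literal result elements and xsl:for-each.
    It assumes XSLT 1.0 and exactly one SuperfluousContainer per doc, and it
    is not the stylesheet from the quoted message. Whether it counts as
    "half" depends on how the lines are counted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="doc">
    <NewDoc>
      <!-- The container marks where the list starts and ends,
           so no position tests are needed -->
      <BeforeFruit/>
      <xsl:for-each select="SuperfluousContainer/Fruit">
        <AFruit><xsl:value-of select="."/></AFruit>
      </xsl:for-each>
      <AfterFruit/>
    </NewDoc>
  </xsl:template>
</xsl:transform>
```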
    
    Burcham, Bill wrote:
    > I'm with Chee-Kai -- I think [R 116] is wrong.  (I know it's probably too
    > late -- but I'm gonna say my piece anyway :-)
    > The two cases I've heard made in favor of it are:
    > 
    > 1. container elements foster more readable stylesheets
    > 2. container elements significantly improve document processing performance
    > 
    > Argument 1 is weak.  Forgive me for posting working code, but here is an
    > instance document with superfluous containers:
    > 
    > <?xml version="1.0" encoding="UTF-8"?>
    > <doc>
    > 	<SuperfluousContainer>
    > 		<Fruit>Apple</Fruit>
    > 		<Fruit>Orange</Fruit>
    > 		<Fruit>Banana</Fruit>
    > 	</SuperfluousContainer>
    > </doc>
    > 
    > And here is a stylesheet to process it:
    > 
    > <?xml version="1.0" encoding="UTF-8"?>
    > <xsl:transform version="1.0"
    > xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    > 	<xsl:output method="xml" version="1.0" encoding="UTF-8"
    > indent="yes"/>
    > 	<xsl:template match="doc">
    > 		<xsl:element name="NewDoc">
    > 			<xsl:apply-templates select="current()/*"/>
    > 		</xsl:element>
    > 	</xsl:template>
    > 	<xsl:template match="SuperfluousContainer">
    > 		<BeforeFruit/>
    > 		<xsl:apply-templates select="current()/*"/>
    > 		<AfterFruit/>
    > 	</xsl:template>
    > 	<xsl:template match="Fruit">
    > 		<AFruit>
    > 			<xsl:value-of select="text()"/>
    > 		</AFruit>
    > 	</xsl:template>
    > </xsl:transform>
    > 
    > And here is the output:
    > 
    > <?xml version="1.0" encoding="UTF-8"?>
    > <NewDoc>
    > 	<BeforeFruit/>
    > 	<AFruit>Apple</AFruit>
    > 	<AFruit>Orange</AFruit>
    > 	<AFruit>Banana</AFruit>
    > 	<AfterFruit/>
    > </NewDoc>
    > 
    > The example injects an element before the first fruit and after the last
    > one.  That's the example we've been discussing for a couple years as being
    > the bugaboo here.
    > 
    > And here is an analogous source instance doc -- this time with no
    > superfluous containers:
    > 
    > <?xml version="1.0" encoding="UTF-8"?>
    > <doc>
    > 	<Fruit>Apple</Fruit>
    > 	<Fruit>Orange</Fruit>
    > 	<Fruit>Banana</Fruit>
    > </doc>
    > 
    > And here is a different stylesheet to process this one:
    > 
    > <?xml version="1.0" encoding="UTF-8"?>
    > <xsl:transform version="1.0"
    > xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    > 	<xsl:output method="xml" version="1.0" encoding="UTF-8"
    > indent="yes"/>
    > 	<xsl:template match="doc">
    > 		<xsl:element name="NewDoc">
    > 			<xsl:apply-templates select="current()/*"/>
    > 		</xsl:element>
    > 	</xsl:template>
    > 	<xsl:template match="Fruit">
    > 		<xsl:if test="position() = 1">
    > 		<BeforeFruit/>
    > 		</xsl:if>
    > 		<AFruit>
    > 			<xsl:value-of select="text()"/>
    > 		</AFruit>
    > 		<xsl:if test="position() = last()">
    > 		<AfterFruit/>		
    > 		</xsl:if>
    > 	</xsl:template>
    > </xsl:transform>
    > 
    > Comparing the two stylesheets I note that the one for superfluous containers
    > is 19 lines and the one for repeating elements (with no superfluous
    > containers) is 20 lines.  That's only one line of code difference.  And I
    > don't think the second stylesheet is any less readable than the first.
    > 
    > If I look at the two source documents, and extrapolate to larger documents
    > with more nesting I can say with certainty that superfluous containers make
    > for larger documents and IMHO are a bit harder for humans to read -- due to
    > the increase in indentation necessitated by the deeper hierarchy.
    > 
    > As for point 2 (processing performance), that's just Voodoo Computer
    > Science.  So, which XML processing tools are we using for comparison?  Which
    > versions of those tools?  What is the use-case/scenario/algorithm?  How big
    > is the document?  Worst-case, if you tell me that the document is HUGE then
    > I'll tell you a) the Bolivian rug-weaver using Perl as the processing tool
    > isn't gonna see the HUGE document and b) the company (Wal*Mart) that sees
    > the HUGE document can darn-well write a transform on the incoming document
    > (or four or five transforms) that make it more amenable to efficient
    > processing.
    > 
    > But you know what -- I still haven't seen any real _evidence_ that
    > superfluous containers provide any processing performance advantage in the
    > first place.  It's more likely they hurt performance since they _definitely_
    > make documents larger!
    > 
    > So by my count, it's:
    > 
    > Superfluous containers:  they make documents bigger (inflicting a processing
    > burden) and harder for humans to read
    > Repeated elements (no superfluous containers): they make documents smaller
    > and easier for humans to read, and necessitate a tiny bit more XSLT code in
    > some situations.
    > 
    > Down with [R 116]!
    > 
    > 
    > Bill Burcham
    > Sr. Software Architect, Integration Software Development
    > Sterling Commerce, Inc.
    > 469.524.2164
    > bill_burcham@stercomm.com
    > 
    >