OASIS DocBook TC2

  • 1.  Modular doc processing model

    Posted 06-17-2009 18:40
    Following up on the modular DocBook proposals so far, I'd like to seek 
    clarification on the processing expectations for assembly elements.
    
    My first impression is that an assembly element would be preprocessed to 
    resolve all resources, with the result being a valid DocBook XML document 
    with all the content pulled in, profiled, and relationships resolved.  That 
    resolved document would then be fed into a second process using a DocBook 
    stylesheet for rendering.  Like profiling, these two phases could be 
    pipelined, or they could be separated so the intermediate result can be 
    examined.
    
    At least, this is likely how I would handle assemblies.  This allows the 
    assembly process to generate assembly errors, and then halt if necessary 
    before passing the result to a formatting stylesheet, where likely many 
    fallout error messages would result due to incomplete input.
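
    As a rough sketch of that two-phase model, the Python below resolves an
    assembly first and only then hands the result to a renderer.  It is an
    illustration, not part of any proposal: the un-namespaced element names,
    the resourceref attribute, the resources mapping, and the render()
    stand-in for a DocBook stylesheet are all assumptions.

```python
import xml.etree.ElementTree as ET


class AssemblyError(Exception):
    """Raised when the assembly cannot be fully resolved."""


def resolve_assembly(assembly_xml, resources):
    """Phase 1: build one document from the first structure's modules.

    `resources` is a hypothetical mapping from a resourceref value to an
    XML string.  Problems are collected and reported together, so the
    process can halt before any formatting is attempted.
    """
    assembly = ET.fromstring(assembly_xml)
    structure = assembly.find("structure")
    if structure is None:
        raise AssemblyError("assembly contains no structure element")

    book = ET.Element("book")
    errors = []
    for module in structure.findall("module"):
        ref = module.get("resourceref")
        if ref not in resources:
            errors.append(f"unresolved resource: {ref}")
            continue
        book.append(ET.fromstring(resources[ref]))

    if errors:
        raise AssemblyError("; ".join(errors))
    return book


def render(book):
    """Phase 2: stand-in for the formatting stylesheet (lists titles)."""
    return "\n".join(title.text for title in book.iter("title"))
```

    Because resolve_assembly() raises before render() is ever called, the
    intermediate document can also be serialized and examined between the
    two phases.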
    
    If this is the processing model, then what happens when an assembly element 
    contains more than one structure element after profiling?  One could imagine 
    each structure being rendered as a separate output document.  But if one 
    structure is intended for print and another for help, then they would likely 
    need different handling in assembly and different stylesheets downstream.
    
    Also, Larry's example has a tutorial structure referenced as a module of the 
    user.guide structure.  It isn't clear whether the tutorial is meant to be a 
    separate output.  For that matter, the element name 


  • 2.  RE: [docbook-tc] Modular doc processing model

    Posted 06-17-2009 19:45
    Bob,
    
    I agree pretty much with your processing expectations.  Multiple 
    structures in a single assembly are allowed, but I expect that to be less 
    common than a single structure per file.  I was expecting the ids on 
    the structures to be used to call out which one to process, with a 
    fallback of "do the first one if no ID is specified" or something 
    similar (maybe a book is the default or some other rule that makes
    sense or that the user defines).  I actually put them all in one
    file as a single assembly more to show the multiple uses of the same
    content rather than as a best practice.  It was more a matter of
    practicality, being able to see the other documents while working on 
    one of them.  That may speak in favor of allowing more than one in an
    assembly.
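
    The selection rule could be sketched roughly as follows.  This is only
    an illustration in Python: it assumes un-namespaced structure elements
    carrying xml:id, and the choice to raise an error for an unknown id is
    mine, not something settled here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def select_structure(assembly, structure_id=None):
    """Pick the structure to process.

    With no id given, fall back to the first structure in the assembly.
    With an id given, return the matching structure or raise ValueError.
    """
    structures = assembly.findall("structure")
    if not structures:
        raise ValueError("assembly contains no structure element")
    if structure_id is None:
        return structures[0]
    for structure in structures:
        if structure.get(XML_ID) == structure_id:
            return structure
    raise ValueError(f"no structure with id {structure_id!r}")
```

    A different default (a book, or some user-defined rule) would only
    change the branch taken when no id is specified.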
    
    The processing of assemblies into the intermediate result would not
    always result in a single canonical DocBook instance.  We actually 
    do help systems on a file-per-file basis with an input file for each
    output file (each one is roughly an article, although we call them
    helpNodes) rather than chunking from a single DocBook instance.  
    There is a separate structure file describing the relationship among
    the help pages (which the structure element would replace).  In
    general the process would produce canonical DocBook but I think that
    should be open for other models.
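
    To illustrate the file-per-file model, here is a minimal Python sketch
    in which each module yields its own output document instead of being
    merged into one instance.  The resourceref attribute, the resources
    mapping, and the output naming are assumptions made for the example,
    not a description of our actual tooling.

```python
import xml.etree.ElementTree as ET


def resolve_per_module(structure, resources):
    """Return one (output name, document) pair per module, in order.

    `resources` is a hypothetical mapping from a resourceref value to an
    XML string; an unresolved reference raises KeyError.
    """
    outputs = []
    for module in structure.findall("module"):
        ref = module.get("resourceref")
        document = ET.fromstring(resources[ref])
        outputs.append((f"{ref}.xml", document))
    return outputs
```

    The structure itself would still be needed downstream to describe the
    relationships among the resulting pages.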
    
    The tutorial that is referenced into the book would be processed into 
    sections for presentation in the book.  I was assuming a tutorial 
    engine for presentation of it when rendered on its own (with the typical 
    previous, next, help, ToC, etc. buttons that are common in a tutorial 
    being part of the engine so that they don't have to be added to each 
    frame of the source).  I think I mentioned that in one of the previous 
    messages about the first sample (or maybe on the phone call -- it has 
    been a while).
    
    It is obvious that we need a pretty thorough description of the 
    processing expectations of this sample, since I had a number of 
    things in mind when I was writing it, but clearly didn't do a
    good job of communicating them.  Does it make more sense to
    embed them as comments in the sample or to provide another file
    describing what is expected?  The comment model makes the file
    a lot larger, but you can see what the markup means while you
    are reading the file itself.
    
    I have mixed feelings about a single structure per assembly versus
    multiple ones.  On the one hand, multiple structures would be equivalent
    to a document set, and could describe all the documents associated with
    a project.  On the other, DocBook is supposed to be a semantic
    markup language, so we should be able to assign a meaning to the
    multiple structures in the assembly.  It is a good question for
    discussion.
    
    Regards,
    Larry