OASIS Darwin Information Typing Architecture (DITA) TC

Re: [dita] chunking description for spec

    Posted 02-13-2007 12:16
    A general comment, reflected in my detailed comments below: the DITA 
    spec should never use "file" when "XML document" is meant (and since 
    files are not part of any standard DITA is defined in terms of, "file" 
    should only be used in the context of non-normative examples). Likewise, 
    "file" should never be used where "storage object" is meant.
    
    > *Reuse of a nested topic* – A content provider creates a set of topics 
    > as a single module. A reuser wants to incorporate only one of the nested 
    > topics from the module. 
    
    Change "module" to "document". In this context the term "module" has no 
    DITA-defined meaning and could be confused with "module" as it is used 
    in the context of specialization. (A search on "module" in the latest 
    architecture spec shows that "module" is only used in the context of 
    specialization modules.)
    
    > The reuser can reference the nested topic from a
    > DITA map, using the chunk attribute to specify that the topic should be 
    > produced in its own file.
    
    I'm not sure why "chunk" is needed in this case--if I'm re-using a 
    topic, regardless of where it occurs within its containing document, it 
    will be processed without regard to its *parents*.
    
    That is, while there may not be a default *value* for chunk, there 
    absolutely is a default chunking behavior and it is definitely not 
    "select-document" (I'm not even sure select-document is a good idea but 
    that's a discussion for another day).
    
    In other words, where the parent topic of the directly-used topic 
    is not otherwise included in the map (or is not included via a parent 
    topicref of the topicref using the directly-used nested topic), the 
    parent topic should not be involved at all, unless the chunk value is 
    *explicitly set* to "select-document".
    
    > *Use of the chunk attribute*
    > 
    > When a set of topics is transformed for output using a map, the map 
    > author may use the chunk attribute to override the implementation 
    > specific default behavior. 
    
    I think a pointer to the implementation-specific stuff at the end of 
    this discussion would be useful here--until I got there it was not clear 
    what "implementation specific" meant.
    
    > The chunk attribute allows the map author to
    > request that multi-topic files be broken into smaller files, and that
    
    change to: "request that multi-topic XML documents be broken into 
    multiple XML documents"
    
    > multiple individual topics be combined into a larger files.  
    
    change to: "combined into larger XML documents."
    
    > *by–topic – *When the chunk attribute value includes the “by–topic” 
    > token, a chunking policy is established for the current topicref element 
    > where a separate output document is produced for each source topic in
    
    Here I think "document" should be "chunk" given that the details of the 
    outputs cannot be known in this statement and therefore may or may not 
    be documents in the XML sense (or any other sense).
    
    > the referenced document. 
    
    change: "topic in the referenced document" to "child topic of the 
    referenced topic". There is no sense in which topic-containing documents 
    can be referenced--only topics can be referenced using topicref 
    format="topic".
    
    > The policy only applies for a chunk action of
    > the current element (for example, to-content), except when it is set on 
    > the map or map specialization element, when the “by-topic” policy is 
    
    "or map specialization element" is unneeded--it is sufficient to say 
    "the map" as that implicitly includes any specializations of map.
    
    > *by–document – *When the chunk attribute value includes the 
    > “by–document” token, a chunking policy is established for the current 
    > topicref element where a single output document is produced for the 
    
    "output chunk"
    
    > Some tokens or combinations of tokens may not be appropriate for all 
    > output types. When unsupported or conflicting tokens are encountered 
    > during output processing, warning and error messages may be produced. 
    >  Recovery from such conflicts or other errors is implementation dependent.
    
    Change "may" to "should".
    
    This next paragraph is way too implementation dependent (it assumes a 
    file-based storage system for example). I would make it a more qualified 
    note:
    
    add: "NOTE: When creating new topics via chunk processing, the storage 
    object name or identifier (if relevant) is taken from the copy-to 
    attribute if set; otherwise the root name is taken from the id attribute 
    if the by-topic policy is in effect and from the name of the referenced 
    document if the by-document policy is in effect."
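
The fallback order in that note can be sketched in a few lines of Python. This is only an illustration, not anything defined by DITA: the function name and its parameters are hypothetical, and dropping the extension from the referenced document's name to get a "root name" is an assumption on my part.

```python
from pathlib import PurePosixPath
from typing import Optional


def chunk_root_name(policy: str, topic_id: str, document_name: str,
                    copy_to: Optional[str] = None) -> str:
    """Pick the name for a storage object created by chunk processing.

    Hypothetical helper: mirrors the order given in the note, nothing more.
    """
    # 1. An explicit copy-to value always wins.
    if copy_to:
        return copy_to
    # 2. by-topic: the root name comes from the topic's id attribute.
    if policy == "by-topic":
        return topic_id
    # 3. by-document: the root name comes from the referenced document.
    #    Assumption: "root name" means the name without its extension.
    if policy == "by-document":
        return PurePosixPath(document_name).stem
    raise ValueError(f"unknown chunking policy: {policy!r}")
```

With the names used in the examples, a by-topic chunk for topic P1 in parent1.dita gets the root name "P1", which matches the "P1.xxxx" output shown there.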
    
    
    > *Examples*
    > 
    > Given several single topic files, parent1.dita, parent2.dita, …, 
    
    c/files/documents/
    
    > child1.dita, child2.dita, …, grandchild1.dita, grandchild2.dita 
    > containing topics with ids P1, P2, …, C1, C2, …, GC1, GC2, …., several 
    > nested topic files, nested1.dita, nested2.dita, …, each containing two 
    
    c/files/documents/
    
    
    > Produces a single output file, P1.xxxx, containing topic P1 and topics 
    
    c/file/chunk/ and throughout
    
    
    > 
    > For use in the %topicref-atts; and %topicref-atts-no-toc; descriptions 
    > in Chapter 23 of the DITA 1.1 Language Reference Specification:
    >  
    > *Name* 	*Description* 	*Data Type* 	*Default Value* 	*Required?*
    > chunk 	When a set of topics is transformed using a map, the chunk 
    > attribute allows multi-topic files to be broken into smaller files, and 
    
    c/files/documents/
    
    > multiple individual topics to be combined into  larger combined files.
    
    c/files/documents/
    
    Cheers,
    
    Eliot
    -- 
    W. Eliot Kimber
    Professional Services
    Innodata Isogen
    8500 N. Mopac, Suite 402
    Austin, TX 78759
    (214) 954-5198
    
    ekimber@innodata-isogen.com
    www.innodata-isogen.com