OASIS Darwin Information Typing Architecture (DITA) TC

Re: [dita] Groups - New Action Item #0023 Embedding version numbers in cat...

  • 1.  Re: [dita] Groups - New Action Item #0023 Embedding version numbers in cat...

    Posted 09-12-2005 22:08





    The general requirement is to be able to distinguish different versions 
    of the DITA-defined DTD and schema components such that multiple 
    versions can co-exist within a single access space and such that there 
    is no ambiguity about which version a particular document requires.
    
    In addition, there needs to be the ability to refer to DTD and schema 
    components using a form that means "the latest version available".
    
    [For the purposes of the rest of this discussion I will use the term 
    "schema component" to mean the files that make up a schema, regardless 
    of that schema's implementation form: DTD, XSD, etc.]
    
    Schema components are files and therefore are ultimately addressed by 
    some form of "file name" or "external identifier", in XML terminology.
    
    In Web terms schema components are resources addressed by URI. However, 
    the DITA materials and most users of them retain the pre-XML use of 
    public identifiers to point to schema components. Regardless of the 
    initial form of reference, all references to schema components must be 
    resolved to actual files or storage objects. In the case of the DITA 
    materials as provided by OASIS, this translates into the locations of 
    files within the DITA distribution.
    
    Another related issue is the relationship between namespace URIs and 
    schema component URIs.
    
    It is generally accepted, and appeared to be the consensus of the DITA 
    TC, that a namespace identifies an *application*, which is a pure 
    abstraction and, as such, cannot meaningfully be versioned.
    
    However, at any point in time, the application will be represented by 
    some set of concrete implementing artifacts that are versioned. In the 
    case of DITA, these artifacts include the various schema components, 
    supporting documentation, processor implementations, and so forth.
    
    Thus, the general practice is that namespace URIs are invariant and do 
    not include any sort of version identifier. That is, DITA is DITA in all 
    its versions. However, there will certainly be many versions of the 
    implementing DITA components. [This issue can be clouded by the use of 
    numbers to distinguish related but distinct applications, e.g., "DITA 1" 
    and "DITA 2". In the context of this discussion "DITA 1" (i.e., the 
    current form of DITA on which the TC is working) is a completely 
    distinct application from "DITA 2", which will be developed at some 
    future time. This subtle distinction can of course lead one to tedious 
    existential discussions. For now just take it as given that the 
    foregoing is true.]
    
    This distinction is important in part because of the way that XML 
    documents can be associated with schemas using the mechanisms defined in 
    the XSD specifications and in part because it can help to avoid 
    confusion about what is and isn't versioned and therefore how names 
    should be constructed.
    
    Another important practical consideration is the general need for access 
    to schema components on a local system with locally-defined names and 
    locations for files, as opposed to always accessing components from some 
    central, public server. This consideration is addressed most generally 
    by the OASIS XML Catalog mechanism, which provides standard mechanisms 
    for mapping from one form of external identifier to another, ultimately 
    to some form that can be resolved by the processor using the catalog 
    (i.e., read from the local file system).
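    
    A minimal sketch of that chain of mappings, using Python's standard 
    library to read a small catalog. The public ID, the example.org URL, 
    and the file path are all hypothetical and chosen only for 
    illustration. The sketch follows the chain itself and ignores details 
    such as preference handling, rewrite rules, and delegation, so it is 
    not a substitute for a real catalog resolver.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog: the identifiers and paths are illustrative only.
CATALOG_XML = f"""
<catalog xmlns="{CATALOG_NS}">
  <public publicId="-//EXAMPLE//DTD Hypothetical Topic 1.0//EN"
          uri="http://example.org/dita/1.0/dtd/topic.dtd"/>
  <system systemId="http://example.org/dita/1.0/dtd/topic.dtd"
          uri="dita_1.0/dtd/topic.dtd"/>
</catalog>
"""


def load_mappings(catalog_xml):
    """Collect public and system entries into one identifier -> URI map."""
    root = ET.fromstring(catalog_xml)
    mappings = {}
    for entry in root:
        kind = entry.tag.split("}")[-1]  # strip the namespace part of the tag
        if kind == "public":
            mappings[entry.get("publicId")] = entry.get("uri")
        elif kind == "system":
            mappings[entry.get("systemId")] = entry.get("uri")
    return mappings


def resolve(identifier, mappings, max_hops=10):
    """Follow mappings until no entry matches; max_hops guards against loops."""
    current = identifier
    for _ in range(max_hops):
        if current not in mappings:
            break
        current = mappings[current]
    return current


mappings = load_mappings(CATALOG_XML)
# Public ID -> absolute URL -> local file in the distribution.
print(resolve("-//EXAMPLE//DTD Hypothetical Topic 1.0//EN", mappings))
```

    An identifier with no catalog entry comes back unchanged, which 
    leaves it to the processor to resolve by other means.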
    
    Thus, the situation we find with DITA is:
    
    - The use of both DTDs and XSD schemas for documents
    
    - For DTDs, the use of public IDs and non-absolute system IDs to point 
    to declaration sets
    
    - For XSD schemas, the use of absolute and non-absolute URIs to point to 
    schema components.
    
    - The shipment of XML catalogs as part of the base DITA implementation 
    materials
    
    The questions, then, are: what names are needed, and how should they be used?
    
    The names needed are, I think, the following:
    
    - For the DITA package itself, packaged as a tree of files, a top-level 
    directory that includes a distinguishing version identifier, e.g. 
    "dita_1.0/", "dita_1.1/", etc. Within the package, the locations of and 
    names of individual files should be changed as little as possible from 
    version to version in order to simplify XML catalog maintenance and 
    forestall confusion in users and implementors who move from one version 
    to another.
    
    - For each version of each schema component:
    
       - A normative absolute URL that unambiguously names that component, 
    distinct from all other components and from all other versions of that 
    component
    
       - A normative public identifier that unambiguously names that 
    component, distinct from all other components and from all other 
    versions of that component
    
    - For each schema component considered as a "resource" (a collection of 
    versions):
    
       - A normative absolute URL that identifies that resource with the 
    implicit semantic of "the latest version available".
    
       - A normative public identifier that identifies that resource with 
    the implicit semantic of "the latest version available".
    
       - A conventional filename for the component within the tree of files 
    provided in the DITA implementation package
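    
    The set of names listed above can be sketched as a small Python 
    class. The base URL, the public ID owner string, and the patterns used 
    to build the identifiers are assumptions made up for this example. 
    Only the "dita_1.0/" style of package directory comes from the 
    discussion above.

```python
from dataclasses import dataclass

BASE_URL = "http://example.org/dita"  # hypothetical base URL
OWNER = "-//EXAMPLE"                  # hypothetical public ID owner


@dataclass(frozen=True)
class SchemaComponent:
    description: str  # e.g. "DTD Hypothetical Topic"
    path: str         # conventional path inside the package, e.g. "dtd/topic.dtd"

    def versioned_url(self, version):
        """Absolute URL naming exactly one version of the component."""
        return f"{BASE_URL}/{version}/{self.path}"

    def versioned_public_id(self, version):
        """Public ID naming exactly one version of the component."""
        return f"{OWNER}//{self.description} {version}//EN"

    def latest_url(self):
        """Absolute URL meaning 'the latest version available'."""
        return f"{BASE_URL}/{self.path}"

    def latest_public_id(self):
        """Public ID meaning 'the latest version available'."""
        return f"{OWNER}//{self.description}//EN"

    def package_file(self, version):
        """Location of the file within a versioned package directory."""
        return f"dita_{version}/{self.path}"


topic = SchemaComponent("DTD Hypothetical Topic", "dtd/topic.dtd")
print(topic.versioned_url("1.0"))
print(topic.latest_public_id())
print(topic.package_file("1.1"))
```

    Because the path within the package does not change from version to 
    version, only the top-level directory and the version token in the 
    identifiers differ between releases.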
    
    Note that the public IDs and absolute URLs are synonyms, although the 
    absolute URLs have the advantage that they could be made resolvable. 
    Public IDs by contrast must always be mapped to a system ID.
    
    [Opinion: Public IDs are an anachronism that is not useful in XML and 
    should be replaced with absolute URIs in all cases. I realize that 
    some systems seem to depend on public IDs and that there is therefore 
    resistance to this move. Even so, I find the use of public IDs 
    potentially problematic because of the inherent ambiguity in XML of 
    whether to prefer the public ID or system ID when resolving external 
    identifiers. URIs present no such problem.]
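    
    To make the ambiguity concrete, here is a rough sketch of how a 
    catalog's "prefer" setting changes the outcome when a document 
    supplies both a public ID and a system ID. It reflects my reading of 
    the OASIS XML Catalogs rules in simplified form. The function name, 
    the identifiers, and the paths are hypothetical.

```python
def resolve_external_id(public_id, system_id, public_map, system_map,
                        prefer="public"):
    """Simplified resolution of an external identifier against a catalog.

    public_map and system_map stand in for the catalog's public and
    system entries. Rewrite rules and delegation are not modelled.
    """
    # A system entry that matches the supplied system ID is tried first.
    if system_id is not None and system_id in system_map:
        return system_map[system_id]
    # A public entry is used if public IDs are preferred, or if the
    # document supplied no system ID at all.
    if public_id is not None and public_id in public_map:
        if prefer == "public" or system_id is None:
            return public_map[public_id]
    # Otherwise fall back to the system ID exactly as written.
    return system_id


# Hypothetical catalog content: one public entry, no system entries.
public_map = {"-//EXAMPLE//DTD Hypothetical Topic//EN": "dita_1.0/dtd/topic.dtd"}
system_map = {}
pid = "-//EXAMPLE//DTD Hypothetical Topic//EN"

print(resolve_external_id(pid, "topic.dtd", public_map, system_map, "public"))
print(resolve_external_id(pid, "topic.dtd", public_map, system_map, "system"))
```

    The same document resolves to the catalog's file in the first call 
    and to the literal "topic.dtd" in the second, which is the kind of 
    dependence on configuration that a single absolute URI avoids.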
    
    Given the above, the DITA catalog for a given version of the 
    distribution must then provide the following entries:
    
    - Mappings from the version-specific public IDs to their corresponding 
    system IDs (which could be to the absolute URLs)
    
    - Mappings from the version-specific absolute URLs to their 
    corresponding files in the DITA distribution package
    
    - Mappings from the version-independent public IDs to their 
    corresponding system IDs (which could be to the absolute URLs)
    
    - Mappings from the version-independent absolute URLs to their 
    corresponding files in the DITA distribution package
    
    - Mappings from the DITA-defined namespaces that have governing XSD 
    schemas to those schemas.
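    
    As an illustration of the five kinds of entries, the following Python 
    sketch generates a catalog for one version of the distribution. The 
    catalog namespace and the public, system, and uri element names are 
    those of OASIS XML Catalogs. Everything else (the base URL, the owner 
    string, the identifier patterns, the namespace URI, and the file 
    paths) is hypothetical. File paths are written relative to the 
    catalog, on the assumption that the catalog sits at the top of the 
    package.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"
BASE_URL = "http://example.org/dita"  # hypothetical base URL
OWNER = "-//EXAMPLE"                  # hypothetical public ID owner


def build_catalog(version, components, namespaces):
    """Return catalog XML for one version of the distribution.

    components: list of (description, path relative to the catalog)
    namespaces: dict of namespace URI -> path of its governing XSD schema
    """
    ET.register_namespace("", CATALOG_NS)
    catalog = ET.Element(f"{{{CATALOG_NS}}}catalog")

    def add(kind, **attrs):
        ET.SubElement(catalog, f"{{{CATALOG_NS}}}{kind}", attrs)

    for description, path in components:
        versioned_url = f"{BASE_URL}/{version}/{path}"
        latest_url = f"{BASE_URL}/{path}"
        # 1. Version-specific public ID -> version-specific absolute URL
        add("public", publicId=f"{OWNER}//{description} {version}//EN",
            uri=versioned_url)
        # 2. Version-specific absolute URL -> file in the package
        add("system", systemId=versioned_url, uri=path)
        # 3. Version-independent public ID -> version-independent URL
        add("public", publicId=f"{OWNER}//{description}//EN", uri=latest_url)
        # 4. Version-independent absolute URL -> file in the package
        add("system", systemId=latest_url, uri=path)

    # 5. Namespace -> governing XSD schema
    for namespace_uri, schema_path in namespaces.items():
        add("uri", name=namespace_uri, uri=schema_path)

    return ET.tostring(catalog, encoding="unicode")


print(build_catalog(
    "1.0",
    [("DTD Hypothetical Topic", "dtd/topic.dtd")],
    {"http://example.org/ns/dita": "xsd/topic.xsd"},
))
```

    In the catalog shipped with a later version, the version-independent 
    entries would point at that version's files, which is what gives them 
    the "latest version available" meaning on a given system.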
    
    I have no particular opinion or insight about the details of the 
    version-specific identifiers themselves. It doesn't really matter as 
    long as the version information is clear.
    
    Cheers,
    
    Eliot
    -- 
    W. Eliot Kimber
    Professional Services
    Innodata Isogen
    9390 Research Blvd, #410
    Austin, TX 78759
    (512) 372-8155
    
    ekimber@innodata-isogen.com
    www.innodata-isogen.com
    
    

