OASIS Darwin Information Typing Architecture (DITA) TC


Cross-Deliverable Links and Key Resolution

Posted 12-04-2014 18:46
I think there may be some confusion about how I intended cross-deliverable references to be processed. This was captured in the discussion that Michael and I had about how to implement cross-deliverable link production, but since that concept didn't get included in the 1.3 spec I think most people have not paid attention to it.

The relevant aspect of cross-deliverable linking for this discussion is that the facility as specified explicitly does not require that you know for sure that a given peer key will actually be defined *at the time you author the link*. The reason for this is that the publications involved may be developed and produced asynchronously and with little coordination. Thus the keys you want to link to may not in fact have been literally defined at the time you author the links. You can only know whether a given key is resolvable when you produce the final deliverable of the document doing the linking: between the time you author the link and when you produce the final form, anything could happen to the target document.

In addition, if you use the generic key-based implementation approach that Michael and I developed, all references to peer keys become local key references when you produce the final deliverable, so normal key resolution rules apply during that final deliverable production process.

For all these reasons, peer key references simply have to be ignored for the purpose of determining whether or not a key is resolvable as long as you're not producing the final-form deliverable. The reason for this distinction between production of the final-form deliverable and any other processing you might be doing is that resolving cross-deliverable links is conceptually a multi-pass process, and that's how a lot of processors will implement it.
In particular, it is possible to have any amount of time elapse between when you do pass 1, as described below, and when you do pass 2: there is no requirement that they be performed together in time. Therefore I think we can reasonably expect that most processors will actually reflect these two passes in their implementations. The passes are:

Pass 1: Each publication involved in cross-deliverable linking is processed once to determine, *for that publication*, what deliverable anchors its keys become for that deliverable. This mapping of keys to deliverable addresses is saved for use in subsequent passes (it was the details of how this data could be saved that Michael and I discussed, arriving at the proposed interchange solution of using intermediate key definitions). For example, if topic "topic-01.dita" is referenced in the map by the topicref

<topicref keys="chapter-01" href="topic-01.dita"/>

and if for HTML output the result is the HTML file "chapter-01.html", then the deliverable-specific key-to-anchor mapping for this deliverable is "key 'chapter-01' maps to HTML file 'chapter-01.html'". This mapping can be represented by a normal key definition of the form:

<keydef keys="chapter-01" href="../../publication-02/chapter-01.html" scope="external" format="html"/>

Pass 2: Each publication involved in cross-deliverable linking is processed again, this time using the deliverable-specific key-to-anchor mappings for each of the target publications to resolve any key references to those publications.

Note that pass 1 does not *require* that any target peer maps be available, because you're only concerned with keys within each publication (that is, generating that publication's key-to-anchor map). It is not until pass 2 that the processor has to be able to resolve the cross-deliverable keys, and that is the point at which failures can and should be reported.
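To make the two-pass model concrete, here is a minimal sketch in Python. It is my own illustration, not an algorithm from the spec or from any DITA toolkit: the function names, the key-named-anchor convention, and the "../../publication" base path are all assumptions taken from the example above.

```python
def pass1_key_to_anchor(topicrefs, deliverable_ext=".html"):
    """Pass 1: for one publication, map each key to the anchor its
    topic becomes in the deliverable (e.g. key 'chapter-01' maps to
    'chapter-01.html'). No peer maps are needed for this pass."""
    mapping = {}
    for keys, href in topicrefs:  # href is unused in this toy naming scheme
        for key in keys.split():
            # Assumption for illustration: the output file is named
            # after the key, as in the chapter-01 example above.
            mapping[key] = key + deliverable_ext
    return mapping

def pass2_resolve(peer_keyrefs, exported_maps, base="../../"):
    """Pass 2: resolve each peer key reference using the saved
    key-to-anchor map of its target publication. Failures are
    detected and reported here, not during pass 1."""
    resolved, unresolved = {}, []
    for pub, key in peer_keyrefs:
        anchor = exported_maps.get(pub, {}).get(key)
        if anchor is None:
            unresolved.append((pub, key))  # normal undefined-key rules apply
        else:
            resolved[(pub, key)] = base + pub + "/" + anchor
    return resolved, unresolved
```

Running pass 1 over publication-02's single topicref and then pass 2 over a link into it reproduces the keydef shown above: the peer key reference ("publication-02", "chapter-01") resolves to "../../publication-02/chapter-01.html", while a key that the target never defines only surfaces as a failure in pass 2.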
Note also that there is an inherently loose coupling between these two passes: in the general case you don't know when or if any given target deliverable will itself be available, and therefore you don't necessarily know during pass 1 processing whether a given key will actually be resolvable when you go to do pass 2. You might have authored links to a key that you expect will be defined but that doesn't happen to be defined in the target publication at authoring time. As long as that key is defined and resolvable when you do pass 2, it's all good.

Thus, there can be processing contexts in which it is not known, and doesn't need to be known, that a peer key reference can't be resolved, namely the pass-1 processing for each publication. However, *if the peer maps are available*, processors certainly can check the key definitions if they choose to and report the issue. But how you manage your related publications relative to each other, and the generation of deliverables, is entirely a business decision: you could impose very tight controls or very loose controls, depending on what you need. The DITA-defined aspects of the process accommodate both loose and tight control.

For that reason, we cannot state the rule for peer keys as "if you can't resolve the key it is treated as an unresolvable key", because there are now valid processing contexts where you simply don't know whether the key is resolvable. I think the rule has to be stated in terms of producing final deliverables: at that point, the normal unresolvable-key rules should apply.

But there's more: the general mechanism Michael and I arrived at uses intermediate key definitions as the way of capturing the key-to-anchor binding, as shown above. The basic idea is that in pass 1 you generate a set of key definitions that reflect the key-to-anchor binding for the deliverable you're creating.
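The rule being argued for here can be stated as a small classification function. This is a hedged sketch of the reporting policy, not anything DITA-defined: the function name, the three status strings, and the final_deliverable flag are all mine.

```python
def key_status(scope, key, defined_keys, final_deliverable):
    """Classify a key reference for error-reporting purposes.

    scope            -- "local" or "peer", per the reference's key scope
    defined_keys     -- the set of keys resolvable in this processing context
    final_deliverable -- True only when producing the final-form deliverable
    """
    if scope == "peer" and not final_deliverable:
        # Pass-1 / authoring contexts: resolvability simply cannot be
        # known yet, so the reference is neither resolved nor an error.
        return "unknown"
    if key in defined_keys:
        return "resolved"
    # At final-deliverable time the intermediate keydefs have made peer
    # keys local, so the normal unresolvable-key rules apply.
    return "unresolvable"
```

The point of the "unknown" state is exactly the point made above: a processor that reports every undefined peer key during pass 1 is enforcing a business rule, not a DITA rule.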
These keys are declared with scope="external" and with a format reflecting the target deliverable (e.g., format="html", format="pdf", or whatever it is). In pass 2, each publication that links to that deliverable literally includes those key definitions before any locally-defined keys, so that the deliverable-specific keys take precedence. In this scenario, during pass 2 processing the key definitions are local to the publication making the cross-deliverable link, not peer, and so normal key processing rules apply: either the key is defined and it's all good, or it's not and the normal undefined-key rules apply.

Given this implementation approach, it should be clear that processors should ignore peer key references, at least for the purposes of applying unresolvable-key rules, because in the general case they can't know for sure whether the key is resolvable.

However, DITA users can choose to impose a rule that all peer maps must be available during pass 1 processing and that they should reflect the final set of keys that will be available in that publication. This is the "tightly controlled interlinked publication set" use case, e.g., what might be provided by a CCMS that manages the authoring and publication of all the publications in a related set, enforcing specific business rules for release and publication. (This was the use case I typically had in mind when thinking about this problem, e.g., the "all-knowing publication process manager".) In that case processors can check the resolvability of peer key references early and report them, or treat them as unresolvable, during pass 1 (or at some appropriate workflow checkpoint where all links are required to be resolvable). But that is an implementation and business-rule choice that is not inherent in the cross-deliverable link mechanism and that cannot be mandated by the standard.
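One possible shape for the pass-2 map of the linking publication, assuming the pass-1 keydefs are interchanged as a map file (the file name "publication-02-deliverable-keys.ditamap" is hypothetical). Because the first definition of a key in map order wins, referencing the generated map before the local definitions is what makes the deliverable-specific bindings take precedence:

```xml
<map>
  <title>Publication 01</title>
  <!-- Generated in pass 1 for publication-02's HTML deliverable;
       included first so its key definitions take precedence. -->
  <mapref href="publication-02-deliverable-keys.ditamap"/>
  <!-- Local key definitions and topicrefs follow. -->
  <keydef keys="intro" href="intro.dita"/>
  <topicref href="chapter-01.dita"/>
</map>
```

The generated map would contain keydefs of the form shown above, i.e. with scope="external" and a format matching the target deliverable.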
Note that Michael had a completely different and equally valid use case in mind: the "disconnected and lightly coordinated interlinked document set", where publications that link to each other are managed by different groups with very little direct coordination other than the interchange of the key-to-anchor maps necessary to produce publications that link to other publications.

In the context of Michael's use case, it should be clear that trying to enforce key resolvability during pass 1 is simply not generally useful or, in some cases, not possible, because you simply don't have the required key-to-anchor mapping during initial authoring, or maybe not until you do final deliverable generation for publication. In this disconnected case you might expect owners of documents to also interchange maps that provide just the key definitions to which other publications are allowed to link. In that case, early validation of key references would be possible. But again, this level of coordination is not required by the facility as specified or intended.

Cheers,

E.

—————
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com