OASIS Darwin Information Typing Architecture (DITA) TC

Chunking and Composite Topics

1. Chunking and Composite Topics

B. Urbina
Posted 11-01-2013 12:48
Hello All,

Kristen asked me to submit my recent work on the Chunking and Composite topic functions of DITA. With my colleagues Mark Poston and Rob Hanna, I have been experimenting with using maps to leverage content that’s been either created in or converted to composite topics. This email is an almost-copy-and-paste from our report to the client, but I’d also like to add my own (hastily put together) commentary.

<rant>

I find the chunking attribute syntax vastly overcomplicated. Instead of offering a good default that’s simply achieved, it offers something that’s expensive for vendors to implement and/or difficult to edit by hand. I have worked with the usual main players - FrameMaker, XMetaL, oXygen, Arbortext Editor - and none offer any help or special functions around chunking. It’s only an advanced-user feature, so it doesn’t really help move licenses for people getting started, and it requires quite a lot of UI to make usable. And the documentation in the spec is just a series of examples that don’t show full XML sets, just partial ones with a prose description of what should happen on output. Training on the functionality is a nightmare, and I have actually had to look up the spec in a course when asked a question, because the permutations are so many and the tools do nothing to help.

I would suggest that there are two use cases being addressed with the chunking attribute: one is merging files together, the other is reusing individual topics from multi-topic files. This may be overloading the attribute. The merging functionality makes sense, but the reuse/splitting options are rather opaque. I’d suggest that changing some of the default behaviours could make this much easier.

My own take would be: from a map, if you specify a child topic of a multi-topic file, then it’s safe to assume that that’s the topic you want, and not anything above it (this is how most things in XML work, so it follows logically). So the default meanings could be:

  <topicref href="..."> = “All topics in this file”
  <topicref href="..."> = “All topics from topic id1a down”
  <topicref href="..." chunk="select-topic"> = “topic id1a only”

(although it’s highly debatable whether this should be called using an attribute called “chunk” at all). In a CCMS that uses IDs, there should be no change; you just split on the # as usual.

I’d suggest that simplifying the parameters passed to @chunk would enable more users to take advantage of it. I’m sure many are, but because of the complexity, the lack of tool support, and the resulting difficulty for beginners, I believe many aren’t Googling the spec and learning how to use it.

</rant>

<reportextract>

Reusing topics from a ditabase topic

If one uses chunking and conditions on the topicrefs, then one can conditionally filter topics in and out and rearrange their hierarchy, even though they are stored in ditabase topics.

To reuse a topic from a ditabase topic:

1. Specify the topic id in the map and set the chunking attribute to “to-content select-topic” to insert a single topic, or to “to-content select-branch” for a topic and its descendants.

An example is supplied below of a ditabase-based file being split up and reordered.

File noz-test.dita

  <!DOCTYPE dita PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">
  <dita>
    <topic id="id1">
      <title>Topic 1</title>
      <body>
        Topic 1.
        Topic 1 has a cross reference to <xref href="...">Topic 1a</xref>.
        Topic 1 has a cross reference to <xref href="...">Topic 1b</xref>.
      </body>
      <topic id="id1a">
        <title>Topic 1a</title>
        <body>
          Topic 1a has a cross reference to <xref href="...">Topic 1</xref>.
          Topic 1a has a cross reference to <xref href="...">Topic 1b</xref>.
        </body>
        <topic id="id1b">
          <title>Topic 1b</title>
          <body>
            Topic 1b has a cross reference to <xref href="...">Topic 1</xref>.
            Topic 1b has a cross reference to <xref href="...">Topic 1a</xref>.
          </body>
        </topic>
      </topic>
    </topic>
  </dita>

Map

  <!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
  <map>
    <title>DITA Topic Map</title>
    <topicref href="..." chunk="to-content select-topic">
      <topicref href="..." chunk="to-content select-topic" audience="customerABC"/>
    </topicref>
    <topicref href="..." chunk="to-content select-topic"/>
    <reltable>
      <relrow>
        <relcell>
          <topicref href="..."/>
        </relcell>
        <relcell collection-type="sequence">
          <topicref href="..."/>
          <topicref href="..."/>
        </relcell>
      </relrow>
    </reltable>
  </map>

Note:

· There appears to be a bug in the DITA OT that prevents rendering of topics with mixed topic types. All topics must be of the same type or else the transformation fails. The bug is most likely in the Java extensions in the OT, not the XSLT. If this is the only problem, it should not be particularly difficult to debug. Infineon must decide whether to:
  o Fix the bug.
  o Make topics all the same type (most logically this would be all <topic>) within ditabase files. If this is done, then as users and content are migrated to the new, more modular way of working, the topic types can and should be applied on individual topics.
  o Not reuse below the topic level for now.
· The same limitations on xrefs apply with composite as with regular topics, and the same risks of broken links.

Limitations of composite topic type

· Simplified task is not included in the ditabase DTD. The ditabase DTD requires additional specialization to include simplified task.
· Composite files can only be categorised as a whole in the taxonomy. As they are burst, the topics contained will have to be categorised after they are created.
· All IDs need to be unique across all topics, not just unique within a topic.
· Additional stylesheet work may be required to achieve publishing features such as mini-tables of contents (or forward organizers).
· Whole assemblies must be versioned with any change to a topic rather than simply versioning a single topic.
· Topic-type OT bug as described above.

</reportextract>

<thanks> To you all for your attention. </thanks>

B. Noz Urbina - Business Development Manager
blog http://lessworkmoreflow.blogspot.com ¦ twitter @nozurbina
e noz.urbina@mekon.com ¦ UK mob +44 (0)7739 522 002 ¦ ES mob +34 625 467 866 ¦ skype nozskype
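
To make the three default meanings proposed in the rant concrete, here is a minimal Python sketch. It is not how the DITA OT or any editor resolves @chunk; it only models the suggestion that a bare file reference means every topic, a topic id means that topic and its descendants, and "select-topic" narrows the selection to the single topic. The function name `select_topics` and the trimmed `COMPOSITE` sample are hypothetical, and topics are matched by the literal element name `topic`, which is an assumption (a real processor would presumably match on @class so that specialized topics are found too).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the composite file from the post (DOCTYPE and bodies omitted).
COMPOSITE = """
<dita>
  <topic id="id1"><title>Topic 1</title>
    <topic id="id1a"><title>Topic 1a</title>
      <topic id="id1b"><title>Topic 1b</title></topic>
    </topic>
  </topic>
</dita>
"""


def select_topics(xml_text, topic_id=None, chunk=""):
    """Return the ids of the topics a topicref would pull in.

    topic_id: the fragment after '#', or None when the href names the whole file.
    chunk:    the @chunk value; only the 'select-topic' token is inspected.
    """
    root = ET.fromstring(xml_text)
    if topic_id is None:
        # No fragment: every topic in the file, in document order.
        return [t.get("id") for t in root.iter("topic")]

    # Locate the addressed topic anywhere in the file.
    target = next((t for t in root.iter("topic") if t.get("id") == topic_id), None)
    if target is None:
        raise KeyError(f"no topic with id {topic_id!r}")

    if "select-topic" in chunk.split():
        # Explicit request for the single topic, without its descendants.
        return [target.get("id")]

    # Proposed default: the addressed topic and everything below it.
    return [t.get("id") for t in target.iter("topic")]


if __name__ == "__main__":
    print(select_topics(COMPOSITE))
    print(select_topics(COMPOSITE, "id1a"))
    print(select_topics(COMPOSITE, "id1a", "to-content select-topic"))
```

Under this model the branch selection needs no chunk token at all, which is the simplification the post argues for.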
2. Preview of 13121 - reuse of elements from structural specializations - with discussion of changed syntax

Michael Priestley
Posted 11-01-2013 14:26
I haven't made the required changes to existing topics yet, but here are a couple of new topics, combining the rule discussion for reusing domain specialization elements (from DITA 1.2, but with syntax changes) and for reusing structural specialization elements (added for DITA 1.3).

I've made a couple of syntax changes in the domains attribute since the previous proposal, which will impact the troubleshooting proposal. I know that's a pain, and I apologize, but here's why I did it:

- For the domain syntax, as I looked at the examples and the previously agreed syntax I just couldn't come up with simple logical generalization rules. I had them for structural but couldn't replicate them with the domain dependencies. So I changed the domain syntax to be more in line with the structural one: e.g. (topic concept codeConcept+pr-d).

- For the structural syntax, we were including the fragment identifier as part of the dependency declaration, e.g. (topic troubleshooting+task/steps). But that extra /steps wasn't actually being used by any of the downstream processing, e.g. conref or generalization. So it was just eating up space in the domains attribute, which is already straining at its character limit. So I eliminated the fragment identifier, which means you can reuse multiple fragments from the same source (e.g. adding /steps as well as /steps-unordered) without having to add more values to the domains attribute: it just becomes (topic troubleshooting+task).

- But now the two types of dependency have the same syntax, but need to be treated differently (the discussion of the difference is in the new topic). So I changed the syntax even further, so that we could differentiate a structural dependency from a domain dependency by using "++" instead of "+": (topic troubleshooting++task)

Other alternatives I looked at:

- Could we identify the domains because of the naming syntax? Domains typically have -d suffixes. Answer: nope. "Typically" doesn't cut it - it's a naming convention, not a requirement. I checked the spec wording.

- Could we use the same +/- syntax we use in the class attributes to distinguish domains and structural specialization? I.e. use "+" for domain dependencies, and "-" for structural ones? Answer: ick. It would work technically, but (topic troubleshooting-task) suggests we're removing elements rather than adding them - it might be more consistent but at the cost of intuitiveness.

So with that out of the way, here's the actual new spec topic, with changes incorporated - feedback and discussion welcomed and encouraged:

Michael Priestley, Senior Technical Staff Member (STSM)
Total Information Experience (TIE) Technology Strategist
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25

Title: Sharing elements across specializations

The process of specialization allows for easy reuse of elements from ancestor specializations; however, it is also possible to reuse elements from other unrelated specializations, as long as the dependency is properly declared to prevent invalid generalization or conref processing.

A structural specialization can incorporate elements from unrelated domains or other structural specializations by referencing them in the content model of a specialized element. For example, a specialized reference topic could declare a specialized list <apilist> in which each <apilistitem> contains an <apiname> element, which is borrowed from the programming domain. Or a specialized concept topic could declare a specialized <process> section that contains the <steps> element borrowed from task.

The elements included in this manner must be specialized from ancestor content that is valid in the new context. For example, a specialized element can include the <steps> element in its content model if it could validly include <ol>, because <steps> is specialized from <ol>.

If the reusing and reused specializations share common ancestry, the reused elements must be valid in the reusing context at every level they share in common. For example, suppose there are two specializations of <task>: <mytask> and <yourtask>. The specializer of <mytask> wants to borrow <yoursteps> from <yourtask> instead of creating their own equivalent specialization. The <yoursteps> element is specialized from <steps> in <task>. The specializer of <mytask> can only include <yoursteps> where <steps> would be allowed; they cannot incorporate it anywhere <ol> is allowed, even though <ol> is in its ancestry, because the two specializations share <task> in their ancestry as well, and we need to maintain the validity of <task> as a generalization target. If we allowed <mytask> to transplant specializations of <steps> anywhere they wanted in the <mytask> model, then <mytask> would cease to be a valid specialization of <task>.

Although a well-designed structural specialization hierarchy with controlled use of domains is still the primary means of sharing and reusing elements in DITA, the ability to also share elements declared elsewhere in the hierarchy allows for situations where relevant markup comes from multiple sources and would otherwise be developed redundantly.

Domain attribute syntax for cross-specialization dependencies

When a structural specialization reuses elements from another structural or domain specialization, it has a dependency on the source specialization module that must be declared in the domains attribute in order for conref and generalization processing to function correctly.

The domains attribute must include a parenthetical expression that lists the ancestry for the current specialization. The final value, for the current specialization, must include the modules it depends on, separated by either a + for a dependency on a domain specialization or a ++ for a dependency on another structural specialization. In addition, it must include a separate parenthetical expression for each reused domain or structural specialization that lists the ancestry of the reused specialization.

Dependency on a domain specialization

For example, a codeConcept specialization could specialize from concept and include elements from the programming domain. Its ancestry, and the ancestry of the domain it uses, would be declared as:

  domains= ...(topic concept codeConcept+pr-d) (topic pr-d)

Dependency on a structural specialization

For example, a checklist specialization could specialize from reference and include elements from task. Its ancestry, and the ancestry of the structural specialization it uses, would be declared as:

  domains= ...(topic reference checklist++task) (topic task)

Generalization with cross-specialization dependencies

Dependencies across specializations limit generalization targets to those that either preserve the dependency or eliminate it. Some generalization targets will not be valid and should be detected before generalization occurs.

When a structural specialization has a dependency on a domain specialization, then the domain cannot be generalized without also generalizing the reusing structural specialization. For example, if programming domain elements in codeConcept were generalized to their topic equivalents without also generalizing codeConcept elements, then the places where the content model of a codeConcept element includes programming domain elements would become invalid. However, codeConcept could be generalized to concept or topic, without generalizing programming domain elements, as long as the target document type includes the programming domain.

When a structural specialization has a dependency on another structural specialization, then both must be generalized together to a common ancestor. For example, if the task elements in checklist were generalized without also generalizing checklist elements, then the checklist content models that referenced task elements would be broken. And if the checklist elements were generalized to topic without also generalizing the task elements, then the task elements would be out of place, since they cannot be validly present in topic. However, checklist and task can be generalized together to any ancestor they have in common: in this case topic.

Generalizing processes should detect invalid generalization target combinations and report them as errors.
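
The declaration syntax and the generalization rules in the draft topic can be sketched in a few lines of Python. This is only an illustration of the proposal as worded above, not an existing processor: the function names (`parse_domains`, `undeclared_dependencies`, `generalization_ok`) are hypothetical, a generalization target is modelled simply as a module name taken from an ancestry list, and the structural rule is read literally as "both stay, or both move to the same ancestor". The sketch also assumes that only the last token of a group carries dependencies.

```python
import re


def parse_domains(value):
    """Turn a domains value into one dict per parenthesized group."""
    groups = []
    for body in re.findall(r"\(([^()]*)\)", value):
        tokens = body.split()
        if not tokens:
            continue
        # Assumption: only the final token carries '+' / '++' dependencies.
        parts = re.split(r"(\+\+|\+)", tokens[-1])
        deps = list(zip(parts[1::2], parts[2::2]))  # [(separator, module), ...]
        groups.append({"ancestry": tokens[:-1] + [parts[0]], "deps": deps})
    return groups


def undeclared_dependencies(groups):
    """List dependencies that lack their own ancestry group."""
    declared = {g["ancestry"][-1] for g in groups}
    return [dep for g in groups for _, dep in g["deps"] if dep not in declared]


def generalization_ok(kind, user, dep, user_target, dep_target):
    """Check one generalization request against the rules in the topic.

    kind:        '+' for a domain dependency, '++' for a structural one.
    user, dep:   ancestry lists of the reusing and the reused module.
    user_target, dep_target: the module each is generalized to
                 (a module's own name means "left as is").
    """
    if user_target not in user or dep_target not in dep:
        return False  # a target must be the module itself or one of its ancestors
    user_kept = user_target == user[-1]
    dep_kept = dep_target == dep[-1]
    if kind == "+":
        # The domain may stay while the reuser generalizes, but not the reverse.
        return dep_kept or not user_kept
    # Structural: keep both, or move both to the same common ancestor.
    if user_kept or dep_kept:
        return user_kept and dep_kept
    return user_target == dep_target


if __name__ == "__main__":
    groups = parse_domains("(topic reference checklist++task) (topic task)")
    print(groups, undeclared_dependencies(groups))
    checklist, task = groups[0]["ancestry"], groups[1]["ancestry"]
    print(generalization_ok("++", checklist, task, "topic", "topic"))
    print(generalization_ok("++", checklist, task, "topic", "task"))
```

A generalizing process built along these lines could run the check before transforming anything and report a failed combination as an error, which is what the last sentence of the topic asks for.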

Attachment(s)

sharing-elements-across-modules.html 8 KB 1 version
3. Re: [dita] Preview of 13121 - reuse of elements from structural specializations - with discussion of changed syntax

Robert Thomas
Posted 11-02-2013 15:46
Hi Michael,

Nice job. I was able to follow and understand the spec topic. I also understand how it applies to the 13097 troubleshooting topic proposal (thank you for using it as an example).

You may want to add language that describes the syntax for a structural specialization that reuses both structural specialization elements and domain specialization elements. For instance, suppose that for the codeConcept example you wished to co-opt both the pr-d domain specialization and the properties element and its descendants from the reference topic. I have a couple of guesses of how the syntax would look:

1. (topic concept codeConcept++reference+pr-d)
2. (topic concept codeConcept+pr-d codeConcept++reference)

The second guess would be slightly easier to process, but the first guess would help economize attribute length.

I also wondered what the syntax would be if codeConcept borrowed from two domain specializations. For example, in codeConcept you might wish to incorporate elements from both pr-d and ui-d. Would it be (topic concept codeConcept+pr-d+ui-d), or the tokenized version similar to guess number two in the previous example?

Allowing borrowing from more than one specialization further facilitates reuse of existing markup, significantly improving the usability of the end specialization. But does allowing more than one specialization introduce too much complexity into the architecture for tools to implement? If not, how much complexity can be tolerated? Hypothetically, one of these new conglomerate specializations could refer to another such conglomerate specialization. Would a processor have to perform any gymnastics to reconcile the domains attributes in each, or would they be treated independently?
Best Regards,
Bob Thomas

--
Bob Thomas
+1 720 201 8260
Skype: bob.thomas.colorado
Instant messaging: Gmail chat ( bob.thomas@tagsmiths.com ) or Skype
Time zone: Mountain (GMT-7)
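
On the processing question raised in this reply, a small Python sketch suggests that both guessed forms could be read by the same tokenizer, so the choice between them may come down to attribute length rather than parsing effort. Neither form had been agreed on at this point in the thread, so the syntax handled here is an assumption, and the function name `read_dependencies` is hypothetical.

```python
import re


def read_dependencies(group_body):
    """Collect the dependencies declared inside one parenthesized group.

    Accepts both guessed forms: several dependencies chained on one token
    (codeConcept++reference+pr-d) or one token per dependency
    (codeConcept+pr-d codeConcept++reference).
    Returns (module, domain_deps, structural_deps).
    """
    module, domain, structural = None, [], []
    for token in group_body.split():
        parts = re.split(r"(\+\+|\+)", token)
        if len(parts) == 1:
            continue  # plain ancestor name, no dependency markers
        module = parts[0]
        # parts alternates: name, separator, dependency, separator, dependency...
        for sep, dep in zip(parts[1::2], parts[2::2]):
            (structural if sep == "++" else domain).append(dep)
    return module, sorted(domain), sorted(structural)


if __name__ == "__main__":
    guess_1 = "topic concept codeConcept++reference+pr-d"
    guess_2 = "topic concept codeConcept+pr-d codeConcept++reference"
    print(read_dependencies(guess_1))
    print(read_dependencies(guess_2))
    print(read_dependencies("topic concept codeConcept+pr-d+ui-d"))
```

The regular expression tries "++" before "+", so a structural dependency is never misread as two domain separators, and hyphens in names such as pr-d are left alone.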
4. Re: [dita] Preview of 13121 - reuse of elements from structural specializations - with discussion of changed syntax

Michael Priestley
Posted 11-04-2013 15:13
Hi Bob,

Thanks for the fast review and great suggestions. I'll add some more detailed examples to tease out the cases you suggest.

Michael Priestley, Senior Technical Staff Member (STSM)
Total Information Experience (TIE) Technology Strategist
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25
5. Re: Chunking and Composite Topics

Kristen Eberlein
Posted 11-05-2013 17:38
Hi, Noz. (And Mark and Rob by cc)

We talked about this briefly at today's TC meeting. While we cannot make any changes to chunking for DITA 1.3 -- the deadline for new proposals is long past -- I asked for volunteers to review the current content in the spec and make suggestions for improvement. And I got volunteers; Stan Doherty (Mathworks) and Chris Nitchie (Oberon Technologies) are on the hook for that work :)

Best,
Kris

Kristen James Eberlein
Principal consultant, Eberlein Consulting
Co-chair, OASIS DITA Technical Committee
Charter member, OASIS DITA Adoption Committee
www.eberleinconsulting.com
+1 919 682-2290; kriseberlein (skype)
6. RE: Chunking and Composite Topics

0 Recommend
B. Urbina
Posted 11-05-2013 17:39
Hi Kris,

That’s great news! Hopefully that will spur consumer demand, which will in turn get us some tool support. Thanks to Stan and Chris!

Noz
7. Re: [dita] Re: Chunking and Composite Topics

0 Recommend
Christopher Nitchie
Posted 12-02-2013 20:05
I spent some time over the holiday weekend reviewing chunking. The problem, as I see it, isn’t with the topic, per se; it’s the fact that the chunking attribute itself is extremely problematic. The default tokens are vague and difficult to remember, its functionality is based on assumptions that don’t apply for all processors, and most importantly, it is so overloaded as to be almost indescribable.

Fundamentally, it’s responsible for two use cases:

1. Customizing the behavior of references to subsets of compound topics/ditabases.
2. Combining content referenced by ‘this’ topicref and child topicrefs into a single output chunk.

The spec breaks this down further, into three things:

A. Selecting topics (select-*)
B. Splitting of those topics into chunks (by-*)
C. ‘Rendering’ the map branch (to-*)

I’d argue that (A) and (B) are different aspects of use case (1), and (C) is use case (2), though you wouldn’t know it from the current spec language. It’s not quite that clean, though, because the to-* tokens, as far as I can tell, play double duty, controlling both the combining of child topics and the results of the selection performed by select-*.

For example, when I reference ditabase.dita#TopicC:

* By default (with the OTK), the navigation points to TopicC, but all of ditabase.dita is rendered as a single chunk.
* chunk="by-topic" or "select-topic to-content" will extract only TopicC into its own chunk.
* chunk="select-branch to-content" will extract TopicC and its children into a single chunk.
* chunk="select-branch by-topic" will extract TopicC and its children, each to its own chunk (though specifying ‘to-content’ appears to override ‘by-topic’).

And so on. I’ve actually started putting together a cheat-sheet based on trial and error, because there’s no way I can keep all the different combinations in my head.
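To make the difference between the select-* tokens described above concrete, here is a minimal sketch in Python using only the standard library. It is not how the OTK or any other processor implements chunking. The sample ditabase content, the topic IDs other than TopicC, and the helper names `select` and `ids` are all hypothetical, and the sketch covers selection only, not the by-* or to-* behaviour.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical ditabase file; only the ID "TopicC" comes from the post.
DITABASE = """
<dita>
  <topic id="TopicA"><title>A</title>
    <topic id="TopicB"><title>B</title></topic>
  </topic>
  <topic id="TopicC"><title>C</title>
    <topic id="TopicD"><title>D</title></topic>
  </topic>
</dita>
"""


def select(root, topic_id, token):
    """Return the topic elements that a single select-* token would pick."""
    if token == "select-document":
        # Every top-level topic in the file, nested topics included.
        return list(root)
    target = next(
        (t for t in root.iter("topic") if t.get("id") == topic_id), None
    )
    if target is None:
        raise KeyError(topic_id)
    if token == "select-branch":
        # The referenced topic together with its descendants.
        return [target]
    if token == "select-topic":
        # The referenced topic alone: copy it and drop nested topics.
        alone = copy.deepcopy(target)
        for child in alone.findall("topic"):
            alone.remove(child)
        return [alone]
    raise ValueError(token)


def ids(topics):
    """Flatten the selection into a list of topic IDs, in document order."""
    return [t.get("id") for top in topics for t in top.iter("topic")]


root = ET.fromstring(DITABASE)
print(ids(select(root, "TopicC", "select-topic")))     # ['TopicC']
print(ids(select(root, "TopicC", "select-branch")))    # ['TopicC', 'TopicD']
print(ids(select(root, "TopicC", "select-document")))  # all four topics
```

Whether the selected topics end up in one output chunk or several is a separate decision, which is exactly the overloading the post complains about.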
Meanwhile, the only values that really matter when dealing with a parent trying to combine/split its children are the to-* tokens, to-navigation and to-content. The to-content value combines the topicref and its children into one chunk, to-navigation... doesn’t. I’m frankly mystified as to what to-navigation is supposed to do, and I’ve been at this for hours. The spec isn’t much help. It says something about “navigation chunks” but never really defines what that means, except in a parenthetical that I’m having trouble making sense of.

Re: default behavior. The spec more or less explicitly states that there is no spec-mandated default behavior, so a processor is free to chunk using the select-branch to-content algorithm suggested by Noz, and I know of at least one implementation that does. Sort of. (Arbortext selects the branch and throws the rest away, but from there, chunking/ToC generation is controlled by the stylesheet.)

I think a lot of the complexity/confusion here stems from the fact that the OTK does its best to have one output chunk per input topic/ditabase file, but the ‘chunk’ attribute allows you to tweak that. The spec is operating on the assumption that all DITA processing attempts to optimize along similar lines (there’s a similar issue with @copy-to), but nowhere does the spec (as far as I know) *mandate* this behavior. I’ve always found that optimization problematic because metadata from the topicref can cascade into the topic, and so it’s very difficult to determine equality between two topicrefs, even when they’re to the same URI. As we introduce more features like scoped keys and branch filtering, this problem will continue to get worse.

Post-1.3, I think we need to start moving away from that implicit one-to-one input topic/output chunk assumption in the spec, and move towards a paradigm where each (non-resource-only local dita) topicref represents its own output unit.
So for 1.3, I think we need to revisit the language describing to-navigation. The other spec-specified values are pretty good, taken in isolation; the challenge comes when trying to think through how different combinations of values might affect output, and it’s in the combinations that the real value lies. The existing examples are good, though I’d suggest adding a simple ‘select a branch from a ditabase’ example as #3; #1 is simple chunking, #2 is simple bursting, and then #3 jumps into nested chunking, so a simple branch-selection example might help ease people in. Other than that, though, I’m not sure how much we can do.

Post-1.3, I think we should consider deprecating the ‘chunk’ attribute altogether and replacing it with more fine-grained control attributes. Just off the top of my head (pseudo-DTD):

<!ATTLIST chunk-replacement
  topic-selection (topic | branch | all)
  topic-merge (yes | no)
  topic-nav (per-topic | first-topic)
>

* topic-selection controls what amount of the referenced file is considered the content unit. CDATA and extensible.
* topic-split indicates whether to break up the selected content unit into individual output chunks.
* topic-merge specifies whether to combine the referenced content unit (or pseudo-content-unit for topicheads) with content referenced beneath it into a single chunk. Topic-merge takes precedence over topic-chunk.
* topic-nav controls whether navigation/ToC entries are generated for nested topics in the logical content unit resulting from topic-selection and topic-merge (and possibly to what depth). As an alternative, we could extend @toc. CDATA and extensible.

I think this enables everything currently possible using the ‘chunk’ attribute, and the specified defaults map to the current default OTK behavior. It also allows something that I couldn’t get working in the OTK without multiple topicrefs, namely, including a compound topic as a single chunk with multiple TOC entries.
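As a rough illustration of how existing chunk values might line up with the proposed attributes, here is a small Python sketch. The mapping table is purely an assumption made for this example: neither the proposal nor the DITA spec defines it. The yes/no values for topic-split are guessed by analogy with topic-merge, to-navigation is left unmapped because its meaning is unclear, and the names `ChunkControls` and `translate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# ASSUMPTION: this token-to-attribute mapping is an illustrative guess,
# not something defined by the proposal or by the DITA specification.
LEGACY_TO_PROPOSED = {
    "select-topic": ("topic_selection", "topic"),
    "select-branch": ("topic_selection", "branch"),
    "select-document": ("topic_selection", "all"),
    "by-topic": ("topic_split", "yes"),
    "by-document": ("topic_split", "no"),
    "to-content": ("topic_merge", "yes"),
}


@dataclass
class ChunkControls:
    # None means "not specified"; no defaults are invented here.
    topic_selection: Optional[str] = None
    topic_split: Optional[str] = None
    topic_merge: Optional[str] = None


def translate(chunk_value: str) -> Tuple[ChunkControls, List[str]]:
    """Split a legacy chunk value into proposed attributes plus leftovers."""
    controls = ChunkControls()
    unmapped: List[str] = []
    for token in chunk_value.split():
        if token in LEGACY_TO_PROPOSED:
            name, value = LEGACY_TO_PROPOSED[token]
            setattr(controls, name, value)
        else:
            # Tokens with no agreed equivalent, such as to-navigation.
            unmapped.append(token)
    return controls, unmapped


print(translate("select-branch to-content"))
print(translate("to-navigation"))
```

Keeping unmapped tokens in a separate list, rather than guessing at an equivalent, makes it visible where the one-attribute and multi-attribute models do not obviously agree.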
Splitting the functions of the chunk attribute each into their own more specific, fine-grained attributes would, I think, make life easier for just about everybody.

Chris

Chris Nitchie
(734) 330-2978
chris.nitchie@oberontech.com
www.oberontech.com
Follow us: https://www.facebook.com/oberontech ¦ https://twitter.com/oberontech ¦ http://www.linkedin.com/company/oberon-technologies

From: Kristen James Eberlein <kris@eberleinconsulting.com>
Date: Tuesday, November 5, 2013 at 12:37 PM
To: Noz Urbina <noz.urbina@mekon.com>, "dita@lists.oasis-open.org" <dita@lists.oasis-open.org>
Cc: Mark Poston <mark.poston@mekon.com>, Rob Hanna <rob@infoarchitects.ca>
Subject: [dita] Re: Chunking and Composite Topics

Hi, Noz. (And Mark and Rob by cc)

We talked about this briefly at today's TC meeting. While we cannot make any changes to chunking for DITA 1.3 -- the deadline for new proposals is long past -- I asked for volunteers to review the current content in the spec and make suggestions for improvement. And I got volunteers; Stan Doherty (Mathworks) and Chris Nitchie (Oberon Technologies) are on the hook for that work :)

Best,
Kris

Kristen James Eberlein
Principal consultant, Eberlein Consulting
Co-chair, OASIS DITA Technical Committee
Charter member, OASIS DITA Adoption Committee
www.eberleinconsulting.com
+1 919 682-2290; kriseberlein (skype)

On 11/1/2013 8:47 AM, Noz Urbina wrote:

Hello All,

Kristen asked me to submit my recent work on the Chunking and Composite topic functions of DITA. With my colleagues Mark Poston and Rob Hanna we have been experimenting with using maps to leverage content that’s been either created in or converted to composite topics. This email is an almost-copy-and-paste from our report to the client, but I’d also like to add my own (hastily put together) commentary.

<rant>

I find the chunking attribute syntax vastly overcomplicated. Instead of offering a good default that’s simply achieved, it offers something that’s expensive for vendors to implement and/or difficult to edit by hand. I have worked with the usual main players - FrameMaker, XMetaL, oXygen, Arbortext editor - and none offer any help or special functions around chunking. It’s only an advanced-user feature, and so it doesn’t really help move licenses for people getting started, and it requires quite a lot of UI to make usable. And the documentation in the spec is just a series of examples that don’t have full XML sets shown, just partial ones with prose description of what should happen on output. Training on the functionality is a nightmare and I have actually had to look up the spec in a course when asked a question because the various permutations are so many and the tools do nothing to help.

I would suggest that there are two use cases being addressed with the chunking attribute: one is merging files together, the other is reusing them from files that are used together. This may be overloading the attribute. The merging functionality makes sense, but the reuse/splitting options are rather opaque. I’d suggest that by changing some of the default behaviours this could be made much easier.

My own take would be: From a map, if you specify a child topic of a multi-topic file, then it’s safe to assume that that’s the topic you want, and not anything above (this is how most things in XML work, so it follows logically). So the default meanings could be:

<topicref href="noz-test.dita"> = “All topics in this file”
<topicref href="noz-test.dita#id1a"> = “All topics from topic id1a down”
<topicref href="noz-test.dita#id1a" chunk="select-topic"> = “topic id1a only” (although it’s highly debatable whether this should be called using an attribute called “chunk” at all).

In a CCMS that uses IDs, there should be no change, you just split on the # like usual.

I’d suggest that simplifying the parameters passed to @chunk would enable more users to take advantage of it. I’m sure many are, but because of the complexity, lack of tool support, and resulting difficulty of use for beginners, I believe many aren’t Googling the spec and learning how to use it.

</rant>

<reportextract>

Reusing topics from a ditabase topic

If one uses chunking and conditions on the topicrefs then you can conditionally filter topics in and out and rearrange their hierarchy, even though they are stored in ditabase topics. To reuse a topic from a ditabase topic:

1. Specify the topic id in the map and set the chunking attribute to “to-content select-topic” to insert a single topic or “to-content select-branch” for a topic and its descendants.

An example is supplied below of a DITAbase-based file being split up and reordered.

File noz-test.dita

<!DOCTYPE dita PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">
<dita>
  <topic id="id1">
    <title>Topic 1</title>
    <body>
      Topic 1.
      Topic 1 has a cross reference to <xref href="#id1a">Topic 1a</xref>.
      Topic 1 has a cross reference to <xref href="#id1b">Topic 1b</xref>.
    </body>
    <topic id="id1a">
      <title>Topic 1a</title>
      <body>
        Topic 1a has a cross reference to <xref href="#id1">Topic 1</xref>.
        Topic 1a has a cross reference to <xref href="#id1b">Topic 1b</xref>.
      </body>
      <topic id="id1b">
        <title>Topic 1b</title>
        <body>
          Topic 1b has a cross reference to <xref href="#id1">Topic 1</xref>.
          Topic 1b has a cross reference to <xref href="#id1a">Topic 1a</xref>.
        </body>
      </topic>
    </topic>
  </topic>
</dita>

Map

<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
  <title>DITA Topic Map</title>
  <topicref href="noz-test.dita#id1b" chunk="to-content select-topic">
    <topicref href="noz-test.dita#id1" chunk="to-content select-topic" audience="customerABC"/>
  </topicref>
  <topicref href="noz-test.dita#id1a" chunk="to-content select-topic"/>
  <reltable>
    <relrow>
      <relcell>
        <topicref href="noz-test.dita#id1b"/>
      </relcell>
      <relcell collection-type="sequence">
        <topicref href="noz-test.dita#id1"/>
        <topicref href="noz-test.dita#id1a"/>
      </relcell>
    </relrow>
  </reltable>
</map>

Note:

· There appears to be a bug in the DITA OT that prevents rendering of topics with mixed topic types. All topics must be of the same type or else the transformation fails. The bug in the DITA OT is most likely in the Java extensions in the OT, not the XSLT. It should not be - if this is the only problem - particularly difficult to debug. Infineon must decide whether to:
  o Fix the bug
  o Make topics all the same type (most logically this would be all <topic>), within ditabase files. If this is done, as users and content are being migrated to the new, more modular way of working the topic types can and should be applied on individual topics.
  o Not reuse below the topic level for now.
· The same limitations on xrefs apply with composite as with regular topics, and the same risks of broken links.

Limitations of composite topic type

· Simplified task is not included in the ditabase DTD. Ditabase DTD requires additional specialization to include simplified task.
· Composite files will only be able to be categorised as a whole in the taxonomy. As they are burst, the topics contained will have to be categorised after they are created.
· All IDs need to be unique across all topics – not just unique within a topic.
· Additional stylesheet work may be required to achieve publishing features such as mini-tables of contents (or forward organizers).
· Whole assemblies must be versioned with any change to a topic rather than simply versioning a single topic.
· Topic-type OT bug as described above.

</reportextract>

<thanks> To you all for your attention. </thanks>

B. Noz Urbina – Business Development Manager
blog http://lessworkmoreflow.blogspot.com ¦ twitter @nozurbina
e noz.urbina@mekon.com ¦ UK mob +44 (0)7739 522 002 ¦ ES mob +34 625 467 866 ¦ skype nozskype

