Chris, Erik, and Bruce,
I want to bring up a
concern that was discussed at last Monday’s meeting of the Translation
Subcommittee. We are working on an indexing best practice that ensures that
index terms do not interrupt sentence flow for segmentation. In the discussion
of index ranges, the translation professionals were concerned about having to
duplicate the index tag content in the start and end tags. Please clarify for us
if that is indeed the case:
Startindexterm = DITA
and Endindexterm = DITA
The concern is that the indexterm may not be translated exactly the same way
by two different translators working on topics in parallel, or even by the same
translator working at different times on different topics. That concern applies
only if the index start and end ranges can span topics (is that the case?).
There was also a
concern that having the same text entered twice might result in a spelling error
that would affect translation.
Here are the
recommendations that the SC has been discussing. Please let me know if there are
misconceptions.
- Insert index entries that refer to entire topics in the prolog element using
the <keywords> tag (prolog/metadata/keywords/indexterm). Index entries using
<keywords> should be processed as index terms referring to the beginning of the
referenced topic. (Clarify.)
- Insert all block-level index tags immediately following the start tag of the
nearest containing block element.
- If an index term is intended to span several elements in one topic, insert
the start range at the beginning of the start block (i.e., the parent block
element) and the end range markup at the end of the end block element. (See
Chris Wong.)
- Question: Is this allowed at all? If you want an index term to span a group
of topics, insert the start range of the index tag in the prolog of the first
topic and the end tag in the prolog of the last topic in the DITA map. (Do not
do this. Prolog across multiple topics.)
Thanks for your help, …
JoAnn
JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
303-232-7586
joann.hackos@comtech-serv.com
Skype: joannhackos
www.comtech-serv.com
Well, we may need to discuss it, but here's a position
statement.
As Chris Wong
wrote, index entries are point-like by
default.
Here's a potential accommodation.
Chris wrote:
> We can leave open the possibility that a processor may elect to treat an
> indexterm in a topic prolog as a page range: for example, if that topic is
> deeply nested.
This one is a tempting accommodation, but I'll try an argument that justifies
not making this accommodation.
Suppose that we are
looking at a topic with no nested sub-topics.
When indexing the first
reference to an item, the entry should generate a point reference to the initial
point where that item enters the discussion. If the item is a prominent item
within that scope, a reference to the initial point is sufficient,
because the reader is likely to be interested in a large fraction of the scope
without being prompted by an index entry.
If the item is a
subsidiary item in the scope and only occurs once, a reference to the initial
point is sufficient, because the item only occurs
once.
If the item is a subsidiary item that occurs multiple times, or if the
occurrences span multiple adjacent scopes, then a page range is appropriate.
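As a rough illustration only, the three guidelines above could be sketched as a small decision function. The `IndexedItem` type and `reference_kind` name are hypothetical, not part of any DITA processor, and checking "spans adjacent scopes" before prominence is one reading of the guidelines, not a settled rule.

```python
from dataclasses import dataclass


@dataclass
class IndexedItem:
    """Hypothetical description of one indexed item within a scope."""
    prominent: bool                      # is the item a main subject of the scope?
    occurrences: int                     # how often it appears within the scope
    spans_adjacent_scopes: bool = False  # do occurrences continue into neighboring scopes?


def reference_kind(item: IndexedItem) -> str:
    """Return "point" or "range" for an index entry (assumed reading of the guidelines)."""
    if item.spans_adjacent_scopes:
        # Occurrences spread over several adjacent scopes: a range is appropriate.
        return "range"
    if item.prominent:
        # The reader will read most of the scope anyway: point at the start.
        return "point"
    if item.occurrences > 1:
        # Subsidiary item that recurs: a range is appropriate.
        return "range"
    # Subsidiary item that occurs once: a point reference is sufficient.
    return "point"
```

Under this reading, a topic-level entry corresponds to `prominent=True` within a single scope, which always yields a point reference.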
Now applying these
guidelines to topic-level index entries ...
A topic-level index
entry is an assertion that the item is a prominent item within that scope. The
reference is to the topic as a whole, and a reference to the initial point is
sufficient.
A start-of-range
assertion at the topic level is not well defined. How do you know in a single
topic that there will be other subsequent topics that will address the same
item? Ranges are inherently appropriate for spans across contents of a topic or
contents of a grouping of topics.
In a map, a
start-of-range assertion does make sense.
-----Original Message-----
From: Chris Wong [mailto:cwong@idiominc.com]
Sent: Monday, July 10, 2006 6:05 PM
To: Erik Hennum; Grosso, Paul
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] indexing question
One question that
comes to mind is: why would you want a page range that spans one and only one
topic? For example, I pulled out my old "XML in a Nutshell" and looked up
"Arabic Unicode block". This table spans 2 pages, but is only indexed with a
page number pointing to the start of the topic. That is because the topic is
so obviously self-enclosed that a single page reference is sufficient.
What I'd say is that an indexterm in a topic prolog points to the topic. Page
range markers in a topic prolog have no meaning, since the indexterm is out of
the content flow. So index-range-start/index-range-end should be ignored. This
will allow an author to generate an index reference to a single topic by
entering an indexterm in the topic prolog.
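To make that suggestion concrete, here is a minimal sketch of how a processor might handle prolog entries under this rule. The `PrologEntry` type and `prolog_references` function are hypothetical names for illustration; only the marker names index-range-start and index-range-end come from the text above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PrologEntry:
    """Hypothetical model of one index-related element found in a topic prolog."""
    term: str
    kind: str  # "indexterm", "index-range-start", or "index-range-end"


def prolog_references(topic_id: str, entries: List[PrologEntry]) -> List[Tuple[str, str]]:
    """Return (term, topic_id) point references for the prolog of one topic."""
    refs = []
    for entry in entries:
        if entry.kind == "indexterm":
            # A prolog indexterm points to the topic as a whole.
            refs.append((entry.term, topic_id))
        # Range markers in the prolog are outside the content flow, so they are ignored.
    return refs
```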
We can leave open
the possibility that a processor may elect to treat an indexterm in a
topic prolog as a page range: for example, if that topic is deeply nested.
Hi, Paul (Grosso) and Indexing Enthusiasts:
To follow up on the index range
question, we had a fair bit of discussion about ranges last Fall. The
consensus at the time was that ranges should be set explicitly. A sample from
the thread:
http://www.oasis-open.org/apps/org/workgroup/dita/email/archives/200510/msg00020.html
I guess my perspective remains that an indexterm
in the prolog could be treated as a special case of a general rule: that an
indexterm covers the content of its container and that processing emits a page
range if the indexed container extends to more than 2 pages.
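A sketch of that general rule, assuming the formatter already knows the first and last page of the indexed container. The function name and the `threshold` parameter are illustrative assumptions; the value 2 comes from the "more than 2 pages" wording above.

```python
def format_locator(first_page: int, last_page: int, threshold: int = 2) -> str:
    """Return a single page number, or a page range if the container is long enough."""
    if last_page < first_page:
        raise ValueError("last_page must not precede first_page")
    pages_covered = last_page - first_page + 1
    if pages_covered > threshold:
        # The container extends to more than `threshold` pages: emit a range.
        return f"{first_page}-{last_page}"
    # Otherwise a point reference to the start of the container is enough.
    return str(first_page)
```

With this reading, a topic occupying pages 10 and 11 is indexed as a single page, while one running from page 10 through page 12 is indexed as a range.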
Even so,
I don't want to undo the progress we've made:
http://www.oasis-open.org/apps/org/workgroup/dita/download.php/18917/IssueNumber45b.html
Can we isolate any anomalies in the
current indexing proposal and fix those quickly without changing the
fundamental approach?
Paul, are you aware of other hiccups besides the
requirement to index an entire topic from start of the title through the end
of the related links or the end of the nested topics?
Chris (Wong), as
the lead on the indexing proposal, do you have any
suggestions?
Hoping that's useful,
Erik Hennum
ehennum@us.ibm.com
"Grosso, Paul" <pgrosso@ptc.com>
From: Erik Hennum [mailto:ehennum@us.ibm.com]
Sent: Thursday, 2006 June 29 18:13
To: JoAnn Hackos
Cc: dita@lists.oasis-open.org; Grosso, Paul
Subject: RE: [dita] indexing question
> That said, we still need a way to generate a range over the whole topic.
Huh? I would have thought what you just said in the first paragraph means that
an indexterm within the prolog generates a range over the whole topic. Now I'm
really confused.