Chris, Erik, and Bruce,
I want to bring up a concern that was
discussed at last Monday’s meeting of the Translation Subcommittee. We
are working on an indexing best practice that ensures that index terms do not
interrupt sentence flow for segmentation. In the discussion of index ranges,
the translation professionals were concerned about having to duplicate the
index tag content in the start and end tags. Please clarify for us if that is
indeed the case:
Startindexterm = DITA and Endindexterm = DITA
The concern is that the indexterm may not
be translated exactly the same way by two different translators working on
topics in parallel, or even by the same translator working at different times on
different topics. That concern applies if the index start and end ranges can
span topics. Is that the case?
There was also a concern that having the
same text entered twice might result in a spelling error that would affect
translation.
Here are the recommendations that the SC
has been discussing. Please let me know if there are misconceptions.
- Insert index entries that refer to entire topics in the prolog element
using the <keywords> tag (prolog > metadata > keywords > indexterm). Index
entries in <keywords> should be processed as index terms referring to the
beginning of the referenced topic. (To be clarified.)
- Insert all block-level index tags immediately following the start tag of
the nearest containing block element.
- If an index term is intended to span several elements in one topic, insert
the start range at the beginning of the start block (i.e., the parent block
element) and the end range markup at the end of the end block element. (See
Chris Wong.)
- Question: Is this allowed at all? If you want an index term to span a group
of topics, insert the start range of the index tag in the prolog of the first
topic and the end tag in the prolog of the last topic in the DITA map.
(Note: Do not do this. Prolog across multiple topics.)
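To make the first two recommendations concrete, here is a minimal sketch of how they might look in a topic. The topic id, title, and entry text are made up for illustration. Range markup is left out on purpose, since how the start and end tags should carry content is the open question above.

```xml
<topic id="index-sample">
  <title>Indexing sample</title>
  <prolog>
    <metadata>
      <keywords>
        <!-- Topic-level entry: refers to the beginning of the whole topic -->
        <indexterm>DITA</indexterm>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <!-- Block-level entry: placed right after the start tag of the
         containing block, so it does not interrupt the sentence -->
    <p><indexterm>index ranges</indexterm>An index range marks a span of
      content rather than a single point.</p>
  </body>
</topic>
```

With this placement, neither index entry falls in the middle of a sentence, so segmentation for translation is not affected.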
Thanks for your help, … JoAnn
JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
303-232-7586
joann.hackos@comtech-serv.com
joannhackos Skype
www.comtech-serv.com
Well, we may need to discuss it, but here's a position statement.
As Chris Wong wrote,
index entries are point-like by default.
Here's a potential accommodation.
Chris wrote:
> We can leave open the possibility that a processor may elect
> to treat an indexterm in a topic prolog as a page range: for example, if that
> topic is deeply nested.
This one is a tempting accommodation, but
I'll try an argument that justifies not making this accommodation.
Suppose that we are looking at a topic
with no nested sub-topics.
When indexing the first reference to an
item, the entry should generate a point reference to the initial point where
that item enters the discussion. If the item is a prominent item within that
scope, a reference to the initial point is sufficient, because the
reader is likely to be interested in a large fraction of the scope without
being prompted by an index entry.
If the item is a subsidiary item in the
scope and only occurs once, a reference to the initial point is
sufficient, because the item only occurs once.
If the item is a subsidiary item that
occurs multiple times, or if the occurrences span multiple adjacent scopes,
then a page range is appropriate.
Now applying these guidelines to
topic-level index entries ...
A topic-level index entry is an assertion
that the item is a prominent item within that scope. The reference is to the
topic as a whole, and a reference to the initial point is sufficient.
A start-of-range assertion at the topic
level is not well defined. How do you know in a single topic that there will be
other subsequent topics that will address the same item? Ranges are inherently
appropriate for spans across contents of a topic or contents of a grouping of
topics.
In a map, a start-of-range assertion does
make sense.