Subject: RE: [xliff] Fwd: Handling escaped characters in Translation Units
Paul,
You've raised a good question. As you've discovered, XLIFF is flexible in
its representation of escaped characters. I don't believe there's one "right
way" or even necessarily a "best way" to convert source to XLIFF. To let you
know my perspective, most of the XLIFF work I've done is with XHTML and
JavaScript rather than MS C++ or Java. The committee's HTML profile document
(currently in draft), in fact, describes two different approaches to
converting HTML to XLIFF. Nevertheless, please consider the following principles:
(1) XLIFF text should not include computer language-specific encoding.
Ideally, in my opinion, XLIFF source text should be independent of the
computer language from which it was extracted. For example, "&mdash;" and
"&copy;" are HTML entities that should be converted to their equivalent
binary value, U+2014 and U+00A9 respectively, or XML character references,
"&#x2014;" and "&#xA9;" respectively. Similarly, \uNNNN and \xNN escaped
characters in C-like languages should be converted to their binary value or
XML character reference, with the binary value being preferred. Line-breaks,
\n and \r\n, however, should be converted to an XLIFF tag with ctype="lb",
as in <x id="1" ctype="lb" />.
(2) Use standard XLIFF types, when available, rather than custom "x-" types.
For example, use ctype="lb" for a line-break.
<x id="1" ctype="lb" />
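As a rough illustration of the two principles above, here is a minimal Python sketch. The function name to_xliff_text and the from_html flag are hypothetical, made up for this example, and are not part of any XLIFF tool. It assumes the extracted string still contains source-level escapes (a literal backslash followed by n, uNNNN or xNN), and it does not handle an escaped backslash followed by "n", so treat it as a starting point rather than a complete extractor.

```python
import html
import re
from xml.sax.saxutils import escape

# Source-level escapes as they appear in extracted C-like string literals:
# \uNNNN, \xNN, and the line-break sequences \r\n and \n.
_ESCAPE = re.compile(r'\\u([0-9A-Fa-f]{4})|\\x([0-9A-Fa-f]{2})|(\\r\\n|\\n)')


def to_xliff_text(raw: str, from_html: bool = False) -> str:
    """Hypothetical helper: return language-neutral XLIFF inline content.

    - HTML entities (when from_html is True) become literal characters.
    - \\uNNNN and \\xNN escapes become literal characters.
    - \\n and \\r\\n become <x id="N" ctype="lb" /> placeholders.
    - XML-special characters (&, <, >) are escaped for the XLIFF file.
    """
    if from_html:
        # "&mdash;" -> U+2014, "&copy;" -> U+00A9, and so on.
        raw = html.unescape(raw)

    parts = []
    next_id = 1  # ids of <x/> placeholders, numbered per call
    pos = 0
    for match in _ESCAPE.finditer(raw):
        parts.append(escape(raw[pos:match.start()]))
        if match.group(3):
            # Standard XLIFF line-break type instead of a custom "x-" type.
            parts.append(f'<x id="{next_id}" ctype="lb" />')
            next_id += 1
        else:
            code = match.group(1) or match.group(2)
            parts.append(escape(chr(int(code, 16))))
        pos = match.end()
    parts.append(escape(raw[pos:]))
    return "".join(parts)


if __name__ == "__main__":
    print(to_xliff_text(r"Line one\nLine two \u00A9 2004"))
    print(to_xliff_text("Terms &mdash; &copy; Example", from_html=True))
```

The sketch emits literal characters rather than XML character references, following the stated preference for the binary value; a real filter would also need to restore the original escapes when merging the translation back into the source format.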
Regards,
Doug Domeny