Novell, Inc., THE leading provider of Net business solutions
http://www.novell.com
>>> Enda McDonnell <EndaMcD@alchemysoftware.ie> 04/15/02 10:28 >>>
Hi,
In the current specification, the tool attribute is free text; the 1.0 spec says that it "is used to specify the signature and version of the tool that created or modified the document".
However, this mechanism is a bit loose and open to misuse. For example, a tool may omit the version number. Including tool-name and tool-version attributes in the next version would be a better solution.
Regarding a tools registry, I don't think we could limit the names to a standard list. The hope is that as many tools as possible will use this xliff format. Is it necessary to have a naming convention for tool names?
A convention is too easy to ignore; I think the best solution may be to introduce another attribute, tool-company. This way a tool can be clearly defined as
tool-company = ACME
tool-name = Killer App
tool-version = 4.0
and not in a confusing manner such as
tool = ACME Killer
or
tool = ACME Ltd. Killer 4
or
tool = ACME, Killer App
or
tool = ACME Ltd., Killer App 4.0
I will write this up in more detail and propose these additions to the TC for the next release of xliff.
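As a rough illustration of the difference, here is a minimal Python sketch. It assumes the three proposed attributes (tool-company, tool-name, tool-version) are placed on a <file> element; they are only a proposal and are not part of the 1.0 spec. The helper name describe_tool is hypothetical.

```python
import xml.etree.ElementTree as ET

# The four free-text spellings listed above; all are meant to name the same tool.
FREE_TEXT = [
    "ACME Killer",
    "ACME Ltd. Killer 4",
    "ACME, Killer App",
    "ACME Ltd., Killer App 4.0",
]


def describe_tool(element):
    """Return (company, name, version) from the proposed attributes.

    Any attribute that is missing comes back as None.
    """
    return (
        element.get("tool-company"),
        element.get("tool-name"),
        element.get("tool-version"),
    )


# Hypothetical <file> element carrying the proposed attributes.
file_el = ET.Element(
    "file",
    {"tool-company": "ACME", "tool-name": "Killer App", "tool-version": "4.0"},
)
structured = describe_tool(file_el)

# A plain string comparison treats every free-text spelling as a different tool.
distinct_free_text = len(set(FREE_TEXT))
```

With separate attributes a consumer can compare company, name and version field by field, whereas the free-text values would need guesswork to reconcile.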
Enda
-----Original Message-----
From: Stephen Holmes [mailto:sholmes@novell.com]
Sent: 14 April 2002 11:54
To: xliff@lists.oasis-open.org; John Reid
Subject: Re: [xliff] XLIFF 1.0 issues
Thanks for the information. Can I ask then, does the tool information capture the version of the tool as well - is it just a free-text attribute?
Reason: Producing the count in the first instance is the responsibility of the content parser. As the parser may be revised X times to address defects, add functionality etc, can we look at a standardised way of specifying the tool/parser and version?
Is this XLIFF group, or some subcommittee, looking at a tools registry, i.e., agreed and standard names for the tools out there, or some form of guidelines for creating these tool names?
Finally, is there a plan to integrate/leverage the LISA findings on this topic?
Steve.
Stephen Holmes
Localisation Development Manager
International Product Development
Voice: +353 (1) 241 5732
Fax: +353 (1) 241 5749
Novell, Inc., THE leading provider of Net business solutions
http://www.novell.com
>>> John Reid <JREID@novell.com> 04/12/02 19:05 PM >>>
On Point 1:
[Alt-jr1] The purpose for including the tool as an attribute of the <count-group> is so that Tool Y will know that the counts it is about to use/update are not its own. Thus, Tool Y may want to produce its own. However, if Tool Y is compatible with Tool X, then it can use Tool X's counts. Meanwhile, Tool X can find its own counts and update accordingly.
<count-group tool="Tool X" name="example">
<count count-type="untranslated">132</count>
</count-group>
This does become complicated, though, when Tool Z is used. Tool Z may have a compatibility issue with Tool Y but not Tool X. If Tool Y updated Tool X's counts, Tool Z may use those inaccurately.
[Alt-jr2] There is another solution to this: We already have a <phase> element that stores the tool used in that phase. The phase-name attribute could be added to <count>. Thus, when that count was produced, and by what, could be ascertained by any subsequent tool, and a determination of whether to use the count could be made.
<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
</phase-group>
.
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
</count-group>
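A minimal Python sketch of the lookup Alt-jr2 implies: each <count> is resolved through its phase-name to the <phase> that records the tool and date. The phase-name attribute on <count> is the proposal above, not part of the 1.0 spec; the <wrapper> root element and the count_provenance helper are hypothetical, added only so the fragment can be parsed.

```python
import xml.etree.ElementTree as ET

# The example above, wrapped in a hypothetical root so it parses on its own.
SNIPPET = """
<wrapper>
<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
</phase-group>
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
</count-group>
</wrapper>
""".strip()


def count_provenance(root):
    """Return (count value, tool, date) for every <count> under root.

    Tool and date are None when the referenced phase does not exist.
    """
    phases = {p.get("phase-name"): p for p in root.iter("phase")}
    result = []
    for count in root.iter("count"):
        phase = phases.get(count.get("phase-name"))
        tool = phase.get("tool") if phase is not None else None
        date = phase.get("date") if phase is not None else None
        result.append((int(count.text), tool, date))
    return result


provenance = count_provenance(ET.fromstring(SNIPPET))
```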
Then again, with either method, a tool has to update the attribute of a count or count-group element and historical data is lost. Thus, the phase-name and the tool attribute methods have essentially the same consequence: Tool Z knows which tool last touched the named count. Using a tool attribute has the advantage of being in the scope of the current node. The phase-name has the advantage of carrying additional information such as date.
[Alt-jr3] Another alternative is to add a new element to <count-group>, such as <update>, that has attributes of tool and date. Thus, multiple updates could be recorded for a <count-group>. This would need to be at the <count-group> level since we do want to keep the contents of <count> to the actual count.
<count-group name="example">
<update tool="Tool X" date="2002-04-10T09:41:02Z"/>
<count phase-name="create" count-type="untranslated">132</count>
</count-group>
This solution has the disadvantage that it implies an update to all the
counts within a <count-group> of which there may be many and only one
updated. This is also a weakness for adding the tool attribute to
<count-group>.
Alt-jr2 can be used to keep historical data since there is no restriction on the number of counts that can be stored within the <count-group>. Thus, Tool X can supply a count in phase 1,
<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
</phase-group>
.
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
</count-group>
Tool Y can add an update to it in phase 2,
<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
<phase phase-name="translate" process-name="Translation" tool="Tool Y" date="2002-04-11T11:43:04Z"/>
</phase-group>
.
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
<count phase-name="translate" count-type="untranslated">43</count>
</count-group>
and Tool Z can update Tool X's count and ignore Tool Y's in phase 3.
<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
<phase phase-name="translate" process-name="Translation" tool="Tool Y" date="2002-04-11T10:42:03Z"/>
<phase phase-name="review" process-name="Translation" tool="Tool Z" date="2002-04-12T11:43:04Z"/>
</phase-group>
.
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
<count phase-name="translate" count-type="untranslated">43</count>
<count phase-name="review" count-type="untranslated">56</count>
</count-group>
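With the history kept this way, a later tool can choose which counts to rely on. The Python sketch below shows one possible selection rule: take the most recent count of a given type among those produced by tools the reader considers compatible. The <wrapper> root, the latest_trusted_count helper and the idea of a "trusted tools" set are illustrative assumptions; nothing in the spec or the proposal prescribes this rule.

```python
import xml.etree.ElementTree as ET

# The phase 3 example above, wrapped in a hypothetical root so it parses alone.
SNIPPET = """
<wrapper>
<phase-group>
<phase phase-name="create" process-name="Translation" tool="Tool X" date="2002-04-10T09:41:02Z"/>
<phase phase-name="translate" process-name="Translation" tool="Tool Y" date="2002-04-11T10:42:03Z"/>
<phase phase-name="review" process-name="Translation" tool="Tool Z" date="2002-04-12T11:43:04Z"/>
</phase-group>
<count-group name="example">
<count phase-name="create" count-type="untranslated">132</count>
<count phase-name="translate" count-type="untranslated">43</count>
<count phase-name="review" count-type="untranslated">56</count>
</count-group>
</wrapper>
""".strip()


def latest_trusted_count(root, count_type, trusted_tools):
    """Return the newest count of count_type made by a trusted tool, or None."""
    phases = {p.get("phase-name"): p for p in root.iter("phase")}
    candidates = []
    for count in root.iter("count"):
        if count.get("count-type") != count_type:
            continue
        phase = phases.get(count.get("phase-name"))
        if phase is None or phase.get("tool") not in trusted_tools:
            continue
        # Timestamps in this fixed UTC format sort chronologically as strings.
        candidates.append((phase.get("date"), int(count.text)))
    return max(candidates)[1] if candidates else None


root = ET.fromstring(SNIPPET)
before_review = latest_trusted_count(root, "untranslated", {"Tool X"})
after_review = latest_trusted_count(root, "untranslated", {"Tool X", "Tool Z"})
```

Here a reader that trusts only Tool X sees the original count, while one that also trusts Tool Z sees the review-phase count; Tool Y's count is skipped in both cases.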
Thoughts?
cheers,
john
>>> Stephen Holmes <sholmes@novell.com> 4/11/02 4:44:58 PM >>>
On point 1, I'd just make the comment that the value of adding the tool that created the wordcount as an attribute is of relatively little use if you take a situation where, for example, "Tool X" generates the data, but "Tool Y" reads it for processing and has different ideas about what constitutes a word count.
It's an age-old problem in localisation - "Who has the correct word count?". As tools may be completely proprietary, even if based on XLIFF containers, I see no reason in complicating the attribute qualifiers. This may become the topic of a subcommittee...
On point 3 - bear in mind that localisation/language tools that aspire to be network-based will find base64 encoded content to be monumentally large to transfer. Europe, remember, is still predominantly 56K and we all remember the hassle involved in FedEx'ing CD's to China - business reality supersedes specification.
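To put a rough number on the size concern: Base64 writes every 3 input bytes as 4 output characters, so encoded content is about one third larger than the raw binary. The Python sketch below measures that on random data and estimates a nominal transfer time. The 3 MB payload and the transfer_seconds helper are illustrative assumptions, and the estimate ignores protocol overhead and any modem compression.

```python
import base64
import os


def transfer_seconds(n_bytes, bits_per_second=56_000):
    """Nominal transfer time at the given line rate, ignoring all overhead."""
    return n_bytes * 8 / bits_per_second


raw = os.urandom(3_000_000)        # stand-in for a 3 MB binary file
encoded = base64.b64encode(raw)    # what would be stored as element content

growth = len(encoded) / len(raw)   # ratio of encoded size to raw size
raw_time = transfer_seconds(len(raw))
encoded_time = transfer_seconds(len(encoded))
```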
Cheers
Steve.
Stephen Holmes
Localisation Development Manager
International Product Development
Voice: +353 (1) 241 5732
Fax: +353 (1) 241 5749
Novell, Inc., THE leading provider of Net business solutions
http://www.novell.com
>>> John Reid <JREID@novell.com> 04/11/02 19:02 PM >>>
Hi All,
My comments follow Mark's, between <jr>...</jr> tags.
>>> Mark Levins <mark_levins@ie.ibm.com> 4/5/02 5:59:53 AM >>>
1. <note> as a child of <count>
Currently the <count> element is very ambiguous; a note as a child element could be used to indicate what was being counted, what was considered a word etc.
<jr>The <count-group> and <count> elements can be very problematic. A <note> element within the <count> element may help in the customized support required by these elements but that is a human readable approach and probably would need to be defined even more to be truly useful. A stronger definition of the count element may do more for us.
<count> has the 'unit' attribute which has recommended values of word, page, trans-unit, bin-unit, and item. The latter three are defined according to elements within the spec but the former two must be defined by the tool creating the count. I suggest that we include the tool as an attribute to the count-group. This would be the same attribute used in <file>, <phase>, and <alt-trans>. Further refinement of the 'unit' attribute may also be necessary.</jr>
2. The <count-group>, <prop-group> and <context-group> elements can be used within a <group> without any other relevant child elements
The 1.0 specification allows that a <group> element can contain (for example) a <count-group> without containing anything to count. I think the <group> element should be changed to contain at least one of <group>, <trans-unit> or <bin-unit>.
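Until the content model is tightened, a tool could check for this itself. A minimal Python sketch, assuming un-namespaced element names as in the examples in this thread; the empty_groups helper, the sample document and its id values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Children that give a <group> something to count, per the suggestion above.
REQUIRED_CHILDREN = {"group", "trans-unit", "bin-unit"}


def empty_groups(root):
    """Return every <group> that has none of the required child elements."""
    return [
        group
        for group in root.iter("group")
        if not any(child.tag in REQUIRED_CHILDREN for child in group)
    ]


# Illustrative document: g1 holds only a <count-group>, g2 holds a <trans-unit>.
doc = ET.fromstring(
    """
<body>
<group id="g1"><count-group name="example"><count count-type="untranslated">0</count></count-group></group>
<group id="g2"><trans-unit id="1"><source>Hello</source></trans-unit></group>
</body>
""".strip()
)

offending = [group.get("id") for group in empty_groups(doc)]
```

Only g1 is reported, since it carries counts but nothing that could be counted.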
<jr>Shouldn't this requirement be placed on the <body> also?</jr>
3. Binary elements & <internal-file>
This is kind of a big one. At the moment the specification does not define the form of the content of the <internal-file> element (although there is an optional 'form' attribute). The problem I see with this is that the specification allows users to place binary data directly as content - this binary content may contain the reserved XML characters < > etc which will cause parsers to choke.
The CDATA section approach is also not good enough to provide a solution.
My suggestion is that the content of the <internal-file> be restricted to Base64 or at least stated so.
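Two ways in which CDATA falls short for arbitrary binary data: a CDATA section ends at the first "]]>" sequence, and it still may not contain characters that XML 1.0 forbids, such as most control characters. The Python sketch below demonstrates both with made-up payloads and shows that the Base64 form parses and round-trips. The parses helper and the payload bytes are illustrative; decoding the bytes as Latin-1 is just one way to get them into a text document.

```python
import base64
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def as_cdata(data):
    """Embed raw bytes (decoded as Latin-1) in a CDATA section."""
    return "<internal-file><![CDATA[" + data.decode("latin-1") + "]]></internal-file>"


def as_base64(data):
    """Embed the same bytes as Base64 text."""
    return "<internal-file>" + base64.b64encode(data).decode("ascii") + "</internal-file>"


with_control = b"\x01\x02binary"      # control characters are illegal in XML 1.0
with_terminator = b"header]]><chunk"  # contains the CDATA end marker

cdata_results = [parses(as_cdata(d)) for d in (with_control, with_terminator)]
base64_results = [parses(as_base64(d)) for d in (with_control, with_terminator)]

# The Base64 form also gives back the original bytes unchanged.
recovered = base64.b64decode(ET.fromstring(as_base64(with_terminator)).text)
```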
Also, the description in the spec for the <internal-file> element reads
"The <internal-file> element will contain the data for the skeleton file."
which is technically wrong; it may also contain data for a <bin-source> or <bin-target> element.
<jr>How does CDATA fail this purpose? I wouldn't want to restrict this to just Base64, thus requiring a conversion for both the producer and any subsequent processor that may be able to handle the original format without a problem. Additionally, wouldn't we need an attribute such as 'original-format' if we forced your conversion?</jr>
4. mime-type attribute of <bin-source>
How come this attribute is omitted from the <bin-source> element? Note
that it is an attribute of <bin-target>
<jr>We generally put attributes for <source> and <bin-source> in the parent, <trans-unit> and <bin-unit>, respectively. The 'mime-type' attribute of the target allows a different mime-type for the target in cases where it differs from that specified in the <bin-unit>. Otherwise, the mime-type of the target is unnecessary.</jr>
Cheers,
john
----------------------------------------------------------------
To subscribe or unsubscribe from this elist use the subscription
manager: <http://lists.oasis-open.org/ob/adm.pl>
----------------------------------------------------------------