Kevin, I am at a loss here. Perhaps you
can elaborate on "there
is a risk that Notes could indeed become overloaded with information"
and why adding additional attributes would help?
From: "Kevin O'Donnell" <kevinod@microsoft.com>
To: Helena S Chapman/San Jose/IBM@IBMUS, "Dr. David Filip" <David.Filip@ul.ie>
Cc: "Estreen, Fredrik" <Fredrik.Estreen@lionbridge.com>, Ryan King <ryanki@microsoft.com>, "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>
Date: 12/17/2012 12:21 PM
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
Sent by: <xliff@lists.oasis-open.org>
Ryan’s examples demonstrate
some of the many theoretical uses of the Notes element. Until now, I haven’t
seen any guidance/caution regarding how the elements should be used, and
Ryan’s suggestions were intended to provide better organization and structure
when including multiple notes.
I think the usage
of Notes is a different matter; whether non-translatable information (e.g.
instructions) is included seems to be a matter for the XLIFF creator. I
don’t think it would be possible for us to restrict what data is included
in a Note.
As we identified previously,
there is a risk that Notes could indeed become overloaded with information,
given their similarity to metadata, but realistically, this is difficult
to mitigate.
Thanks,
Kevin.
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Helena S Chapman
Sent: Monday, December 17, 2012 8:28 AM
To: Dr. David Filip
Cc: Estreen, Fredrik; Ryan King; xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
Sorry, I missed quite a bit of this discussion.
Remind me again why we are putting "non-translatable" or "non-localizable"
information for translators in the XLIFF file in your examples?
Technically speaking, id is missing in Ryan's example. Is that intentional
or a mistake? Using a date string is also not advisable. Before the XLIFF
file is transmitted, a system (any system) should be using a normalized
(UTC) Date object (e.g. the Date object in JavaScript) to pass the information
around, not a date string.
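A minimal sketch of the normalization point above, written in Python rather than JavaScript purely for illustration. It assumes the `YYYY-MM-DDTHH:MM:SSZ` form used in Ryan's example; the helper names `parse_utc_timestamp` and `format_utc_timestamp` are hypothetical and not part of any XLIFF tooling.

```python
from datetime import datetime, timezone

# Hypothetical helpers; the thread does not prescribe a format or an API.
UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def parse_utc_timestamp(text: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' string into a timezone-aware UTC datetime."""
    return datetime.strptime(text, UTC_FORMAT).replace(tzinfo=timezone.utc)

def format_utc_timestamp(moment: datetime) -> str:
    """Serialize a timezone-aware datetime as a UTC string in the same format."""
    return moment.astimezone(timezone.utc).strftime(UTC_FORMAT)

stamp = parse_utc_timestamp("2012-11-30T07:43:05Z")
print(stamp.isoformat())            # 2012-11-30T07:43:05+00:00
print(format_utc_timestamp(stamp))  # 2012-11-30T07:43:05Z
```

Inside a system the aware `datetime` object is what gets passed around; the string form would only appear at the point where the XLIFF file is written or read.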
Using XLIFF <notes> to convey "instructions" (as in the example)
is, to me, just asking for trouble. In our environment, there is really no
comprehensive way to cover a set of "build" instructions in any
XLIFF files that would describe exactly how an "input" or "final
output" can be built with a given set of XLIFF files. We
do as much of the interchange between systems as we can outside the
XLIFF file and leave the complexity outlined in the examples below to
a separate metadata interchange process that encompasses not just the files
being exchanged but potentially the files that are yet to come or have
already arrived some time ago.
We do and can take advantage of the <notes> feature in XLIFF. However,
that is limited to only information exchange specifically about that particular
translation "transaction". A full job can contain many more transactions.
I would not recommend instructions to be passed along in the XLIFF file
because that would make the XLIFF interchange unnecessarily complex.
From: "Dr. David Filip" <David.Filip@ul.ie>
To: "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>
Cc: Ryan King <ryanki@microsoft.com>, "Estreen, Fredrik" <Fredrik.Estreen@lionbridge.com>
Date: 12/16/2012 08:47 AM
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
Sent by: <xliff@lists.oasis-open.org>
Hi all, as there were no further comments, can we assume consensus and
change the spec along the lines discussed in this thread?
The summary is that note remains structurally the same, but receives new
core attributes as discussed, and will also allow for attributes from any
namespaces. It won't be extensible by foreign child elements.
Cheers
dF
Dr. David Filip
=======================
LRC CNGL LT-Web CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie
On Wed, Dec 12, 2012 at 12:00 AM, Dr. David Filip <David.Filip@ul.ie> wrote:
+1 as proposed now
On Tue, Dec 11, 2012 at 11:16 PM, Ryan King <ryanki@microsoft.com> wrote:
Thanks David, then I think we (you, Fredrik, and I) are in agreement (adding
priority is OK) and I support the extensibility of note as well. Given
that, is there anyone on the list who disagrees with this proposal?
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Tuesday, December 11, 2012 2:07 PM
To: Ryan King
Cc: Estreen, Fredrik; Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
Ryan, I support adding the core
attributes as proposed, plus perhaps the priority [1-10] from Fredrik's
example.
But I also reiterate the request
for note to be extensible. This seemed to have Fredrik's support on this
thread.
I think that note and inline markers
should be extensible as part of the generic annotations design that will
allow development of annotation modules, such as ITS mapping.
Cheers
dF
On Tue, Dec 11, 2012 at 6:24 PM, Ryan King <ryanki@microsoft.com> wrote:
Do we have consensus on this
proposal? E.g. adding category, origin, and datetime (or timestamp) attributes
to <note>?
Thanks,
ryan
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Monday, December 3, 2012 11:59 AM
To: Dr. David Filip; Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
>> On the other hand having a minimum set for interoperability for ITS unaware
>> tools sounds good.
Agreed. And as stated on another
thread, we suggest that the list of additional,
optional attributes be origin, category, and datetime.
<notes>
<note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don’t localize Windows</note>
</notes>
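To illustrate how a tool might consume these optional attributes, here is a minimal Python sketch using only the standard library. The attribute names (category, origin, datetime) come from the proposal in this thread and are not a published schema; `read_notes` is a hypothetical helper, and the second note in the sample is invented to show the "attribute absent" case.

```python
import xml.etree.ElementTree as ET

# The attribute names follow the proposal in this thread (category, origin,
# datetime); they are a proposal under discussion, not a published schema.
SAMPLE = """
<notes>
  <note category="instruction" origin="developer"
        datetime="2012-11-30T07:43:05Z">Don't localize Windows</note>
  <note>No optional attributes here</note>
</notes>
"""

def read_notes(xml_text):
    """Return one dict per <note>, with None for any absent optional attribute."""
    root = ET.fromstring(xml_text)
    return [
        {
            "text": (note.text or "").strip(),
            "category": note.get("category"),
            "origin": note.get("origin"),
            "datetime": note.get("datetime"),
        }
        for note in root.findall("note")
    ]

for entry in read_notes(SAMPLE):
    print(entry)
```

Because all three attributes are optional, a reader has to tolerate their absence, which is why the sketch returns None instead of raising.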
Thanks,
ryan
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Friday, November 30, 2012 5:00 PM
To: Estreen, Fredrik
Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
Thanks for outlining the options,
Fredrik,
I would personally be OK with note
being just extensible.
The ITS categories would allow you to
specify pretty much everything that you would need, first as an extension
that should later turn into a module using the same mechanism.
On the other hand having a minimum
set for interoperability for ITS unaware tools sounds good. And as Fredrik
pointed out, ITS notes can easily be mapped onto these, so that is not an issue from
here.
Even with the minimum set of core
attributes, I still think it should be extensible, to allow for unforeseen
types of annotations.
The only danger is of creating unnecessary
clutter if the adoption is minimal. It is hard to say what the adoption will
be.
Cheers
dF
On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:
Hi Ryan, David,
How it would look depends
on whether we add one or more standard attributes to the <note> element
or rely solely on third party extensions. First, an example of one of the
notes in your original sample and one showing a potential use of David’s
ITS mapping case.
<notes>
<note id="n1" ms:noteOrigin="developer" ms:notePriority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
<note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
</notes>
It could be argued that there
is a set of very common metadata associated with notes and that we should
provide standard attributes in these cases. I’m not sure exactly which,
if any, we should have but the ones I can immediately think of are the
kind of information in the above sample plus a date:
* origin / author – indicate the source of the note.
* priority – indicate the relative importance of a note. Must have a strict,
simple definition, for example an integer where lower is more important than higher.
* type / category – indicate what type / aspect of the data or process
the note applies to or annotates.
* date – creation or modification date. Which of these it is should be specified.
The good thing about using
standard attributes instead of extensions for common properties is of course
better interoperability for the data contained. The negative side is that
it adds complexity to the standard which is against one of the goals of
the 2.0 work. One part of that is the attempt to reduce the number of seldom
or never used constructs to get a leaner core model. A solution that has
been discussed before is to have a more complex comment / annotation module
in addition to or extending the core feature. This way we get the same
complexity in the core as we would with just third party extensions but
with the added value of a fully interoperable path for those that want
that in this area.
If we hypothetically assume
we add origin and priority to the core, the above example could look like
the one below, assuming the same mapping for ITS is used as the one proposed
for mapping to XLIFF 1.2 ('alert' => 1, 'description' => 2+) and
stored in "priority".
<notes>
<note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
<note id="n2" priority="1">Make sure to adapt date format when localizing</note>
</notes>
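The ITS-to-priority mapping mentioned above ('alert' => 1, 'description' => 2+) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and choosing exactly 2 for 'description' is an assumption, since the message only says "2+".

```python
# Mapping taken from the message above: 'alert' => 1, 'description' => 2+.
# Using exactly 2 for 'description' is an assumption; the thread only says "2+".
ITS_TO_PRIORITY = {"alert": 1, "description": 2}

def priority_from_its(loc_note_type: str) -> int:
    """Map an ITS locNoteType value to a numeric priority (lower = more important)."""
    try:
        return ITS_TO_PRIORITY[loc_note_type]
    except KeyError:
        raise ValueError(f"unknown locNoteType: {loc_note_type!r}") from None

def its_from_priority(priority: int) -> str:
    """Map a priority back to ITS: 1 is an alert, 2 or more is a description."""
    if priority < 1:
        raise ValueError("priority must be 1 or greater")
    return "alert" if priority == 1 else "description"

print(priority_from_its("alert"))  # 1
print(its_from_priority(5))        # description
```

Note that the round trip is lossy in one direction: every priority of 2 or more collapses to 'description', so a finer-grained priority cannot be recovered from the ITS value alone.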
Regarding the naming of potential
core / module attributes, I would prefer to use "category" instead of
"type", as the former does not convey the level of functional meaning
that the latter does for me. It is more 'just metadata'.
Regards,
Fredrik Estreen
From: Ryan King [mailto:ryanki@microsoft.com]
Sent: den 28 november 2012 21:55
To: Dr. David Filip; Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
David or Frederick, can you
give us an XLIFF example of how that would look?
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Wednesday, November 28, 2012 12:50 PM
To: Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals
Fredrik, all, same as Fredrik, I
think that extensibility makes sense here. I agree that the grouping mechanism
in the style of mda is not appropriate here and would change the semantics
in an undesired way.
Annotations are perfect extension
points in general, and besides we need the extensibility here for the ITS
mapping.
Cheers
dF
On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:
Hi Rodolfo, Ryan,
I think the intent of the
<notes> is lost with the current proposal. The feature is designed
so that <notes> is a container for a group of <note>s at a
specific level in the document, where each <note> is one annotation
/ comment in itself. The suggested change transforms that so that the <notes>
element becomes the entity describing one note, with <note> describing
specific pieces of metadata related to that note. The ID is intended to
be used to refer to the note from other places, such as from <mrk>
elements in the inline content, so overloading it to be the type of data
would cause additional problems.
I think the initial model
is much easier to work with and cleaner, as it contains all note-related
information in one subtree per document level where notes are allowed.
Adding attributes to the <note> element is in my opinion the best
way to go. Whether we should have more standard attributes or whether a processor
is free to use the third party namespace extension mechanism to add them
is another question. Depending on how simple we want to keep the basic
notes feature, it could be either one or a mix of the two methods.
Although I’m not a fan of
the third party extensions I think this is a case where they could make
sense. And if used for process specific metadata only I don’t see an issue.
Of course there will be no standard way to display them in a UI or report
if they are not specified in the standard.
Regards,
Fredrik Estreen
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Rodolfo M. Raya
Sent: den 28 november 2012 09:32
To: xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
Still a bad use case that
doesn’t justify ruining a good design.
Regards,
Rodolfo
--
Rodolfo M. Raya
rmraya@maxprograms.com
Maxprograms
http://www.maxprograms.com
From: Ryan King [mailto:ryanki@microsoft.com]
Sent: Wednesday, November 28, 2012 5:32 AM
To: Rodolfo M. Raya; <xliff@lists.oasis-open.org>; Yves Savourel
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
So that our original reason
for proposing having more than one <notes> at the extension point
does not get obfuscated in all of the replies and “see inlines”, here
once again, is the use case for adding more than one <notes> per
extension:
Proposal 4: Add an optional name
attribute on <notes> in core and <mds:metadata> module.
We believe it will be typical for
content providers to want to group their notes or metadata in meaningful
ways. This might be done so that a certain number of notes or bits of metadata
can be processed in the same way, or simply grouped and displayed together,
such as in an editor UI. Here are some examples:
<notes name="comments">
<note id="comment">This string cannot be longer than 100 characters</note>
<note id="origin">developer</note>
<note id="priority">1</note>
</notes>
<notes name="instructions">
<note id="instruction">Do not localize the product name</note>
<note id="origin">loc-engineer</note>
<note id="priority">2</note>
</notes>
As opposed to something less structured
and more difficult to process:
<notes>
<note id="instruction">Do not localize the product name</note>
<note id="instruction-origin">loc-engineer</note>
<note id="instructions-priority">1</note>
<note id="comment">This string cannot be longer than 100 characters</note>
<note id="comment-priority">2</note>
</notes>
Thanks,
Ryan
From: Rodolfo M. Raya [mailto:rmraya@maxprograms.com]
Sent: Tuesday, November 27, 2012 5:41 PM
To: Ryan King
Cc: Yves Savourel; <xliff@lists.oasis-open.org>
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals
Please don't ruin the design for
<notes>. Only one should be allowed per insertion point.
Regards,
Rodolfo
Sent from my iPad
On Nov 27, 2012, at 9:45 PM, "Ryan King" <ryanki@microsoft.com> wrote:
Hi Yves, in last week’s TC call it was mentioned that I should work with
the owners of the current features to get our requirements implemented
for proposals that weren’t deemed as features. I believe you are the owner
for the matches module and notes. Can you please let me know what we need
to do to move forward with getting these implemented?
· Proposal 2: Be able to specify optional custom values for match type in <mtc:matches>
· Proposal 4: Add an optional name attribute on <notes> in core (which also
means that we need to allow zero, one or more <notes> in each
position in the tree structure)
Additionally, it was deemed that we should
add Reference Language to the <mtc:matches> module. How do you want
to move forward with that? Since the module is already defined in the 2.0
spec, can I just suggest the method and if you agree, you can fold it into
the current module definition? I would propose:
1. That we allow zero, one or more <mtc:matches> at each extension point,
because you might have both recycling and reference language data.
2. Add an optional attribute reference="yes|no" with no as default. Additionally, PR for
a "reference match" would be to allow an xml:lang on the target different
from the document and allow the <source> not to be present as it
would be redundant information with the core <source>, e.g. Spanish
reference for Quechua might look like this:
<mtc:matches>
<mtc:match reference="yes">
<segment>
<target xml:lang="es-es">hola mundo</target>
</segment>
</mtc:match>
</mtc:matches>
I’m not sure if any of these require an
electronic ballot. I got the impression from the call that they don’t,
but hopefully Bryan or David or someone else from the call will correct
that if false.
Please let me know how I can work with you
on these.
Ryan
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Friday, November 16, 2012 5:02 PM
To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
Thanks Yves and David for
the valuable feedback. See our comments inline below prefixed with [Microsoft].
As David suggested on another thread, we will add these soon to the wiki.
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Thursday, November 15, 2012 5:24 PM
To: Yves Savourel
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals
Yves, Ryan et al.
Commenting inline..
Cheers
dF
On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel <ysavourel@enlaso.com> wrote:
Hi Ryan, all,
> Proposal 1: Add an optional build attribute to 2.0 <file> element in core.
> ..
> <file id="1" original="mainUI.resx" build="2011-11-23-133615307_windc.win8.beta.b01">
I don't see anything wrong with
this.
> Proposal 2: Be able to specify optional custom values for match type
> attribute in the <mtc:matches> module.
> Content providers and Localization Suppliers base their cost and billing
> models on match similarity and match types. Localization suppliers charge
> us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we
> might even want to get more granular than that as our cost and billing models
> evolve with the business.
> In 2.0, the match type doesn’t support the values exact-match and fuzzy-match,
> which were defined in the state-qualifier attribute in 1.2. Instead of supporting
> these two, or any others that may not have migrated from 1.2 to 2.0,
> as a separate attribute, the request is, that like the discussion on state
> and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type.
> This will allow us to add extra business logic to types, such as "tm" or "mt",
> which are already defined in the spec.
> <match id="1" similarity="100.0" type="tm/xlf:exact">
> <match id="1" similarity="75.0" type="tm/xlf:fuzzy">
> <match id="1" similarity="99.0" type="tm/custom:near-exact">
I understand the need for the information,
but to me, it seems the similarity gives you whether a match is exact or
not.
The example however, shows (I think) that you are thinking about categories
that could be mapped differently to the similarity depending on projects.
For example in one project a near-match corresponds to one range and in
another to a different range, and you want to simply map that info to something
common across your process, without having to carry the ranges around.
If that's the case I wonder if XLIFF should define any default like xlf:exact,
etc.
I believe there is value in decoupling
the "percentage" from the "business" type of the match.
The number means nothing unless we opt to prescribe a specific variety
of (modified) Levenshtein, and I guess we should not open this particular
can of worms.
So I wouldn't see a problem with
a sub-type there.
A side comment on the match type: especially, if we allow sub-type, I'm
still not sure about the values currently listed.
[Microsoft] we definitely
advocate decoupling the “percentage” from the “business” type of match
as David puts it. And we should not prescribe meaning to the percentage,
either. Costing models built on top of these values will necessarily change
from one provider/supplier to the next and as Yves states, possibly from
one project to the next. We could very easily have the following (and we
do in much of our recycled content):
<match id="1" similarity="100.0" type="tm/xlf:exact">
<match id="1" similarity="100.0" type="ice">
In the first case, we’ve
recycled a candidate which is 100% match, but came from a segment whose
state isn’t signed off or final yet, whereas the ice match, in our case,
has the requirement of being 100% and signed off or final.
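A small Python sketch of how a consumer could split such a type value into its base type and optional sub-type, keeping it separate from the similarity number. The `base/prefix:subtype` grammar is inferred from the examples in this thread and is only a proposal; `MatchType` and `parse_match_type` are hypothetical names.

```python
from typing import NamedTuple, Optional

class MatchType(NamedTuple):
    """A match type split into its base type and optional prefixed sub-type."""
    base: str
    prefix: Optional[str]
    subtype: Optional[str]

def parse_match_type(value: str) -> MatchType:
    """Split 'base/prefix:subtype' (the syntax used in the examples above).

    A bare value such as 'ice' has no sub-type. The syntax itself is only a
    proposal in this thread, so this grammar is an assumption.
    """
    base, slash, rest = value.partition("/")
    if not base:
        raise ValueError(f"missing base type in {value!r}")
    if not slash:
        return MatchType(base, None, None)
    prefix, colon, subtype = rest.partition(":")
    if not (colon and prefix and subtype):
        raise ValueError(f"sub-type must look like 'prefix:name' in {value!r}")
    return MatchType(base, prefix, subtype)

for raw in ("tm/xlf:exact", "tm/custom:near-exact", "ice"):
    print(raw, "->", parse_match_type(raw))
```

A costing model could then key on the parsed (base, prefix, subtype) tuple and treat similarity as an independent input, which is the decoupling argued for above.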
> Proposal 3: Add an optional Reference Language to core.
> This is a crucial feature for Microsoft and other large companies that localize
> minority languages. For example, it is typical that when we localize from
> English into Quechua, localizers are more efficient and provide much higher
> quality translation, when along with English source, we provide them with
> Spanish target. In 1.2, Reference Languages could be defined in
> an <alt-trans> element:
I see the use case and I've seen
other cases like this, with Chinese (Simplified/Traditional).
Could that be part of the match module?
Possibly with a new attribute (e.g. reference='yes|no' defaulting to no)
Adding something along with <source>/<target> is bound to cause
additional PR issues. If it's part of the Match module, it just uses whatever
the module PRs are.
I agree with Yves's reasons to have
this within the match module, which is anyway the alt-trans successor.
I guess it does not fulfill the core criteria.
[Microsoft] Adding this to
the match module would be fine as long as the proper explanatory text and
processing instructions make it clear what this data should be used for
as opposed to recycling.
> Proposal 4: Add an optional name attribute on <notes> in core
> and <mds:metadata> module.
> We believe it will be typical for content providers to want to
> ...
> <notes name="comments">
> <note id="comment">This string cannot be longer than 100 characters</note>
> <note id="user">Developer@microsoft.com</note>
> <note id="date">10/21/2012 5:28:13 PM</note>
> </notes>
Sounds reasonable. We'll have to
allow several <notes> and <m:metadata> on the extension point
(I think, but I may be wrong, that only one is allowed).
The example makes me wonder about the long term life of XLIFF though: likely
this type of info (author, timestamp) will be needed by others. Maybe a
better way to address it would be to add attributes to the note and meta
that carry the author and time stamp?
That would obviously work only if those two pieces of information are the only
examples you have in mind.
I agree with Yves that a couple
of standard attributes should be added to increase interoperability, still
I believe that note should be fully extendable, as it is part of the general
annotation mechanism and should be able to carry attributes from other
namespaces.
[Microsoft] Capturing an author
and timestamp on a comment is specific to our needs and thus that example.
However, we do see value in being able to apply an author and timestamp
on potentially any piece of data. So a module (as Yves suggests below)
that can exist at the same extension points as metadata (and including
metadata) might lend itself better to that.
> Proposal 5: Add optional change tracking attributes to <segment>.
> ...
> <segment id="1" modifiedBy="translator@loc.com"
> modifiedDate="10/21/2012 5:28:13 PM">
> <source>hello world</source>
> <target>hola món</target>
> </segment>
Here again I'm wondering if a "change
track" module may be better?
You could use it not just on segments but on other elements, such as notes.
The issue then would be how this gets updated if it's not a core component?
Actually if it's a core attribute, does it mean it's not optional?
I'm not sure there is a way, even with a PR, to guarantee these data will
be up-to-date.
But maybe that's ok?
Optional attributes in core are
tricky, IMHO. It means you do not need to introduce it yourself if you
do not feel like it. But if present, it would need to be processed by agents
who modify the segment. If it is conceivable that change agents do not update
it, it feels more like a module...
[Microsoft] Since
we are heading down the same path to MUST preserve modules as well, if
we introduce a “change track” module, then user agents would need to
preserve it if present, but as for any other processing requirements, such
as updating it, that could be specified as part of the module’s processing
requirements. For example: The module MUST be preserved and SHOULD be updated
by user agents.
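As a rough sketch of what "SHOULD be updated" could mean for an agent that edits a segment, the Python below changes the target and refreshes the two attributes from Proposal 5. Everything here is illustrative: `update_target` is a hypothetical helper, the reviewer address is made up, and writing the date as a UTC string is an assumption rather than anything agreed in this thread.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# modifiedBy / modifiedDate are the attribute names from Proposal 5 in this
# thread; whether they end up in core or in a module is still undecided.
SEGMENT = """
<segment id="1" modifiedBy="translator@loc.com" modifiedDate="10/21/2012 5:28:13 PM">
  <source>hello world</source>
  <target>hola mundo</target>
</segment>
"""

def update_target(segment, new_text, agent, now=None):
    """Change the <target> text and refresh the change-tracking attributes.

    An agent that edits the segment also records who changed it and when
    (written here as a UTC timestamp, which is an assumption).
    """
    now = now or datetime.now(timezone.utc)
    target = segment.find("target")
    if target is None:
        target = ET.SubElement(segment, "target")
    target.text = new_text
    segment.set("modifiedBy", agent)
    segment.set("modifiedDate", now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"))
    return segment

segment = ET.fromstring(SEGMENT)
update_target(segment, "hola món", "reviewer@example.com")
print(ET.tostring(segment, encoding="unicode"))
```

An agent that only preserves the data would simply write the attributes back unchanged, which is the MUST part of the suggested wording.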
cheers,
-yves