Hi all,
I am a bit unsure whether we can get all this done in such a hurry, particularly the discussions about the version integration that Yves mentions below.
Anyway, here is a preliminary proposal, for discussion, to add to XLIFF 2.1 the capability to point to internal matches:
Translation Candidate Reference Annotation
This annotation can be used to mark up content with a reference to other content which can be used as a translation proposal, but where the translation is not yet known at the time of annotation.
This annotation can reference any source spans of content that are referenceable via the XLIFF Fragment Identification mechanism.
Usage:
• The id attribute is REQUIRED
• The type attribute is REQUIRED and set to mtc:imatch
• The ref attribute is REQUIRED and points to source content which can be used as translation candidate
• The value attribute is OPTIONAL and if used represents the similarity value for the translation proposal in the range from 0.0 to 100.0
• The translate attribute is OPTIONAL
For example:
<unit id="u1">
  <segment id="s1">
    <source>He is my friend.</source>
  </segment>
</unit>
<unit id="u2">
  <segment id="s1">
    <source><mrk id="m1" type="mtc:imatch" ref="#u=u1/s1" value="100.0">He is my friend.</mrk></source>
  </segment>
</unit>
As you see, there are at least two problems:
1) Unlike in the original concept of the matches module, internal matches have to cross unit boundaries.
2) To make the referenced content of an internal match re-segmentable, it would be best to mark it with <mrk> tags, too. If a translation is added to that reference, it needs <mrk> tags, too.
The question is whether one should make it a requirement that the ref attribute always points to a <mrk> (to enable resegmentation of the referenced content without breaking the match).
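To make the discussion concrete, here is a minimal sketch of how a consuming tool might check the usage rules above and follow the ref of an internal match. It is only an illustration under stated assumptions: the function names (check_imatch, resolve_ref, internal_matches) are hypothetical, the example units are wrapped in a <file> element without a namespace so that they parse as one document, and only the "#u=<unit>/<segment>" reference form used in the example is handled, not the full Fragment Identification syntax.

```python
import re
import xml.etree.ElementTree as ET

# The example from the proposal, wrapped in a <file> element (an assumption
# made here so that the two units parse as a single document).
XLIFF_FRAGMENT = """
<file id="f1">
  <unit id="u1">
    <segment id="s1">
      <source>He is my friend.</source>
    </segment>
  </unit>
  <unit id="u2">
    <segment id="s1">
      <source><mrk id="m1" type="mtc:imatch" ref="#u=u1/s1" value="100.0">He is my friend.</mrk></source>
    </segment>
  </unit>
</file>
"""

# Assumption: only the "#u=<unit id>/<segment id>" form from the example is supported.
REF_PATTERN = re.compile(r"^#u=([^/]+)/([^/]+)$")


def check_imatch(mrk):
    """Return a list of problems for one annotation (empty if none found).

    The type attribute is not checked here because callers select
    annotations by type="mtc:imatch" before calling this function.
    """
    problems = []
    for name in ("id", "ref"):
        if name not in mrk.attrib:
            problems.append(f"missing REQUIRED attribute '{name}'")
    value = mrk.get("value")
    if value is not None:
        try:
            in_range = 0.0 <= float(value) <= 100.0
        except ValueError:
            in_range = False
        if not in_range:
            problems.append("value must be a number from 0.0 to 100.0")
    return problems


def resolve_ref(root, ref):
    """Return the source text of the segment that ref points to, or None."""
    match = REF_PATTERN.match(ref)
    if match is None:
        return None
    unit_id, segment_id = match.groups()
    for unit in root.iter("unit"):
        if unit.get("id") != unit_id:
            continue
        for segment in unit.iter("segment"):
            if segment.get("id") == segment_id:
                source = segment.find("source")
                return "".join(source.itertext()) if source is not None else None
    return None


def internal_matches(root):
    """Yield (mrk id, problems, referenced source text) for each mtc:imatch."""
    for mrk in root.iter("mrk"):
        if mrk.get("type") == "mtc:imatch":
            yield mrk.get("id"), check_imatch(mrk), resolve_ref(root, mrk.get("ref", ""))


root = ET.fromstring(XLIFF_FRAGMENT)
results = list(internal_matches(root))
print(results)
```

Note that resolve_ref has to search across units, which is exactly problem 1) above; if the ref were required to point to a <mrk>, the lookup would target the <mrk> id instead of the segment id.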
Best regards,
Joachim
From: Yves Savourel [mailto:ysavourel@enlaso.com]
Sent: Thursday, November 13, 2014 18:51
To: 'Dr. David Filip'; Schurig, Joachim
Cc: xliff@lists.oasis-open.org
Subject: RE: [xliff] internal matches
There are 5 more days until closure of the 2.1 features. So if this is to have any chance of making it, someone needs to file a proposal very soon.
Also, will we have anyone willing to implement it (before January)?
It would also bring an interesting first case for implementing (or not) backward compatibility with modules:
- Can a 2.1 document have 2.0 Translation Candidates (or both 2.0 and 2.1)?
- Would the 2.1 core schema have to include both Translation Candidates schemas?
- Etc.
A lot of questions about dealing with updated modules will need to be resolved (which is a good thing).
Cheers,
-yves
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Thursday, November 13, 2014 10:12 AM
To: Schurig, Joachim
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] internal matches
I think that this custom annotation would be a natural extension to the mtc module; it should not be too difficult to add.
Dr. David Filip
=======================
OASIS XLIFF TC Secretary, Editor, and Liaison Officer
LRC CNGL CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
http://www.cngl.ie/profile/?i=452
mailto:david.filip@ul.ie

On Thu, Nov 13, 2014 at 12:12 PM, Schurig, Joachim <Joachim.Schurig@lionbridge.com> wrote:
Hi Yves,
yes, thanks! I had been thinking about using a custom <mrk>, and it would work (by adding match quality in the value attribute), but it is custom, and hence not easily understood across tools from multiple parties.
I know that translation candidates are optional as well, but at least if they are understood they should be understood in a common manner.
So I guess we really missed an important use case for them.
Internal matches cannot be neglected, given their impact on translation cost. To support them we now need to implement either a custom annotation or a database and fuzzy matching engine in translation clients. The latter renders translation candidates in the XLIFF superfluous, as we would then need to merge them into the same database; in that case we would have been better off with TMX embedded in the XLIFF file header (because it would avoid duplicates) rather than candidates associated with single segments/units.
I do not want to dramatize, actually. I only think that we missed by a hair the chance to have a match proposal mechanism that works without a dynamic database of some kind.
Am I the only one worrying?
Best regards,
Joachim
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Yves Savourel
Sent: Monday, November 10, 2014 16:42
To: xliff@lists.oasis-open.org
Subject: RE: [xliff] internal matches
Hi Joachim,
Yes, you are correct: there is no official way to link content to other content that is the same or very similar (a duplicate or a fuzzy duplicate).
You could probably define some kind of annotation for this:
An mrk element spanning the duplicated/repeated content, with ‘ref’ pointing to the original, and possibly ‘value’ as some indicator of the type of match (exact/fuzzy). If there is a need for more info, one would have to define a module for that.
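As a rough illustration of such an annotation, the sketch below wraps the content of a <source> element in a single mrk that carries ‘ref’ and ‘value’. The helper name annotate_duplicate and the value "exact" are assumptions made for this example only, and the sketch assumes the source contains plain text without inline elements.

```python
import xml.etree.ElementTree as ET


def annotate_duplicate(source, mrk_id, ref, value=None):
    """Wrap the whole text of a <source> element in one <mrk> annotation.

    Assumption: the source holds plain text only (no inline elements).
    """
    mrk = ET.Element("mrk", {"id": mrk_id, "ref": ref})
    if value is not None:
        mrk.set("value", value)
    # Move the text from the <source> into the new <mrk> child.
    mrk.text = source.text
    source.text = None
    source.append(mrk)
    return mrk


source = ET.fromstring("<source>He is my friend.</source>")
annotate_duplicate(source, "m1", "#u=u1/s1", value="exact")
annotated = ET.tostring(source, encoding="unicode")
print(annotated)
```

Because such an mrk is a custom annotation, a receiving tool would only interpret ‘ref’ and ‘value’ this way if both parties agreed on the convention beforehand.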
Cheers,
-yves
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Schurig, Joachim
Sent: Monday, November 10, 2014 5:56 AM
To: xliff@lists.oasis-open.org
Subject: [xliff] internal matches
Dear colleagues,
I wonder if we have overlooked a use case in the translation candidates module.
As you know, with XLIFF 2.0 it is easy to include reference data in the translatable file, such as matches and glossary data. The agent working on these data does not need to perform searches or comparisons on the content or reference data, as all reference data can be linked to specific portions of the content data.
However, for reference data that only comes into existence during the modification process, I currently do not see a method to link it.
Think of content data in one segment which, after translation, is a reasonable translation candidate in another segment. This relationship is easy to detect in the XLIFF creation or enricher phase.
But because this relationship cannot be expressed properly by reference mechanisms, one still needs to include, e.g., fuzzy matching logic in the translation agent.
That is, if I did not overlook something. Did I?
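To illustrate the kind of detection meant here, the following is a minimal sketch of how an enricher could find such relationships between source segments before any translation exists. It uses Python's difflib.SequenceMatcher as a simple stand-in for a real fuzzy matching engine; the function name find_internal_matches, the segment references, and the threshold of 75.0 are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def find_internal_matches(segments, threshold=75.0):
    """Find, for each segment, the most similar earlier segment.

    segments: list of (segment reference, source text) in document order.
    Returns a list of (later ref, earlier ref, similarity), where the
    similarity is on a 0.0 to 100.0 scale and at least `threshold`.
    """
    matches = []
    for i, (ref, text) in enumerate(segments):
        best = None
        # Only earlier segments are candidates: they are translated first.
        for earlier_ref, earlier_text in segments[:i]:
            score = 100.0 * SequenceMatcher(None, earlier_text, text).ratio()
            if score >= threshold and (best is None or score > best[1]):
                best = (earlier_ref, score)
        if best is not None:
            matches.append((ref, best[0], round(best[1], 1)))
    return matches


segments = [
    ("#u=u1/s1", "He is my friend."),
    ("#u=u2/s1", "He is my friend."),
    ("#u=u3/s1", "The weather is nice today."),
]
matches = find_internal_matches(segments)
print(matches)
```

The enricher knows these relationships at creation time; what is missing is a standard way to write them into the file so that the translation agent does not have to repeat the comparison.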
Thanks,
Joachim
________________________________
Joachim Schurig
Senior Technical Director Language Technology,
Lionbridge Fellow
Lionbridge
1240 Route des Dolines
06560 Sophia Antipolis
France