OASIS XML Localisation Interchange File Format (XLIFF) TC

  • 1.  RE: [xliff] ref value in translation candidates module element

    Posted 04-06-2021 18:17
    Hello Yoshito,

    I don't know what the original goal was, but I'll offer my interpretation. The spec says that the "ref" attribute points to a span of text. That does not mean it must be a <mrk> element; I take it to mean only that it has to point to an element that contains the text being matched. If my assumption is right, there is an error in example 4.3.1.22, as you correctly pointed out, and it should say:

        <mtc:match id="tc01" ref="#seg2">

    I certainly hope the original intention was not to force enclosing <source> content in a <mrk> element, as was done with <note>. Forcing the enclosure of <source> content in <mrk> is the only way to point a <note> at a segment, because <note> lacks "ref": the annotated content has to reference the <note>, instead of the <note> referencing the annotated text.

    I think that <note> and <mtc:match> should both use the same mechanism for pointing at text in a <unit>. I would prefer leaving <source> untouched, but I consider a consistent mechanism more important.

    The <gls:glossEntry> element also uses "ref", but there is no example pointing to the whole content of a <source>. We may need to add one.

    Regards,
    Rodolfo
    --
    Rodolfo M. Raya
    Maxprograms http://www.maxprograms.com


  • 2.  RE: [xliff] ref value in translation candidates module element

    Posted 04-06-2021 19:08
    Thanks Rodolfo for your quick response.

    > The spec says that "ref" attribute points to a span of text, that does not mean it must be a <mrk> element, I take it just means that it has to point to an element that contains the text that is being matched.

    This is roughly what I initially thought, and I thought it made sense to point to the <segment> element. However, once I realized that the original intent was not to rule out a matched span of text in the <target> element, I started thinking that the spec really means an exactly specified span of text. For example:

        <?xml version="1.0" encoding="UTF-8"?>
        <xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0"
               srcLang="en" trgLang="fr" xml:space="preserve"
               xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0"
               xmlns:mtc="urn:oasis:names:tc:xliff:matches:2.0">
          <file id="f1">
            <unit id="u1">
              <mtc:matches>
                <mtc:match id="tc1" ref="#s1" similarity="65">
                  <source>Please check the output.</source>
                  <target>Veuillez vérifier la sortie.</target>
                </mtc:match>
              </mtc:matches>
              <segment id="s1">
                <source>Please check the results.</source>
                <target>Veuillez vérifier les résultats.</target>
              </segment>
            </unit>
          </file>
        </xliff>

    If we allow a <segment> to be referenced from <mtc:match> in bilingual XLIFF, it becomes ambiguous whether the match/similarity comes from the <source> value or the <target> value.

    I think a typical translation service application using XLIFF would do a translation memory lookup for each <segment>. If I were the author of the spec, and a <segment> element could be used as the referenced span of text, I would not have included the example found in 5.1.4:

        <unit id="1">
          <mtc:matches>
            <mtc:match ref="#m1">
              <source>He is my friend.</source>
              <target>Il est mon ami.</target>
            </mtc:match>
            <mtc:match ref="#m1">
              <source>He is my best friend.</source>
              <target>Il est mon meilleur ami.</target>
            </mtc:match>
          </mtc:matches>
          <segment>
            <source><mrk id="m1" type="mtc:match">He is my friend.</mrk></source>
          </segment>
          <segment>
            <source>Yet, I barely see him.</source>
          </segment>
        </unit>

    Using <mrk> allows us to annotate a substring of the source/target value. If a segment ID could be used for the reference, the segment ID would be the more natural choice here. But this example uses <mrk> to enclose the entire span of text in the <source> element.

    I also looked carefully at the terms "span of text", "spanning", etc. in the specification. I feel these terms are used specifically for text values (including inline elements). The <segment> element is a container of <source>/<target> elements, so I feel it does not specify a "span of text". If the original intention was to allow a segment ID reference, then the spec should state "points to a span of text or a container...".

    -Yoshito
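
To make the distinction above concrete, here is a minimal Python sketch that resolves the `ref` of each `<mtc:match>` inside one `<unit>` and reports whether it lands on a `<segment>` (a container holding both `<source>` and `<target>`) or on a `<mrk>` (a span inside one of them). It is not from the spec or from either poster. The helper names `local` and `describe_ref` are hypothetical, the unit combines the two examples quoted in this thread, and the lookup assumes a bare `#id` reference within the same unit, which is a simplification of the fragment identifier syntax in the XLIFF 2.0 specification.

```python
import xml.etree.ElementTree as ET

CORE = "urn:oasis:names:tc:xliff:document:2.0"
MTC = "urn:oasis:names:tc:xliff:matches:2.0"

# Illustrative unit combining the two examples from this thread.
UNIT = f"""
<unit xmlns="{CORE}" xmlns:mtc="{MTC}" id="u1">
  <mtc:matches>
    <mtc:match id="tc1" ref="#s1" similarity="65">
      <source>Please check the output.</source>
      <target>Veuillez vérifier la sortie.</target>
    </mtc:match>
    <mtc:match id="tc2" ref="#m1">
      <source>He is my friend.</source>
      <target>Il est mon ami.</target>
    </mtc:match>
  </mtc:matches>
  <segment id="s1">
    <source>Please check the results.</source>
    <target>Veuillez vérifier les résultats.</target>
  </segment>
  <segment id="s2">
    <source><mrk id="m1" type="mtc:match">He is my friend.</mrk></source>
  </segment>
</unit>
"""


def local(el):
    """Return the element name without its '{namespace}' prefix."""
    return el.tag.rsplit("}", 1)[-1]


def describe_ref(unit, ref):
    """Classify a same-unit '#id' reference (assumed, simplified syntax).

    Returns ("container", [...]) when the id belongs to a <segment>,
    ("span", [side]) when it belongs to a <mrk> inside <source> or <target>,
    and ("unresolved", []) otherwise.
    """
    target_id = ref.lstrip("#")
    for segment in unit.iter(f"{{{CORE}}}segment"):
        if segment.get("id") == target_id:
            # A segment holds source and (optionally) target: which one matched?
            return ("container", [local(child) for child in segment])
        for side in segment:
            for mrk in side.iter(f"{{{CORE}}}mrk"):
                if mrk.get("id") == target_id:
                    # A marker sits in exactly one of source/target.
                    return ("span", [local(side)])
    return ("unresolved", [])


if __name__ == "__main__":
    unit = ET.fromstring(UNIT.strip())
    for match in unit.iter(f"{{{MTC}}}match"):
        kind, sides = describe_ref(unit, match.get("ref"))
        print(match.get("id"), match.get("ref"), kind, "/".join(sides))
```

For `#s1` the sketch can only say that the reference reaches a container with both `source` and `target`, whereas for `#m1` it can name the single side that holds the marked text. That is the ambiguity described above, seen from a consuming tool's point of view.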