OASIS XML Localisation Interchange File Format (XLIFF) TC


RE: [xliff] ref value in translation candidates module element

  • 1.  RE: [xliff] ref value in translation candidates module element

    Posted 04-06-2021 19:35
    Hi Yoshito,

    I would differentiate two cases:

    1) The match applies to the whole segment
    2) The match applies to a fragment

    If the match applies to the whole segment, pointing ref to the id of the segment makes sense. Keep in mind that both <segment> and <mtc:match> are containers. The concept of a span doesn't make much sense here. As you say, we need to be more specific regarding the use of containers. The text is weak here.

    If the match applies to a fragment, then pointing to the id of an inline element of any kind makes sense. It could be a <mrk>, but it could also be something else.

    What is not clear enough is the location of the text span. Right now it could be anywhere inside the <unit>, so it might be in <target>, <ignorable> or even a <gls:glossEntry>!!! This certainly needs to be clarified.

    Regards,
    Rodolfo
    --
    Rodolfo M. Raya
    Maxprograms http://www.maxprograms.com

    -------- Original Message --------
    Subject: RE: [xliff] ref value in translation candidates module element <mtc:match>
    From: Yoshito Umaoka <yoshito_umaoka@us.ibm.com>
    Date: Tue, April 06, 2021 4:07 pm
    To: Mr. Rodolfo Raya <rmraya@maxprograms.com>
    Cc: xliff@lists.oasis-open.org

    Thanks Rodolfo for your quick response.

    > The spec says that the ref attribute points to a span of text. That does not mean it must be a <mrk> element; I take it just means that it has to point to an element that contains the text that is being matched.

    This is roughly what I initially thought, and I thought it made sense to point to the <segment> element. However, once I realized that the original intent did not exclude a matched text span being in the <target> element, I started thinking the spec really meant an exactly specified text span.
    For example:

    <?xml version="1.0" encoding="UTF-8"?>
    <xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en"
           xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0" xml:space="preserve"
           xmlns:mtc="urn:oasis:names:tc:xliff:matches:2.0" trgLang="fr">
      <file id="f1">
        <unit id="u1">
          <mtc:matches>
            <mtc:match id="tc1" ref="#s1" similarity="65">
              <source>Please check the output.</source>
              <target>Veuillez vérifier la sortie.</target>
            </mtc:match>
          </mtc:matches>
          <segment id="s1">
            <source>Please check the results.</source>
            <target>Veuillez vérifier les résultats.</target>
          </segment>
        </unit>
      </file>
    </xliff>

    If we allow <segment> to be referenced from <mtc:match> in bilingual XLIFF, it becomes ambiguous whether the match/similarity comes from the <source> value or the <target> value. I think a typical translation service application using XLIFF would do a translation memory lookup for each <segment>.

    If I were the author of the spec, and if the segment element could be used as the target span of text, I would not have used the example found in 5.1.4:

    <unit id="1">
      <mtc:matches>
        <mtc:match ref="#m1">
          <source>He is my friend.</source>
          <target>Il est mon ami.</target>
        </mtc:match>
        <mtc:match ref="#m1">
          <source>He is my best friend.</source>
          <target>Il est mon meilleur ami.</target>
        </mtc:match>
      </mtc:matches>
      <segment>
        <source><mrk id="m1" type="mtc:match">He is my friend.</mrk></source>
      </segment>
      <segment>
        <source>Yet, I barely see him.</source>
      </segment>
    </unit>

    Use of <mrk> allows us to annotate a substring of the source/target value. If a segment ID could be used for the reference, the segment ID would look like the more natural choice. But this example uses <mrk> to enclose the entire span of text in the <source> element.

    I also looked carefully at the terms "span of text", "spanning", etc. in the specification. I feel these terms are used specifically for text values (including inline elements).
    The segment element is a container of source/target elements, so I feel the segment element does not specify a span of text. If the original intention was to allow a segment ID reference, then the spec should state: points to a span of text or a container...

    -Yoshito
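
The two cases discussed above (ref pointing at a container versus ref pointing at an inline element) can be sketched in Python. This is only an illustration under stated assumptions, not the resolution procedure defined by the XLIFF specification: it assumes a ref of the form "#id" is resolved by a plain id lookup inside the enclosing <unit>, and the function name `classify_ref` and the labels "container", "inline" and "unresolved" are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample unit combining the two examples from the message above.
UNIT_XML = """
<unit xmlns="urn:oasis:names:tc:xliff:document:2.0"
      xmlns:mtc="urn:oasis:names:tc:xliff:matches:2.0" id="u1">
  <mtc:matches>
    <mtc:match id="tc1" ref="#s1" similarity="65">
      <source>Please check the output.</source>
      <target>Veuillez vérifier la sortie.</target>
    </mtc:match>
    <mtc:match id="tc2" ref="#m1">
      <source>He is my friend.</source>
      <target>Il est mon ami.</target>
    </mtc:match>
  </mtc:matches>
  <segment id="s1">
    <source>Please check the results.</source>
    <target>Veuillez vérifier les résultats.</target>
  </segment>
  <segment id="s2">
    <source><mrk id="m1" type="mtc:match">He is my friend.</mrk></source>
  </segment>
</unit>
""".strip()


def local_name(element):
    """Return the tag name without its '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def classify_ref(unit, ref):
    """Classify what a '#id' ref points at (simplified, assumed lookup).

    Returns ("container", None) for a <segment>/<ignorable>,
    ("inline", "source" or "target") for an element inside one of them,
    and ("unresolved", None) if no element carries that id.
    """
    wanted = ref.lstrip("#")
    for container in unit:
        # Only look inside <segment>/<ignorable>; skip <mtc:matches> itself.
        if local_name(container) not in ("segment", "ignorable"):
            continue
        if container.get("id") == wanted:
            return ("container", None)
        for side in container:  # <source> and, if present, <target>
            for inline in side.iter():
                if inline is not side and inline.get("id") == wanted:
                    return ("inline", local_name(side))
    return ("unresolved", None)


unit = ET.fromstring(UNIT_XML)
for ref in ("#s1", "#m1", "#missing"):
    print(ref, classify_ref(unit, ref))
```

For a container result the sketch cannot say whether the match relates to <source> or <target>, which is the ambiguity raised in the message; for an inline result the enclosing side is known.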


  • 2.  RE: [xliff] ref value in translation candidates module element

    Posted 04-06-2021 20:08
    Hi Rodolfo,

    > I would differentiate two cases:
    > 1) The match applies to the whole segment

    Yes, I understand the idea. What I'm not sure about is that, as I mentioned in my previous message, <segment> is a container of <source> and <target>. It makes sense to point to <source> alone (or <target> alone) to specify the relevant text for a match candidate. But if it points to <segment> and the segment has both <source> and <target>, then it suddenly becomes ambiguous. If we could point to <source> or <target>, it would make more sense, but <source>/<target> cannot have an "id".

    If we refine the specification, and if we really want to allow a segment to be referenced directly, then we should explain the semantics. In other words:

    <mtc:matches>
      <mtc:match id="tu1" ref="#s1" similarity="65">
        <source>Please check the output.</source>
        <target>Veuillez vérifier la sortie.</target>
      </mtc:match>
    </mtc:matches>
    <segment id="s1">
      <source>Please check the results.</source>
      <target>Veuillez vérifier les résultats.</target>
    </segment>

    The above is equivalent to

    <mtc:matches>
      <mtc:match id="tu1" ref="#m1" similarity="65">
        <source>Please check the output.</source>
        <target>Veuillez vérifier la sortie.</target>
      </mtc:match>
    </mtc:matches>
    <segment id="s1">
      <source><mrk id="m1" type="mtc:match">Please check the results.</mrk></source>
      <target>Veuillez vérifier les résultats.</target>
    </segment>

    or

    <mtc:matches>
      <mtc:match id="tu1" ref="#m1" similarity="65">
        <source>Please check the output.</source>
        <target>Veuillez vérifier la sortie.</target>
      </mtc:match>
    </mtc:matches>
    <segment id="s1">
      <source>Please check the results.</source>
      <target><mrk id="m1" type="mtc:match">Veuillez vérifier les résultats.</mrk></target>
    </segment>

    or something else.

    -Yoshito
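
The question in this message (which text a ref denotes) can be made concrete with a small Python sketch. It is an illustration only and assumes, as a simplification that the specification does not state, that "#id" is looked up by id among the segments of the unit; the helper name `possible_spans` is hypothetical. For a segment reference it returns one candidate text per side, while a <mrk> reference yields exactly one.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:2.0"}

# Variant 1: the match would reference the whole segment (ref="#s1").
WHOLE_SEGMENT = """
<unit xmlns="urn:oasis:names:tc:xliff:document:2.0" id="u1">
  <segment id="s1">
    <source>Please check the results.</source>
    <target>Veuillez vérifier les résultats.</target>
  </segment>
</unit>
""".strip()

# Variant 2: the match would reference a <mrk> in <source> (ref="#m1").
MARKED_SOURCE = """
<unit xmlns="urn:oasis:names:tc:xliff:document:2.0" id="u1">
  <segment id="s1">
    <source><mrk id="m1" type="mtc:match">Please check the results.</mrk></source>
    <target>Veuillez vérifier les résultats.</target>
  </segment>
</unit>
""".strip()


def possible_spans(unit, ref):
    """List the (side, text) pairs a '#id' ref could denote (assumed lookup)."""
    wanted = ref.lstrip("#")
    spans = []
    for segment in unit.findall("x:segment", NS):
        for side in segment:  # <source>, then <target> if present
            side_name = side.tag.rsplit("}", 1)[-1]
            if segment.get("id") == wanted:
                # A segment reference covers every side it contains.
                spans.append((side_name, "".join(side.itertext()).strip()))
                continue
            for inline in side.iter():
                if inline is not side and inline.get("id") == wanted:
                    spans.append((side_name, "".join(inline.itertext()).strip()))
    return spans


whole = ET.fromstring(WHOLE_SEGMENT)
marked = ET.fromstring(MARKED_SOURCE)
print(possible_spans(whole, "#s1"))   # two candidates: source and target
print(possible_spans(marked, "#m1"))  # one candidate: the marked source text
```

Two candidates for "#s1" versus one for "#m1" is the ambiguity described above: a segment reference needs stated semantics before it can be treated as equivalent to either <mrk> form.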