OASIS XML Localisation Interchange File Format (XLIFF) TC

  • 1.  Question on CDATA in Xliff

    Posted 04-21-2015 00:21
    Hi TC,

    I have a question regarding CDATA in XLIFF. Here is the relevant section from the spec:

    4.4 CDATA sections

    CDATA sections (<![CDATA[...]]>) are allowed in XLIFF content, but on output they MAY be changed into normal escaped content.

    Note that avoiding CDATA sections is considered a best practice from the internationalization viewpoint [XML I18N BP].

    Processing Requirements
      • Agents MUST process CDATA sections.
      • Writers MAY preserve the original CDATA sections.

    Given this example:

    <![CDATA[this is <b>my cdata</b>]]>

    What does “but on output they MAY be changed into normal escaped content” mean? Can it mean using inline tags? E.g.

    <source>no text &lt;![CDATA[this is <pc dataRefEnd="d2" dataRefStart="d1" id="tag1">my cdata</pc>]]&gt;</source>

    Or is that not recommended and it means using entities? E.g.

    <source>no text &lt;![CDATA[this is &lt;b&gt;my cdata&lt;/b&gt;]]&gt;</source>

    I’m assuming, since it says MAY, that it is also valid to not escape it at all:

    <source><![CDATA[this is <b>my cdata</b>]]></source>

    Thanks,
    Ryan


  • 2.  RE: [xliff] Question on CDATA in Xliff

    Posted 04-21-2015 02:43
    Hi Ryan, all,

    > What does "but on output they MAY be changed into normal escaped content" mean?
    > Can it mean using inline tags? E.g.
    > <source>no text &lt;![CDATA[this is <pc dataRefEnd="d2" dataRefStart="d1" id="tag1">my cdata</pc>]]&gt;</source>

    No. Not at all.

    > Or is that not recommended and it means using entities? e.g.
    > <source>no text &lt;![CDATA[this is &lt;b&gt;my cdata&lt;/b&gt;]]&gt;</source>

    Yes, but it's not a CDATA section anymore, so the CDATA marker should not be there.

    > I'm assuming since it says MAY, that it is also valid to not escape it at all:
    > <source><![CDATA[this is <b>my cdata</b>]]></source>

    Yes.

    In summary, it means you can output:

    a) like the original (preserve the CDATA notation):

    <source><![CDATA[this is <b>my cdata</b>]]></source>

    b) use the normal content notation, in which case any XML meta character (like < or &) is escaped:

    <source>this is &lt;b>my cdata&lt;/b></source>

    Or, if you also want to escape > to &gt;:

    <source>this is &lt;b&gt;my cdata&lt;/b&gt;</source>

    Those three notations result in the exact same parsed string: "this is <b>my cdata</b>", which is what you had in input.

    I hope this helps,
    -yves
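The equivalence of the three notations can be checked with any conforming XML parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` as the parser; the variable names are illustrative and not taken from the thread.

```python
import xml.etree.ElementTree as ET

# The three notations from the summary: CDATA preserved, only "<" escaped,
# and both "<" and ">" escaped.
notations = [
    "<source><![CDATA[this is <b>my cdata</b>]]></source>",
    "<source>this is &lt;b>my cdata&lt;/b></source>",
    "<source>this is &lt;b&gt;my cdata&lt;/b&gt;</source>",
]

# Parse each fragment and collect the text content the parser reports.
parsed = [ET.fromstring(fragment).text for fragment in notations]

for text in parsed:
    print(text)

# All three fragments should yield one and the same string.
print(len(set(parsed)) == 1)
```

Because the parser resolves CDATA sections and character escapes before handing text to the application, a tool reading the file generally cannot tell which notation was used.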


  • 3.  RE: [xliff] Question on CDATA in Xliff

    Posted 04-21-2015 07:32
    Thanks Yves, that answers my question. It wasn't clear to me what escaped meant in this context and now I see it means using entities AND removal of the CDATA marker.

    Ryan
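To illustrate "using entities AND removal of the CDATA marker", here is a small sketch of a writer that does not preserve CDATA sections. It assumes Python's `xml.etree.ElementTree`, which happens to serialize text as normal escaped content; the helper name `rewrite_without_cdata` is hypothetical.

```python
import xml.etree.ElementTree as ET


def rewrite_without_cdata(xml_text: str) -> str:
    """Parse a fragment and serialize it again as normal escaped content."""
    # The parser resolves the CDATA section into plain text.
    element = ET.fromstring(xml_text)
    # The serializer escapes &, < and > in text; no CDATA marker is written.
    return ET.tostring(element, encoding="unicode")


original = "<source><![CDATA[this is <b>my cdata</b>]]></source>"
rewritten = rewrite_without_cdata(original)
print(rewritten)

# Round trip: both forms carry the same parsed text.
same_text = ET.fromstring(original).text == ET.fromstring(rewritten).text
print(same_text)
```

A writer that wants to preserve the original CDATA sections instead would need a serializer that keeps track of them, which this sketch does not attempt.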


  • 4.  RE: [xliff] Question on CDATA in Xliff

    Posted 04-21-2015 19:44
    One more point of clarification. If I choose to “escape” and drop the CDATA marker:

    <source>this is &lt;b>my cdata&lt;/b></source>

    Then it is also recommended, I assume, to encode using inline codes at that point as well:

    <originalData>
      <data id="d1">&lt;b></data>
      <data id="d2">&lt;/b></data>
    </originalData>
    …
    <source>this is <pc dataRefStart="d1" dataRefEnd="d2" id="tag1">my cdata</pc></source>

    But of course, I don’t really want to mess with CDATA anyway, so I’ll just leave it as a CDATA block :) .

    Thanks again!


  • 5.  RE: [QUAR] RE: [xliff] Question on CDATA in Xliff

    Posted 04-21-2015 19:53
    Yes, ideally the <b> element would be somehow processed as inline before it gets to XLIFF.
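As a rough illustration of processing the <b> element as inline before it reaches XLIFF, the sketch below turns the parsed string into an <originalData> element plus a <source> with a <pc> inline code. It is only a sketch under stated assumptions: the function name `to_inline_codes`, the regular expression, and the fixed ids `d1`, `d2` and `tag1` are hypothetical, and it handles a single simple paired tag without attributes or nesting. A real extraction filter would need a proper parser for the embedded format.

```python
import re
import xml.etree.ElementTree as ET

# Matches one simple paired tag such as <b>...</b> (no attributes, no nesting).
PAIRED_TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def to_inline_codes(text: str) -> tuple[ET.Element, ET.Element]:
    """Return (<originalData>, <source>) for text with at most one paired tag."""
    original_data = ET.Element("originalData")
    source = ET.Element("source")
    match = PAIRED_TAG.search(text)
    if match is None:
        # Nothing to convert: the text goes into <source> unchanged.
        source.text = text
        return original_data, source
    name, inner = match.group(1), match.group(2)
    # Store the native codes; the serializer escapes them on output.
    ET.SubElement(original_data, "data", id="d1").text = f"<{name}>"
    ET.SubElement(original_data, "data", id="d2").text = f"</{name}>"
    # Text before the tag, the <pc> span, then any text after the tag.
    source.text = text[:match.start()]
    pc = ET.SubElement(source, "pc", id="tag1", dataRefStart="d1", dataRefEnd="d2")
    pc.text = inner
    pc.tail = text[match.end():]
    return original_data, source


original_data, source = to_inline_codes("this is <b>my cdata</b>")
print(ET.tostring(original_data, encoding="unicode"))
print(ET.tostring(source, encoding="unicode"))
```

In an actual XLIFF 2.0 file, <originalData> belongs to the enclosing <unit> and <source> sits inside a <segment>; the sketch builds the two elements in isolation only to mirror the example in the thread.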