XLIFF Inline Markup SC

  • 1.  Various XML constructs in inline content

    Posted 07-26-2011 20:27
    Hi all,

    Another aspect we have to address is the various XML constructs one can find in mixed content (comments, processing instructions, CDATA sections) and what the processing expectations are for them. Here is my take:

    -- Comments

    XML comments must be stripped out. An XLIFF processor must strip XML comments (<!--...-->) found inside content. Any comment that needs to survive processing must be set using the XLIFF notes mechanism.

    -- Processing instructions

    XML processing instructions must be stripped out. An XLIFF processor must strip XML processing instructions (<?...?>) found inside content. Tool-specific data should be set using XLIFF's extension mechanism.

    -- CDATA sections

    CDATA sections (<![CDATA[...]]>) are allowed, but on output the processing agents may change them into normal escaped content. An XLIFF processor must correctly read CDATA sections inside inline content, but it is not required to preserve the notation when writing out the document.

    Any thoughts?
    -yves
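
The three rules proposed above can be sketched in a few lines. This is only an illustration built on Python's standard `xml.dom.minidom`; the function name `normalize_inline` and the sample `<source>` content are assumptions made for the example, not part of any XLIFF specification or tool.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def normalize_inline(xml_text: str) -> str:
    """Drop comments and PIs, and turn CDATA sections into escaped text."""
    doc = minidom.parseString(xml_text)
    _clean(doc.documentElement, doc)
    return doc.documentElement.toxml()


def _clean(node, doc):
    # Iterate over a copy because children are removed or replaced in place.
    for child in list(node.childNodes):
        if child.nodeType in (Node.COMMENT_NODE,
                              Node.PROCESSING_INSTRUCTION_NODE):
            node.removeChild(child)
        elif child.nodeType == Node.CDATA_SECTION_NODE:
            # Keep the characters, drop the CDATA notation; the serializer
            # escapes the resulting text node as ordinary content.
            node.replaceChild(doc.createTextNode(child.data), child)
        elif child.nodeType == Node.ELEMENT_NODE:
            _clean(child, doc)


if __name__ == "__main__":
    sample = ("<source>Click <!-- note --><?tool flag?>"
              "<![CDATA[a < b]]> now</source>")
    print(normalize_inline(sample))
```

The CDATA content is read correctly and survives as text; only its notation changes on output, which is the behavior the proposal allows.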


  • 2.  Re: [xliff-inline] Various XML constructs in inline content

    Posted 08-01-2011 14:30
    With respect to processing instructions: it seems like just stripping them out could have a negative effect when the translated text is integrated with the skeleton file. It seems more reasonable that XLIFF should require that processing instructions be converted to standard XLIFF inline elements so that they can be preserved during the translation process.

    David
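
As a rough illustration of converting processing instructions to inline elements on extraction, the sketch below wraps each PI of a native paragraph in a placeholder element and restores it on merge. It assumes an XLIFF 1.2-style `<ph id="...">` placeholder holding the escaped native code; the function names, the `<p>` element, and the `page-break` PI are hypothetical, and nested elements are ignored to keep the example short.

```python
from xml.dom import minidom
from xml.dom.minidom import Node
from xml.sax.saxutils import escape


def extract_paragraph(native_xml: str) -> str:
    """Build a <source> element from a native element, wrapping PIs in <ph>."""
    native = minidom.parseString(native_xml).documentElement
    out = minidom.Document()
    source = out.createElement("source")
    next_id = 1
    for child in native.childNodes:
        if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            ph = out.createElement("ph")
            ph.setAttribute("id", str(next_id))
            next_id += 1
            # Store the PI as native code; it is escaped when serialized.
            ph.appendChild(out.createTextNode(child.toxml()))
            source.appendChild(ph)
        elif child.nodeType == Node.TEXT_NODE:
            source.appendChild(out.createTextNode(child.data))
        # Other node types (elements, comments) are skipped in this sketch.
    return source.toxml()


def merge_paragraph(source_xml: str, tag: str) -> str:
    """Rebuild the native element, restoring the PIs stored in <ph>."""
    source = minidom.parseString(source_xml).documentElement
    parts = []
    for child in source.childNodes:
        if child.nodeType == Node.ELEMENT_NODE and child.tagName == "ph":
            parts.append(child.firstChild.data)  # native code, as stored
        elif child.nodeType == Node.TEXT_NODE:
            parts.append(escape(child.data))
    return "<%s>%s</%s>" % (tag, "".join(parts), tag)


if __name__ == "__main__":
    native = "<p>First page.<?page-break soft?>Second page.</p>"
    xliff_source = extract_paragraph(native)
    print(xliff_source)
    print(merge_paragraph(xliff_source, "p"))
```

Because the PI travels inside the extracted content, a translator can move the placeholder with the text and the merge step still has everything it needs.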


  • 3.  RE: [xliff-inline] Various XML constructs in inline content

    Posted 08-02-2011 05:56
    Hi David, all,

    > With respect to processing instructions. It seems like just
    > stripping them out could have a negative effect when
    > the translated text is integrated with the skeleton file.
    > It seems more reasonable that XLIFF should require that
    > processing instructions should be converted to standard
    > XLIFF inline elements so that processing instructions
    > can be preserved during the translation process.

    Maybe we are talking about two different things here.

    I think you may be saying: if an original XML file has processing instructions, then when extracting the file to XLIFF, those PIs should be converted to inline elements. (I agree: that makes sense. The same possibly goes for comments.)

    I was assuming that the conversion above was always done, and I was talking about PIs that would be generated during the XLIFF output. In other words, PIs that are not from the original extracted file, but some type of tool-specific inline information that a tool would rely on to merge things back. Such PIs shouldn't exist, but if an XLIFF reader finds one, it should strip it out (and possibly even generate a warning or error?).

    Is that what you have in mind too?
    -yves


  • 4.  RE: [xliff-inline] Various XML constructs in inline content

    Posted 08-02-2011 12:41
    I'm glad we agree on how to handle processing instructions in the original source files.

    For PIs generated during the XLIFF file creation process, wouldn't an XLIFF reader have to support preserving the PIs, since PIs are a standard XML item? The XML Specification states (http://www.w3.org/TR/REC-xml/#sec-pi):

    "PIs are not part of the document's character data, but MUST be passed through to the application."

    David
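
To make "passed through to the application" concrete, here is a small sketch using Python's standard `xml.parsers.expat`. The parser reports each PI through a callback; what the receiving code then does with it is a separate decision. The helper name `collect_pis` and the sample content are assumptions for illustration only.

```python
import xml.parsers.expat


def collect_pis(xml_text: str):
    """Return the (target, data) pairs the parser reports for xml_text."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    # expat calls this handler once per processing instruction it parses.
    parser.ProcessingInstructionHandler = (
        lambda target, data: seen.append((target, data)))
    parser.Parse(xml_text, True)
    return seen


if __name__ == "__main__":
    sample = "<target>My <?myFlagStart?>translated<?myFlagEnd?> text</target>"
    for target, _data in collect_pis(sample):
        print("parser reported PI:", target)
```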


  • 5.  RE: [xliff-inline] Various XML constructs in inline content

    Posted 08-02-2011 13:27
    > For PIs generated during the XLIFF file creation
    > process, wouldn't an XLIFF reader have to support
    > preserving the PIs, since PIs are a standard XML item?
    > The XML Specification states ( http://www.w3.org/TR/REC-xml/#sec-pi ):
    > "PIs are not part of the document's character data, but
    > MUST be passed through to the application."

    The XML parser must pass them through to the application, which then does what it wants with them. I was assuming an XLIFF reader is already an application.

    The problem with preserving PIs in inline content is that a tool would have to keep track of them. But I suppose they could be passed along as some kind of annotation, just like a marker for a term, or a comment, for instance.

    What I'm a bit concerned about is that tools might start using PIs for placing extensions. But I suppose also that, once processed, there are not many differences between an extension in a general-purpose inline marker and a PI. For example, the mapping for the two notations below would probably be very similar.

    <target>My <mrk myNS:info='flag'>translated</mrk> text</target>

    <target>My <?myFlagStart?>translated<?myFlagEnd?> text</target>

    Hmmm... I guess this needs more thinking.
    -ys
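
The point that both notations would map to much the same thing once processed can be sketched as follows. This is an illustration only: the internal model (plain text plus `(start, end, label)` spans), the `PI_MARKERS` table, and the namespace URI `urn:example:myNS` are assumptions. The namespace declaration is added because a namespace-aware parser rejects an undeclared `myNS:` prefix.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

MY_NS = "urn:example:myNS"  # hypothetical URI for the myNS prefix
# Hypothetical tool-specific table: PI target -> (label, open/close).
PI_MARKERS = {"myFlagStart": ("flag", "open"), "myFlagEnd": ("flag", "close")}


def to_annotated_text(target_xml: str):
    """Return (plain_text, spans); each span is (start, end, label)."""
    root = minidom.parseString(target_xml).documentElement
    text, spans, pending = "", [], {}
    for child in root.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            text += child.data
        elif child.nodeType == Node.ELEMENT_NODE and child.tagName == "mrk":
            start = len(text)
            text += "".join(n.data for n in child.childNodes
                            if n.nodeType == Node.TEXT_NODE)
            spans.append((start, len(text),
                          child.getAttributeNS(MY_NS, "info")))
        elif (child.nodeType == Node.PROCESSING_INSTRUCTION_NODE
              and child.target in PI_MARKERS):
            label, kind = PI_MARKERS[child.target]
            if kind == "open":
                pending[label] = len(text)
            else:
                # A close marker without a matching open raises KeyError.
                spans.append((pending.pop(label), len(text), label))
    return text, spans


if __name__ == "__main__":
    with_mrk = ('<target xmlns:myNS="urn:example:myNS">My '
                '<mrk myNS:info="flag">translated</mrk> text</target>')
    with_pis = ("<target>My <?myFlagStart?>translated"
                "<?myFlagEnd?> text</target>")
    print(to_annotated_text(with_mrk))
    print(to_annotated_text(with_pis))
```

One difference the sketch makes visible: the `mrk` form is self-describing, while the PI form only yields a span because the reader already knows the tool-specific marker names.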


  • 6.  RE: [xliff-inline] Various XML constructs in inline content

    Posted 08-04-2011 06:32
    Hi,

    I wonder if we need to split the discussion between a basic requirement and a more advanced one:

    1. Basic: Whenever the native format of an XLIFF file is XML, XLIFF-related processes have to ensure that the original XML (including any processing instructions and XML comments) can be reconstructed. Any information that is needed for this reconstruction has to be part of the XLIFF file (as opposed, for example, to being part of an external skeleton file).
    2. Advanced: The XLIFF representation for processing instructions and XML comments is the one for “generic inline markup”.

    Cheers,
    Christian


  • 7.  RE: [xliff-inline] Various XML constructs in inline content

    Posted 08-04-2011 07:50
    Hi Christian, all,

    > 1. Basic:
    > Whenever the native format of an XLIFF file is XML, XLIFF-related processes
    > have to ensure that the original XML (including any processing instructions
    > and XML comments) can be reconstructed. Any information that is needed for
    > this reconstruction has to be part of the XLIFF file (as opposed for
    > example to being part of an external skeleton file).

    I'm not sure I agree with this formulation. Let's fight :)

    --A) It's not the role of the XLIFF specification to decide what a filter needs to preserve in the original format. That's for each tool to decide. E.g. some may be fine with stripping out comments while others may want to keep them. The role of the XLIFF format is just to provide a (well-defined) way to store "whatever info" the filter wants to pass through, not to decide what the information should be. This would be more the role of one of those representation guides we have, no?

    --B) Why should we have a special case when the original format is XML? Here again it seems this XML-specific guidance should be part of a representation guide, not part of the specification.

    --C) I don't think all the information needed for the reconstruction has to be part of the XLIFF file. We actually have a requirement that implies it does not have to: "1.12.1. to store only the XLIFF representation, discarding the native data"

    > 2. Advanced:
    > The XLIFF representation for processing instructions and XML comments
    > is the one for “generic inline markup”

    Not sure I understand this. Does this mean "to represent XML PIs or XML comments, you must use the representation described in the generic inline markup"? I don't think PIs and comments outside extracted content should be covered by the inline markup. For example, we could have a PI somewhere in the structural markup of the original XML. That one should be handled by the "normal skeleton".

    Cheers,
    -yves


  • 8.  RE: [xliff-inline] Various XML constructs in inline content

    Posted 08-05-2011 10:25
    Hi Yves,

    See comments below.

    Cheers,
    Christian




  • 9.  RE: [xliff-inline] Various XML constructs in inline content

    Posted 08-09-2011 03:48
    Hi Christian,

    >> --B) Why should we have a special case when the original format
    >> is XML? Here again it seems this XML-specific guidance should
    >> be part of a representation guide, not part of the specification.
    >
    > CL> I agree. There should be no special case. I wonder, however,
    > if we should turn this observation into a general rule for XLIFF:
    > XLIFF and related processes should be able to allow roundtripping
    > of any kind of native data (example: for XML as native format,
    > processing instructions, comments etc. should survive conversions).

    Sounds good.

    >> ...Is this means "to represent XML PIs or XML comments, you must
    >> use the representation described in the generic inline markup?
    >> I don't think PIs and comments outside extracted content should
    >> be covered by the inline markup. For example, we could have a
    >> PI somewhere in the structural markup of the original XML.
    >> That one should be handled by the "normal skeleton".
    >
    > CL> Your paraphrase is correct. However, I understand your point
    > that it might be good to have two different ways of representing
    > PIs or XML comments. I wonder, however, which mechanisms a
    > filter would need to implement if it wants to distinguish the
    > two occurrence types. It first would need to understand what
    > the "structural markup" is.

    I should use the proper term based on our definitions: by "structural" I meant "block-level" ( http://wiki.oasis-open.org/xliff/OneContentModel#Definitions.2BAC8-Terminology ).

    I would say: if the PIs or comments occur within block-level markup, they should be treated like other block-level markup, and if they occur within inline-level markup, they should be represented like other inline-level markup.

    -ys
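
One way a filter might tell the two occurrence types apart is to check whether a PI or comment sits inside an element the filter extracts as translatable content. The sketch below assumes the filter is configured with a set of such element names (`TEXT_UNITS`, hypothetical here, as are the sample document and the `classify` helper); anything found outside them is treated as block-level and left to the skeleton.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical filter configuration: elements extracted as text units.
TEXT_UNITS = {"p", "title"}


def classify(xml_text: str):
    """Return (name, level) pairs for each PI and comment, in document order."""
    doc = minidom.parseString(xml_text)
    results = []

    def walk(node, inside_unit):
        for child in node.childNodes:
            level = "inline" if inside_unit else "block"
            if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
                results.append((child.target, level))
            elif child.nodeType == Node.COMMENT_NODE:
                results.append(("#comment", level))
            elif child.nodeType == Node.ELEMENT_NODE:
                # Everything below an extracted element counts as inline.
                walk(child, inside_unit or child.tagName in TEXT_UNITS)

    walk(doc, False)
    return results


if __name__ == "__main__":
    sample = ("<doc><?layout two-column?>"
              "<p>Text <?index term?>here<!-- check --></p></doc>")
    for name, level in classify(sample):
        print(name, "->", level)
```

Under this rule the filter does not need a separate notion of "structural markup": the distinction follows from what it already extracts.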