OASIS XML Localisation Interchange File Format (XLIFF) TC

  • 1.  RE: [xliff] Segmentation Modifications

    Posted 12-12-2013 13:51
    For reference, the bidi text I'm talking about is this one:

    [[
    If the dir attributes of the <source> or <target> elements differ: The content of the <source> or <target> elements set to a different directionality than the directionality for the <source> or <target> elements of the joined segment MUST be enclosed between Unicode bi-directional control characters reflecting their original directionality (U+202A and U+202C for left-to-right spans, and U+202B and U+202C for right-to-left spans).
    ]]

    From the attached file in this post:
    https://lists.oasis-open.org/archives/xliff/201311/msg00176.html

    The question is basically: are those Unicode control characters the ones to use for this mapping?

    I based the text on this article:
    http://www.w3.org/International/questions/qa-bidi-controls

    Thanks,
    -yves


    From: Yves Savourel [mailto:ysavourel@enlaso.com]
    Sent: Thursday, December 12, 2013 6:04 AM
    To: 'xliff@lists.oasis-open.org'
    Subject: RE: [xliff] Segmentation Modifications

    Hi David,

    I can do the change; that will free your time for other ones.

    Did you double-check the bidi mapping?
    I'm not an expert on bidi, so it'd be good to have more than my input on that part.

    Cheers,
    -yves

    From: Dr. David Filip [mailto:David.Filip@ul.ie]
    Sent: Thursday, December 12, 2013 5:48 AM
    To: Yves Savourel
    Cc: xliff@lists.oasis-open.org
    Subject: Re: [xliff] Segmentation Modifications

    Yves, all,

    I did not hear any dissent on that.

    As far as I checked this, your proposal is equivalent to what was there for csprd02, with two small exceptions that add to clarity:

    1) You use an explicit bidi provision, so that people do not need to research the Unicode BiDi algorithm for merging segments with different dir

    2) You also proposed to have an option to downgrade state on split segments, which makes sense to me

    Otherwise it is just reorganizing the PRs by the performed type of modification, which seems fine, and I do not have a preference regarding the presentation of the provisions.

    @Yves, do you want to implement this proposal in the spec or should I?
    Please let me know.

    Thanks
    dF

    Dr. David Filip
    =======================
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland
    telephone: +353-6120-2781
    cellphone: +353-86-0222-158
    facsimile: +353-6120-2734
    http://www.cngl.ie/profile/?i=452
    mailto: david.filip@ul.ie

    On Sat, Nov 30, 2013 at 1:56 PM, Yves Savourel <ysavourel@enlaso.com> wrote:
    Hi all,

    As mentioned here: https://lists.oasis-open.org/archives/xliff/201311/msg00138.html , I've been trying to implement segmentation modification for XLIFF 2.0 for a while now and I have a few comments.

    For reference, the cs02 section for this is here:
    http://docs.oasis-open.org/xliff/xliff-core/v2.0/csprd02/xliff-core-v2.0-csprd02.html#d0e9317

    --- The section (starting with its new title) keeps talking about "segmentation modification" and "resegmentation". Could we just talk about segmentation modification everywhere? The two things are the same thing.

    --- That section has many constraints and processing requirements.
    It was quite difficult to follow when I tried to implement it.

    For example: (take a deep breath) "Modifiers MUST copy all attributes including values, except for the id and order attributes, from their original instances on or within the original <segment> element onto both instances on and within the resulting two <segment> or <ignorable> elements, except for attributes that do not have valid instances on the eventually resulting <ignorable> element."

    To make a long story short and get to the point, I think that section should be re-worded to be simpler, organized by action (split or join), and completed with a few things (some subState PRs, explicit directionality conversion, etc.)

    The proposed modified text is in the attached document.

    I believe it covers what is needed, but it's a complex set of PRs and it should be carefully checked by all. For example, I'd like a confirmation on the Unicode control characters used for the directionality conversion.

    Thanks,
    -yves

    ---------------------------------------------------------------------
    To unsubscribe from this mail list, you must leave the OASIS TC that
    generates this mail.  Follow this link to all your TCs in OASIS at:
    https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
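
    The bidi provision quoted at the top of this message can be illustrated with a minimal Python sketch. It is not taken from the attached proposal: the function names `wrap_for_join` and `join_contents` and the `(text, dir)` pair representation are hypothetical, and rejecting any value other than "ltr" or "rtl" is an assumption, since the quoted text only covers those two cases.

```python
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def wrap_for_join(text: str, own_dir: str, joined_dir: str) -> str:
    """Enclose text in embedding controls when its direction differs
    from the direction of the joined segment (hypothetical helper)."""
    if own_dir == joined_dir:
        return text  # same direction: nothing to preserve
    if own_dir == "ltr":
        return LRE + text + PDF
    if own_dir == "rtl":
        return RLE + text + PDF
    # Assumption: other values (e.g. "auto") are outside the quoted rule.
    raise ValueError(f"unsupported dir value: {own_dir!r}")


def join_contents(parts, joined_dir: str) -> str:
    """Join (text, dir) pairs into one string for a segment whose
    <source> or <target> uses joined_dir."""
    return "".join(wrap_for_join(text, d, joined_dir) for text, d in parts)


if __name__ == "__main__":
    joined = join_contents([("abc ", "ltr"), ("\u05d0\u05d1", "rtl")], "ltr")
    # Print code points so the invisible controls are visible.
    print([f"U+{ord(ch):04X}" for ch in joined])
```

    Only the part whose direction differs from the joined segment is wrapped, which matches the "reflecting their original directionality" wording of the quoted text.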


  • 2.  RE: [xliff] Segmentation Modifications

    Posted 12-12-2013 18:09

    Jumping in here...
    Please note that Unicode 6.3 adds directional isolate characters, which could be useful for joining segments.

    See: http://www.unicode.org/reports/tr9/#Directional_Formatting_Characters

    "Directional isolate characters were introduced in Unicode 6.3 after it became apparent that directional embeddings usually have too strong an effect on their surroundings and are thus unnecessarily difficult to use. The new characters were introduced instead of changing the behavior of the existing ones because doing so might have had an undesirable effect on those existing documents that do rely on the old behavior. Nevertheless, the use of the directional isolates instead of embeddings is encouraged in new documents – once target platforms are known to support them."

    -s







  • 3.  RE: [xliff] Segmentation Modifications

    Posted 12-12-2013 18:42
    Thanks Steven,

    Exactly the type of feedback I was looking for.

    So we should do RLI+PDI and LRI+PDI instead of RLE+PDF and LRE+PDF, I suppose?

    -ys
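
    The substitution asked about here (RLI+PDI and LRI+PDI in place of RLE+PDF and LRE+PDF) can be sketched as a small lookup table in Python. The code points are the ones Unicode assigns to these controls; the table name `CONTROLS`, the function `wrap`, and the `style` parameter are hypothetical and only serve to show the two options side by side. Whether a given rendering platform supports the isolates is not something this sketch checks.

```python
# Opening and closing controls per style and direction.
CONTROLS = {
    "embedding": {
        "ltr": ("\u202A", "\u202C"),  # LRE ... PDF
        "rtl": ("\u202B", "\u202C"),  # RLE ... PDF
    },
    "isolate": {
        "ltr": ("\u2066", "\u2069"),  # LRI ... PDI
        "rtl": ("\u2067", "\u2069"),  # RLI ... PDI
    },
}


def wrap(text: str, direction: str, style: str = "isolate") -> str:
    """Enclose text in the control pair for the given direction and style."""
    opener, closer = CONTROLS[style][direction]
    return f"{opener}{text}{closer}"


if __name__ == "__main__":
    for style in ("embedding", "isolate"):
        wrapped = wrap("\u05d0\u05d1", "rtl", style)
        # Print code points so the invisible controls are visible.
        print(style, [f"U+{ord(ch):04X}" for ch in wrapped])
```

    Keeping the pairs in one table means a tool could switch between the two styles with a single setting once target platforms are known to support isolates.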


  • 4.  Re: [xliff] Segmentation Modifications

    Posted 12-13-2013 14:36
    Steven, Yves, Fredrik, all,

    I was looking up the bidi algorithm UAX#9, and I am not sure if we should be using the explicit directionality control characters. UAX#9 itself quotes UTR#20, http://www.w3.org/TR/unicode-xml/ , which discourages the use of control characters in a markup environment.

    I know that XLIFF 1.2 did not have anything else, but why not have a full markup solution this time round?

    I wonder if we should rather use directionality annotations based on markers, or dedicated directionality elements.

    Another related issue is that both Unicode 6.3 and HTML 5 now allow for heuristic determination of the directionality by the first strong character, and there might be cases where this cannot be resolved into an explicit directionality because of variables.

    So whether we use control characters, or go for marker-based directionality markup, or even for dedicated directionality elements similar to HTML bdi, we should have a value equivalent to FSI and bdi="auto".

    Rgds
    dF


  • 5.  RE: [xliff] Segmentation Modifications

    Posted 12-13-2013 16:17
    Hi David, all,

    We discussed the use or non-use of directionality characters quite a bit, and the consensus at that time was to allow them but provide enough markup support to make it less common that you need to use them. Currently you can specify directionality of spans using the <pc> and <sc> inline elements, and the directionality of the native code referenced by inline elements using dir on the <data> element.

    I’m not sure outlawing the use of control characters and adding explicit inline markup to replace them is appropriate at this time. Adding directionality on <mrk> I do not like. In my mind <mrk> is for metadata transfer and annotation, while explicit directionality is something that must be handled in forward or backward conversion. Unlike <cp> (a few special codes that most of the time need to be preserved) or the other inline elements, the explicit tagging for directionality would need to be something that is added and removed during translation. Most likely, to not complicate tagging using <pc> and <mrk> and content distribution over segments, we would need to make them self-closing elements that just represent a single Unicode directional control character anyway.

    As to allowing “auto” instead of “ltr” or “rtl” and making it a default, I think this could be a good idea. UAX#9 already supports searching the text of a paragraph for the first strong (non-embedded) character and using that as the initial directionality of the paragraph. I have not looked in detail at how that differs from what HTML5 or Unicode 6.3 specifies, but it seems very similar. For the XLIFF case we should consider the <unit> as being the paragraph. The explicit “ltr” and “rtl” on <file>, <group> and <unit> would be what UAX#9 refers to as the higher-level protocol that is allowed to override the search for base directionality.

    Regards,
    Fredrik Estreen
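
    The "first strong character" search and the explicit override described above can be sketched in Python with the standard `unicodedata` module. This is a simplified illustration, not a full UAX#9 implementation: the function names are hypothetical, treating the text of a <unit> as the paragraph follows the suggestion in this message rather than any agreed specification text, and falling back to "ltr" when no strong character is found mirrors the UAX#9 default.

```python
import unicodedata

ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}


def first_strong_direction(text: str, default: str = "ltr") -> str:
    """Return "ltr" or "rtl" from the first strong character,
    skipping characters inside isolate initiator ... PDI runs."""
    depth = 0  # nesting depth of isolates we are currently inside
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ISOLATE_INITIATORS:
            depth += 1
        elif bidi_class == "PDI":
            if depth > 0:
                depth -= 1
        elif depth == 0:
            if bidi_class == "L":
                return "ltr"
            if bidi_class in ("R", "AL"):
                return "rtl"
    return default  # no strong character found


def resolve_direction(text: str, declared: str = "auto") -> str:
    """An explicit "ltr"/"rtl" (the higher-level protocol) wins;
    "auto" falls back to the first-strong search."""
    if declared in ("ltr", "rtl"):
        return declared
    return first_strong_direction(text)


if __name__ == "__main__":
    print(resolve_direction("123 \u05d0\u05d1 abc"))  # digits and spaces are not strong
    print(resolve_direction("\u05d0\u05d1", declared="ltr"))
```

    A real tool would also need to decide how inline codes and variables inside the unit take part in the search, which is the open case raised earlier in the thread.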


  • 6.  RE: [xliff] Segmentation Modifications

    Posted 12-15-2013 15:23
    >> @Yves, Do you want to implement this proposal in the
    >> spec or should I?
    >
    > I can do the change, that will free you time for other ones.

    I've updated the source XML files for the Segmentation Modification section. The text will be in the next PDF/HTML output David generates. (Meanwhile if you like reading un-formatted XML with a lot of codes the text is here: https://tools.oasis-open.org/version-control/browse/wsvn/xliff/trunk/xliff-20/segmentation/segmentation.xml )

    I've put a "TBD" flag in the bi-directionality mapping paragraph since this is under discussion.

    Cheers,
    -ys


  • 7.  RE: [xliff] Segmentation Modifications

    Posted 12-13-2013 16:56
    > I know that XLIFF 1.2 did not have anything else,
    > but why not have a full markup solution this time round..

    Because, despite repeated notes about this, the discussion attracted close to no interest. Even our calls to the W3C to provide guidelines went pretty much un-answered.

    If I recall correctly every single tool implementer noted that their current tools were using Unicode control characters and didn't see the need for anything else.

    So the decision was to provide directionality at the unit level and above, and use control characters for the content.

    I doubt the interest has changed.

    Cheers,
    -yves


  • 8.  RE: [xliff] Segmentation Modifications

    Posted 12-13-2013 19:20
    W3C and Unicode's recommendations are to use markup/attributes rather than control characters, where possible. The new controls are just for where it may not be possible. See http://dev.w3.org/csswg/css-writing-modes/#unicode-bidi

    Rather than just providing directionality, we may want to provide isolate controls at the attribute level in XLIFF.

    -s

    Yves Savourel ---13/12/2013 08:55:42---> I know that XLIFF 1.2 did not have anything else, > but why not have a full markup solution this t

    From: Yves Savourel <ysavourel@enlaso.com>
    To: <xliff@lists.oasis-open.org>
    Date: 13/12/2013 08:55
    Subject: RE: [xliff] Segmentation Modifications
    Sent by: <xliff@lists.oasis-open.org>

    > I know that XLIFF 1.2 did not have anything else,
    > but why not have a full markup solution this time round..

    Because, despite repeated notes about this, the discussion attracted close to no interest. Even our calls to the W3C to provide guidelines went pretty much un-answered.

    If I recall correctly every single tool implementer noted that their current tools were using Unicode control characters and didn't see the need for anything else.

    So the decision was to provide directionality at the unit level and above, and use control characters for the content.

    I doubt the interest has changed.

    Cheers,
    -yves


  • 9.  Re: [xliff] Segmentation Modifications

    Posted 12-13-2013 19:20

    David,
     Yes, I think that's the point, control chars are discouraged in markup. What I meant to say is that the concepts should be there:
      " we should have a value equivalent to FSI and bdi="auto" "

    I sent a note on to Richard Ishida of the W3C and Aharon Lanin, who worked on the new bidi characters in Unicode, to try to get some feedback on this topic for XLIFF. I will let you know if I get something back.

    -s

    "Dr. David Filip" ---13/12/2013 06:36:24---Stephen, Yves, Fredrik, all, I was looking up the bidi algorithm UAX#9, and I am not sure if we shou

    From: "Dr. David Filip" <David.Filip@ul.ie>
    To: Yves Savourel <ysavourel@enlaso.com>
    Cc: Steven R Loomis/Cupertino/IBM@IBMUS, "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>
    Date: 13/12/2013 06:36
    Subject: Re: [xliff] Segmentation Modifications



    Steven, Yves, Fredrik, all,

    I was looking up the bidi algorithm UAX#9, and I am not sure if we should be using the explicit directionality control characters. UAX#9 itself quotes UTR#20, http://www.w3.org/TR/unicode-xml/ , which discourages the use of control characters in a markup environment.

    I know that XLIFF 1.2 did not have anything else, but why not have a full markup solution this time round?

    I wonder if we should rather use directionality annotations based on markers, or dedicated directionality elements.

    Another related issue is that both Unicode 6.3 and HTML 5 now allow for heuristic determination of the directionality by the first strong character, and there might be cases where this cannot be resolved into an explicit directionality because of variables.

    So whether we use control characters, or go for marker-based directionality markup, or even for dedicated directionality elements similar to HTML bdi, we should have a value equivalent to FSI and bdi="auto".
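
    A minimal sketch of the first-strong heuristic mentioned above, assuming Python's standard unicodedata module. The function name and the "auto" fallback value are hypothetical and not taken from any XLIFF draft. The sketch is also simplified: unlike the full UAX#9 rules, it does not skip characters inside nested isolates.

```python
import unicodedata


def first_strong_direction(text: str, default: str = "auto") -> str:
    """Return "ltr" or "rtl" from the first strong character in text.

    Simplified: the full UAX#9 rule also skips over characters between
    an isolate initiator and its matching PDI, which is not done here.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # No strong character found (for example a bare placeholder),
    # so the direction cannot be resolved explicitly.
    return default
```

    A string made only of digits, punctuation, or a placeholder such as "{0}" contains no strong character, so the function falls back to the default. That is the situation where an "auto" value would be needed.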

    Rgds
    dF

    Dr. David Filip
    =======================
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland
    telephone: +353-6120-2781
    cellphone: +353-86-0222-158  
    facsimile: +353-6120-2734
    http://www.cngl.ie/profile/?i=452
    mailto: david.filip@ul.ie


    On Thu, Dec 12, 2013 at 6:41 PM, Yves Savourel < ysavourel@enlaso.com > wrote:
    Thanks Steven,
    Exactly the type of feedback I was looking for.
     
    So we should do RLI+PDI and LRI+PDI instead of RLE+PDF and LRE+PDF, I suppose?
     
    -ys
     
    From:  Steven R Loomis [mailto: srloomis@us.ibm.com ]
    Sent:  Thursday, December 12, 2013 10:55 AM

    To:  Yves Savourel
    Cc:   xliff@lists.oasis-open.org
    Subject:  RE: [xliff] Segmentation Modifications
     
    Jumping in here..
     Please note that Unicode 6.3 adds directional isolate characters, which could be useful for joining segments.

    See:   http://www.unicode.org/reports/tr9/#Directional_Formatting_Characters  

    Directional isolate characters were introduced in Unicode 6.3 after it became apparent that directional embeddings usually have too strong an effect on their surroundings and are thus unnecessarily difficult to use. The new characters were introduced instead of changing the behavior of the existing ones because doing so might have had an undesirable effect on those existing documents that do rely on the old behavior. Nevertheless, the use of the directional isolates instead of embeddings is encouraged in new documents – once target platforms are known to support them.
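
    For illustration, the isolate characters added in Unicode 6.3 are LRI (U+2066), RLI (U+2067), FSI (U+2068) and PDI (U+2069). A minimal sketch of wrapping a span with them follows. The function name and the "ltr"/"rtl"/"auto" direction values are assumptions made for this example, not wording from the specification.

```python
# Directional isolate characters introduced in Unicode 6.3.
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

# Hypothetical mapping from a direction value to the isolate initiator.
_INITIATOR = {"ltr": LRI, "rtl": RLI, "auto": FSI}


def isolate(text: str, direction: str) -> str:
    """Enclose text in an isolate initiator and PDI for the given direction."""
    if direction not in _INITIATOR:
        raise ValueError(f"unsupported direction: {direction!r}")
    return _INITIATOR[direction] + text + PDI
```

    As the quoted paragraph notes, this only helps where the target platforms are known to support the isolate characters.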

    -s


    Yves Savourel ---12/12/2013 05:50:43---For reference, the bidi text I’m talking about is this one: [[

    From: Yves Savourel < ysavourel@enlaso.com >
    To: < xliff@lists.oasis-open.org >
    Date: 12/12/2013 05:50
    Subject: RE: [xliff] Segmentation Modifications
    Sent by: < xliff@lists.oasis-open.org >




    For reference, the bidi text I’m talking about is this one:

    [[
    If the dir attributes of the <source> or <target> elements differ: The content of the <source> or <target> elements set to a
    different directionality than the directionality for the <source> or <target> elements of the joined segment MUST be enclosed
    between Unicode bi-directional control characters reflecting their original directionality (U+202A and U+202C for left-to-right
    spans, and U+202B and U+202C for right-to-left spans).
    ]]
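
    The draft wording above can be sketched as follows. This is a minimal illustration of the quoted text only (embeddings, U+202A/U+202B closed by U+202C), not of whatever the final specification adopts. The function name and the (text, dir) pair representation are assumptions, and inline codes are ignored.

```python
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def join_contents(parts, joined_dir):
    """Concatenate (text, dir) pairs into one string for the joined segment.

    Parts whose dir ("ltr" or "rtl") differs from joined_dir are enclosed
    in the embedding characters that reflect their original direction.
    """
    pieces = []
    for text, direction in parts:
        if direction != joined_dir:
            opener = LRE if direction == "ltr" else RLE
            text = opener + text + PDF
        pieces.append(text)
    return "".join(pieces)
```

    For example, joining an "ltr" part with an "rtl" part into an "ltr" segment leaves the first part untouched and encloses the second between U+202B and U+202C.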

    From the attached file in this post:
    https://lists.oasis-open.org/archives/xliff/201311/msg00176.html

    The question is basically: are those Unicode control characters the ones to use for this mapping?

    I based the text on this article:
    http://www.w3.org/International/questions/qa-bidi-controls


    Thanks,
    -yves


    From: Yves Savourel [ mailto:ysavourel@enlaso.com ]
    Sent: Thursday, December 12, 2013 6:04 AM
    To: ' xliff@lists.oasis-open.org '
    Subject: RE: [xliff] Segmentation Modifications

    Hi David,

    I can do the change, that will free you time for other ones.

    Did you double check the bidi mapping?
    I’m not an expert on bidi, so it’d be good to have more than my input on that part.

    Cheers,
    -yves

    From: Dr. David Filip [ mailto:David.Filip@ul.ie ]
    Sent: Thursday, December 12, 2013 5:48 AM
    To: Yves Savourel
    Cc: xliff@lists.oasis-open.org
    Subject: Re: [xliff] Segmentation Modifications

    Yves, all,

    I did not hear any dissent on that.

    As far as I checked this, your proposal is equivalent to what was there for csprd02 with two small exceptions that add to clarity:

    1) You use an explicit bidi provision, so that people do not need to research the Unicode BiDi algorithm for merging segments with
    different dir

    2) You also proposed to have an option to downgrade state on split segments, which makes sense to me

    Otherwise it is just reorganizing the PRs by the type of modification performed, which seems fine, and I do not have a preference
    regarding the presentation of the provisions.


    @Yves, Do you want to implement this proposal in the spec or should I?
    Please let me know

    Thanks
    dF


    Dr. David Filip
    =======================
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland
    telephone:  +353-6120-2781
    cellphone: +353-86-0222-158  
    facsimile:  +353-6120-2734
    http://www.cngl.ie/profile/?i=452
    mailto: david.filip@ul.ie

    On Sat, Nov 30, 2013 at 1:56 PM, Yves Savourel < ysavourel@enlaso.com > wrote:
    Hi all,
     
    As mentioned here: https://lists.oasis-open.org/archives/xliff/201311/msg00138.html , I've been trying to implement segmentation
    modification for XLIFF 2.0 for a while now and I have a few comments.
     
    For reference, the cs02 section for this is here:
    http://docs.oasis-open.org/xliff/xliff-core/v2.0/csprd02/xliff-core-v2.0-csprd02.html#d0e9317
     
     
    --- The section (starting with its new title) keeps talking about "segmentation modification" and "resegmentation". Could we just
    talk about segmentation modification everywhere? The two things are the same thing.
     
     
    --- That section has many constraints and processing requirements.
    It was quite difficult to follow when I tried to implement it.
     
    For example: (take a deep breath) "Modifiers MUST copy all attributes including values, except for the id and order attributes, from
    their original instances on or within the original <segment> element onto both instances on and within the resulting two <segment>
    or <ignorable> elements, except for attributes that do not have valid instances on the eventually resulting <ignorable> element."
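
    To make the quoted requirement concrete, here is a minimal sketch of the attribute-copying part of a split, assuming Python's xml.etree.ElementTree. It is deliberately simplified: element names carry no namespace, <source> holds plain text only, <target> and <ignorable> are not handled, and the helper names and new id values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes that must not be copied onto the resulting elements.
NOT_COPIED = {"id", "order"}


def copy_attributes(original, new):
    """Copy every attribute except id and order from original onto new."""
    for name, value in original.attrib.items():
        if name not in NOT_COPIED:
            new.set(name, value)


def split_segment(segment, offset, new_ids):
    """Split a plain-text <segment> at offset and return the two new segments."""
    source = segment.find("source")
    text = source.text or ""
    halves = (text[:offset], text[offset:])
    result = []
    for new_id, half in zip(new_ids, halves):
        new_segment = ET.Element("segment", {"id": new_id})
        copy_attributes(segment, new_segment)
        new_source = ET.SubElement(new_segment, "source")
        copy_attributes(source, new_source)
        new_source.text = half
        result.append(new_segment)
    return result
```

    Both resulting segments keep attributes such as state or canResegment from the original, while each receives its own new id.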
     
    To make a long story short and get to the point, I think that section should be re-worded to be simpler, organized by action (split
    or join), and completed with a few things (some subState PRs, explicit directionality conversion, etc.)
     
    The proposed modified text is in the attached document.
     
    I believe it covers what is needed, but it's a complex set of PRs and it should be carefully checked by all. For example I'd like a
    confirmation on the Unicode control characters used for the directionality conversion.
     
    Thanks,
    -yves
     
     


    ---------------------------------------------------------------------
    To unsubscribe from this mail list, you must leave the OASIS TC that
    generates this mail.  Follow this link to all your TCs in OASIS at:
    https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php  


