OASIS XML Localisation Interchange File Format (XLIFF) TC

  • 1.  IDs - uniqueness for inline only within segments (A)

    Posted 10-22-2013 17:41
    Hi David, all,

    > *A)*
    > Uniqueness of inline used ids is currently only enforced within a segment,
    > which is clearly wrong and there is an easy fix.
    > *Proposed Solution: Enforce inline id uniqueness within <unit>*

    I believe the original intent was to have uniqueness within the <unit>. For example, see the PR for adding inline codes:

    "The id value of the added code MUST be different from all id values in both source and target content of the unit where the new code is added."

    But obviously not all PRs say that. So I certainly support the changes to make sure we have uniqueness of id within the unit rather than within the segment.

    -yves
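
    A minimal sketch of the difference between the two scopes, assuming the XLIFF 2.0 namespace and a simplified list of inline elements (ph, pc, sc, ec, mrk, sm). The helper name duplicate_inline_ids and the sample unit are illustrative only, not taken from the specification. The sketch counts inline ids across all <source> content of a unit, and separately across all <target> content, so an id that is unique inside each segment but repeated in a second segment of the same unit is reported.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = "{urn:oasis:names:tc:xliff:document:2.0}"
# Simplified, assumed list of inline elements that carry an id.
INLINE = {NS + name for name in ("ph", "pc", "sc", "ec", "mrk", "sm")}


def duplicate_inline_ids(unit, side):
    """Return inline ids used more than once across every <source> (or <target>) of a unit."""
    counts = Counter()
    for container in unit.iter(NS + side):
        for el in container.iter():
            if el.tag in INLINE and el.get("id") is not None:
                counts[el.get("id")] += 1
    return sorted(i for i, n in counts.items() if n > 1)


# Hypothetical unit: id "1" is unique inside each segment, but repeated within the unit.
UNIT = """
<unit xmlns="urn:oasis:names:tc:xliff:document:2.0" id="u1">
  <segment id="s1">
    <source>Click <ph id="1"/> to continue.</source>
    <target>Cliquez sur <ph id="1"/> pour continuer.</target>
  </segment>
  <segment id="s2">
    <source>Then press <ph id="1"/>.</source>
  </segment>
</unit>
"""

unit = ET.fromstring(UNIT)
source_dups = duplicate_inline_ids(unit, "source")
target_dups = duplicate_inline_ids(unit, "target")
print(source_dups, target_dups)  # ['1'] []
```

    A per-segment check would accept this unit; the per-unit check flags id "1" in the source content. Source and target are counted separately here on the assumption that a target code may reuse the id of its corresponding source code.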


  • 2.  Re: [xliff] IDs - uniqueness for inline only within segments (A)

    Posted 10-22-2013 18:45
    Thanks for this, Yves

    I think this one is clear, so I am hereby making a call for dissent and will assume that making inline ids unique within unit consistently throughout Constraints and PRs is approved unless I hear otherwise by the end of this week, Friday COB, PDT.

    Rgds
    dF

    Dr. David Filip
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland


  • 3.  RE: [xliff] IDs - uniqueness for inline only within segments (A)

    Posted 10-23-2013 15:30
    David, Yves, all,

    I’ve finally caught up on this interesting discussion. I’ve been thinking about this statement:

    > There is a long standing consensus in this TC that XLIFF ids cannot
    > be of the xsd type XML id for several good business reasons
    > 1) duplicity of id between source and target is being used as
    >    indicating the sameness of the element in source and target
    > 2) XLIFF files can be huge and strict enforcement of uniqueness would
    >    prevent streamed handling of XLIFFs

    Perhaps you’ll excuse my naïveté as not having participated in the discussions that led to this consensus. While working with the schemas I had a tacit understanding of the issues, without having considered all of the implications. I also had a bit of philosophical unease with defining attributes named ID that aren’t of type xs:ID. With David’s parsing of the issues surrounding IDs, I see a different approach that may be viable.

    I’ll start out by saying that I don’t understand the first point, given that the ID attribute is not present for either <source> or <target>. These elements can occur only within <segment> and <ignorable> (each of which allows but does not require an ID). The correlation between <source> and <target> is enforced structurally. But that actually isn’t relevant to my remaining points.

    Let’s consider the perspective of the extracting agent. The extractor is responsible for compliance with five different uniqueness constraints (2.3.1.18). The values for the ID attributes will be drawn from the source content, if possible, or generated by the agent as necessary. Five different sets of values might need to be generated; and the extracting agent might need to use a naming convention, and/or append a sequential number, in order to prevent any one set of ID values from conflicting with the other four.

    Note that all of the constraints are satisfied if the ID attribute values are unique for all uses of the ID attribute. The extracting agent could be designed to generate and apply a single set of values, avoiding the possibility of conflict, regardless of the scopes defined in our constraints. Are globally unique ID values a greater demand on the designers of an extracting agent than the five separate uniqueness constraints?

    A second consideration is how an XLIFF element is referenced. Currently we would use an ID attribute, of type xs:NMTOKEN. If we have ID attributes of type xs:ID, it would make sense to reference elements by IDs using an attribute of type xs:IDREF (and named IDREF, I would suggest). Global uniqueness is not required for IDREF attributes. We can add the constraint that the IDREF must point to an element with the corresponding ID value (a condition that is not part of the XML schema language).

    Could this approach address some of the issues that David has outlined? I’m interested in any and all feedback, fully aware I may be missing something here.

    Thanks,

    Tom
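
    The single-set idea can be sketched as one shared counter that every element draws from, so that the values are unique across all uses of the id attribute and therefore satisfy each scoped constraint at once. The IdAllocator class, its prefix, and the element kinds below are hypothetical; this is only an illustration of the approach, not a description of any existing extractor.

```python
from itertools import count


class IdAllocator:
    """Hand out ids from one shared sequence so no two elements can collide."""

    def __init__(self, prefix="id"):
        # A letter prefix keeps each value valid both as an NMTOKEN and as an xs:ID.
        self._prefix = prefix
        self._counter = count(1)

    def next_id(self):
        return f"{self._prefix}{next(self._counter)}"


alloc = IdAllocator()
file_id = alloc.next_id()
unit_id = alloc.next_id()
segment_id = alloc.next_id()
inline_ids = [alloc.next_id() for _ in range(3)]

all_ids = [file_id, unit_id, segment_id, *inline_ids]
print(all_ids)  # ['id1', 'id2', 'id3', 'id4', 'id5', 'id6']
```

    The allocator only guarantees uniqueness within one extraction run; it says nothing about ids drawn from the source content or about documents produced separately.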


  • 4.  RE: [xliff] IDs - uniqueness for inline only within segments (A)

    Posted 10-23-2013 17:28
    Further to my previous message, I’ve answered one of my own questions: the discussion of ID values in <source> and <target> (to which the first of David’s business reasons applies) should of course consider the inline elements within them. It doesn’t really change my question, though: why not globally unique IDs?

    - All IDs for <group>, <unit>, <file>, <segment>, <ignorable>, and inline elements within <source> can be of type xs:ID and globally unique
    - IDs for inline elements within <target> can be of type xs:IDREF corresponding to an ID within <source> (and I would waive my philosophical objection to naming the attribute ID rather than IDREF)

    These constraints are easily expressed, and enforceable through the schema. How to manage the uniqueness would be up to the writer agent(s).

    Thanks,

    Tom
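
    The reference constraint implied by typing target ids as xs:IDREF can be sketched as a check that every inline id in <target> also occurs in <source>. This assumes the XLIFF 2.0 namespace and a simplified list of inline elements; the helper names and the sample unit are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "{urn:oasis:names:tc:xliff:document:2.0}"
# Simplified, assumed list of inline elements that carry an id.
INLINE = {NS + name for name in ("ph", "pc", "sc", "ec", "mrk", "sm")}


def inline_ids(unit, side):
    """Collect the ids of inline elements found in every <source> or <target> of a unit."""
    return {
        el.get("id")
        for container in unit.iter(NS + side)
        for el in container.iter()
        if el.tag in INLINE and el.get("id") is not None
    }


def dangling_target_ids(unit):
    """Target inline ids that have no counterpart in the source content."""
    return sorted(inline_ids(unit, "target") - inline_ids(unit, "source"))


# Hypothetical unit in which the target carries one code that the source does not have.
UNIT = """
<unit xmlns="urn:oasis:names:tc:xliff:document:2.0" id="u1">
  <segment id="s1">
    <source>Press <ph id="1"/> now.</source>
    <target>Appuyez sur <ph id="1"/> <ph id="2"/> maintenant.</target>
  </segment>
</unit>
"""

dangling = dangling_target_ids(ET.fromstring(UNIT))
print(dangling)  # ['2']
```

    Under an IDREF-style rule the code with id "2" would be rejected, because it points at nothing in the source content.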


  • 5.  RE: [xliff] IDs - uniqueness for inline only within segments (A)

    Posted 10-23-2013 20:58
    Hi Tom,

    Thanks for the well-thought-out points.

    One of the main reasons why we can't use simple xs:ID values is that one can move <file> elements into different documents. So you could end up with an ID clash after, for example, grouping the <file> elements of several documents into a single one.

    So the only way we could use xs:ID for the id attribute type would be to use globally unique IDs. That would fit the constraints of xs:ID, and that would allow manipulating the <file> any way we want too.

    But using GUIDs comes with problems too:

    As you noted, the problem with xs:ID would then be that you obviously cannot have them both in <source> and <target>. Your solution then would be to use xs:IDREF in the <target>. But then you can have extra inline elements in the target, so those won't point to anything in <source> and can't be IDREFs.

    In addition, we do a lot of things with inline code IDs (outside of XLIFF), and GUIDs are not helpful in many cases. For example, when leveraging from a TM, the IDs are used to map the inline codes of the new source to the inline codes of the old translation. GUIDs will prevent that.

    And just from a usability viewpoint: GUIDs are going to look like this: id="550e8400-e29b-41d4-a716-446655440000". Doing any kind of inline code manipulation or reading error reports will quickly become very hard for humans. I don't want to imagine the mess in an XLIFF document with an RTF layer. The simpler the inline codes are, the better.

    Another challenge: how would we validate that the id values are true GUIDs?

    In short: I'm not against xs:ID, but to use that type the values would have to be GUIDs, and, in my opinion, that is not truly practical.

    I think, while it'd be nice to use XML to its fullest, we have to remember that XLIFF is just a serialization of objects not necessarily meant to be used with XML tools. We shouldn't try to force data like ID values into an XML-specific type just because it's more convenient.

    Cheers,
    -ys
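
    The <file> merging problem can be illustrated with a small sketch, assuming the XLIFF 2.0 namespace and two hypothetical documents whose units happen to share the id "u1". The sketch moves the <file> of the second document into the first, then compares document-wide uniqueness (the xs:ID behaviour) with uniqueness of unit ids scoped to each <file>.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = "{urn:oasis:names:tc:xliff:document:2.0}"

# Two hypothetical documents; each is fine on its own.
DOC_A = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en">
  <file id="a"><unit id="u1"><segment><source>Hello</source></segment></unit></file>
</xliff>"""
DOC_B = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en">
  <file id="b"><unit id="u1"><segment><source>World</source></segment></unit></file>
</xliff>"""


def duplicated(ids):
    """Return the values that occur more than once."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)


# Group the <file> elements of both documents into a single one.
merged = ET.fromstring(DOC_A)
for file_el in ET.fromstring(DOC_B).findall(NS + "file"):
    merged.append(file_el)

# xs:ID semantics: every id attribute in the document shares one value space.
document_wide = duplicated(
    el.get("id") for el in merged.iter() if el.get("id") is not None
)
# Scoped semantics: unit ids only need to be unique inside their own <file>.
per_file = {
    f.get("id"): duplicated(u.get("id") for u in f.iter(NS + "unit"))
    for f in merged.findall(NS + "file")
}
print(document_wide, per_file)  # ['u1'] {'a': [], 'b': []}
```

    The merged document would fail an xs:ID check on "u1", while the scoped check still passes for both files.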


  • 6.  RE: [xliff] IDs - uniqueness for inline only within segments (A)

    Posted 10-24-2013 16:15
    Yves, thanks for your response.

    So I understand now about the issue of moving <file> elements between XLIFF documents. I do believe it would be possible to prevent ID conflicts, without the need for GUIDs, through contextual qualification of IDs per <file> element. This would be an additional detail for developers to contend with, which of course would be one reason to disfavor it. I also agree for now that IDs within <target> would have to remain as NMTOKEN attributes.

    Nothing I've said should be taken as advocacy; I just need to understand the broader issues. Thanks for your help.

    Tom
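
    One possible reading of contextual qualification, sketched under the assumption that a tool keys each unit by the pair (file id, unit id) rather than by the unit id alone. The data below is hypothetical and the pairing is only one way such qualification could be done in memory; it does not change the id values written to the document.

```python
# Hypothetical unit ids per <file>, as they might look after merging two documents.
units_by_file = {"a": ["u1", "u2"], "b": ["u1"]}

# Local ids alone: "u1" appears in both files.
local_ids = [u for units in units_by_file.values() for u in units]
# Qualified ids: each unit id is paired with the id of its enclosing <file>.
qualified_ids = [(f, u) for f, units in units_by_file.items() for u in units]

local_unique = len(set(local_ids)) == len(local_ids)
qualified_unique = len(set(qualified_ids)) == len(qualified_ids)
print(local_unique, qualified_unique)  # False True
```

    The qualified keys stay unique as long as the <file> ids themselves are unique within the document.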