OASIS XML Localisation Interchange File Format (XLIFF) TC

  • 1.  csprd02 comment 138 - schema ambiguity in core and matches

    Posted 11-12-2013 03:21
    Hi all,

    Comment 138 describes the issue with Unique Particle Attribution: within <group>, <unit>, or <match> in an XLIFF document, any occurrence of an optional element from a module will cause validation to fail because the element is ambiguous. The original comment is below. Note that <file> also references optional module elements, but there's no ambiguity because <unit> or <group> is required before the extension point.

    I propose to remove these specific module element references from the specification. This is the proper solution, syntactically, as it eliminates the ambiguity. The module elements are still permitted via the extension point. If there's no dissent, I'll complete this before our next meeting.

    This issue also raises a more general philosophical point. Since an extension point allows elements or attributes from other namespaces, explicitly referencing those elements or attributes is redundant. Should those references be removed as well?

    Thanks,
    Tom

    Subject: [xliff-comment] schema ambiguity in core and matches

    Element references in the following three element definitions are syntactically ambiguous within the XML schema language:

    element definition for xlf:group references:
        mda:metadata
        slr:data
        val:validation

    element definition for xlf:unit references:
        mtc:matches
        gls:glossary
        mda:metadata
        res:resourceData
        slr:data
        val:validation

    element definition for mtc:match references:
        xlf:originalData
        mda:metadata

    Each element reference is optional in the given context. In all three element definitions these references are followed by <xs:any>, which allows any element from any namespace, including any of the referenced elements. This redundancy is explicable; the element references show implementers how those elements can be used. It's also exemplary, by which I mean that the references could as easily be shown in examples and/or in the prose descriptions of how the respective elements can be used.

    The reason that they SHOULD be in the documentation, and MUST NOT be in the schema, is this: a validating parser cannot unambiguously determine whether any occurrence of the referenced element satisfies the explicit reference or the wild-card <xs:any> token. Thus, strict validation of the schema fails.

    Tom Comerford
    tom@supratext.com
    +1 856 787 9090
    Supratext LLC
    43 Michaelson Drive
    Mount Laurel, NJ 08054
    www.supratext.com
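
    The ambiguity described in the comment can be illustrated with a toy model. The following is only a sketch, not how an XSD processor is implemented: the `Particle` type and the `claimants` function are hypothetical names introduced here, and the content model is reduced to the part the comment describes (optional references followed by a wildcard, written as "*").

```python
from typing import NamedTuple


class Particle(NamedTuple):
    name: str               # qualified element name, or "*" for an xs:any wildcard
    required: bool = False  # every reference in the comment is optional


def claimants(model, start, element):
    """Return the particles, from index `start` on, that could claim `element`."""
    found = []
    for particle in model[start:]:
        if particle.name in (element, "*"):
            found.append(particle)
        if particle.required:
            break  # particles after a required one are not reachable yet
    return found


# Simplified stand-in for the xlf:unit references listed in the comment.
UNIT_MODEL = [
    Particle("mtc:matches"),
    Particle("gls:glossary"),
    Particle("mda:metadata"),
    Particle("res:resourceData"),
    Particle("slr:data"),
    Particle("val:validation"),
    Particle("*"),
]

hits = claimants(UNIT_MODEL, 0, "mda:metadata")
print([p.name for p in hits])  # → ['mda:metadata', '*']
```

    Two particles can claim the same occurrence of mda:metadata, which is the situation the comment objects to. Removing the explicit references leaves only the wildcard as a claimant.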


  • 2.  RE: [xliff] csprd02 comment 138 - schema ambiguity in core and matches

    Posted 11-12-2013 12:21
    Hi Tom, all,

    I tend to agree with your solution. It fits with my long-standing view that, from the viewpoint of a core processor, modules cannot be different from extensions.

    This would leave us with one issue: the text of each module specification can detail where the module's elements are expected to go (as constraints and PRs), but there is no way to validate that using an XSD schema. That means modules would have to be validated at least partially through parsing. Or could something like Schematron and/or NVDL be helpful?

    Cheers,
    -yves
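
    Validating module placement "through parsing" could look roughly like the sketch below. It assumes placeholder namespace URIs (the `urn:example:...` values are stand-ins, not the real XLIFF namespaces), a hypothetical `misplaced` helper, and a partial table of allowed parents taken only from the element lists quoted in comment 138.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, for illustration only.
CORE = "urn:example:xliff-core"
MDA = "urn:example:xliff-metadata"
MTC = "urn:example:xliff-matches"

# Partial table: module element -> parents that reference it in comment 138.
ALLOWED_PARENTS = {
    f"{{{MDA}}}metadata": {f"{{{CORE}}}group", f"{{{CORE}}}unit", f"{{{MTC}}}match"},
    f"{{{MTC}}}matches": {f"{{{CORE}}}unit"},
}


def misplaced(root):
    """Return (child_tag, parent_tag) pairs for module elements under a disallowed parent."""
    problems = []
    for parent in root.iter():
        for child in parent:
            allowed = ALLOWED_PARENTS.get(child.tag)
            if allowed is not None and parent.tag not in allowed:
                problems.append((child.tag, parent.tag))
    return problems


DOC = f"""<xliff xmlns="{CORE}" xmlns:mda="{MDA}" xmlns:mtc="{MTC}">
  <file>
    <group><mda:metadata/><mtc:matches/></group>
    <unit><mtc:matches/></unit>
  </file>
</xliff>"""

problems = misplaced(ET.fromstring(DOC))
for child_tag, parent_tag in problems:
    print(f"{child_tag} is not expected inside {parent_tag}")
```

    Here only the mtc:matches inside <group> is reported. A Schematron rule set could express the same parent/child constraints declaratively; this sketch just shows that the check is possible outside the XSD.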


  • 3.  Re: [xliff] csprd02 comment 138 - schema ambiguity in core and matches

    Posted 11-12-2013 17:34
    On Tue, Nov 12, 2013 at 3:21 AM, Tom Comerford <tom@supratext.com> wrote:

    > A validating parser cannot unambiguously determine whether any occurrence of the referenced element satisfies the explicit reference, or the wild-card <xs:any> token. Thus, strict validation of the schema fails.

    How come? The order of elements (unlike attributes) is not arbitrary in XML, so why can't validators determine this? There should be a standard way to check for the presence of a number of allowed modules before the wildcard starts applying.

    If XSD cannot do this, we MUST supply a schema-like mechanism for automatically checking this.

    Yves suggested Schematron or NVDL; maybe RELAX NG?

    Dr. David Filip
    LRC | CNGL | LT-Web | CSIS
    University of Limerick, Ireland
    telephone: +353-6120-2781
    cellphone: +353-86-0222-158
    facsimile: +353-6120-2734
    http://www.cngl.ie/profile/?i=452
    mailto: david.filip@ul.ie


  • 4.  RE: [xliff] csprd02 comment 138 - schema ambiguity in core and matches

    Posted 11-12-2013 22:34
    Hello David,

    It's true that the specified order is unambiguous, but the referenced elements are optional, and that makes all the difference. Formally the term is "unique particle attribution": each element reference in the schema is a particle, and an occurrence of the element in a document conforming to that schema must be attributable to exactly one particle in the schema. As currently defined, an occurrence of a non-core element in an XLIFF document can be attributed either to the explicit reference or to the wild-card reference in the schema.

    Note that <file> also explicitly allows non-core elements, all optional and in a specific order, and there's also an extension point. In this case, though, there's no ambiguity because at least one of <unit> or <group> must occur before the extension point. Any non-core elements can be attributed correctly based on whether they occur before or after the required elements.

    I believe RELAX NG eliminates the problem, but I haven't thought through the specific approach (or, for that matter, what other implications there may be if we were to choose to change the schema language). If we have XLIFF documents with an XSD schema, then the ambiguities have to be resolved. Tools that supplement the validating parser can be used to check conditions that XSD doesn't support (like acceptable locations for optional non-core elements), but they don't eliminate the validation error.

    Regards,

    Tom
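
    The difference between <file> and <group> can be sketched as a small attribution walk. This is a toy under stated assumptions, not an XSD validator: `attribute`, `matches`, and both models are hypothetical, the <file> model lists only mda:metadata as an example of an optional module reference, and the wildcard is assumed to exclude core (xlf:) elements, which the thread does not state.

```python
CORE_PREFIX = "xlf:"


def matches(names, child):
    """names is a set of element names, or None for the wildcard."""
    if names is None:
        # Assumption: the wildcard does not accept core elements.
        return not child.startswith(CORE_PREFIX)
    return child in names


def attribute(model, children):
    """Attribute each child to exactly one particle; raise if zero or several could claim it."""
    pos = 0            # index of the first particle still reachable
    satisfied = set()  # indices of particles already matched
    out = []
    for child in children:
        candidates = []
        for i in range(pos, len(model)):
            names, required, _repeatable = model[i]
            if matches(names, child):
                candidates.append(i)
            if required and i not in satisfied:
                break  # cannot look past a required particle that is still unmatched
        if len(candidates) != 1:
            raise ValueError(f"{child}: {len(candidates)} candidate particles")
        i = candidates[0]
        satisfied.add(i)
        pos = i if model[i][2] else i + 1
        out.append((child, i))
    return out


# Each particle: (names or None for wildcard, required, repeatable)
FILE_MODEL = [
    ({"mda:metadata"}, False, False),
    ({"xlf:unit", "xlf:group"}, True, True),
    (None, False, True),
]
GROUP_MODEL = [
    ({"mda:metadata"}, False, False),
    (None, False, True),
]

print(attribute(FILE_MODEL, ["mda:metadata", "xlf:unit", "mda:metadata"]))
# → [('mda:metadata', 0), ('xlf:unit', 1), ('mda:metadata', 2)]

try:
    attribute(GROUP_MODEL, ["mda:metadata"])
except ValueError as err:
    print(err)  # → mda:metadata: 2 candidate particles
```

    In the <file> model, the required <unit>/<group> particle separates the explicit reference from the wildcard, so each mda:metadata has exactly one candidate depending on its position. In the <group> model nothing separates them, and the walk fails.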