OASIS XML Localisation Interchange File Format (XLIFF) TC


RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

  • 1.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 11-29-2012 10:39
Hi Ryan, David,

How it would look depends on whether we add one or more standard attributes to the <note> element or rely solely on third-party extensions. First, an example of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.

<notes>
  <note id="n1" ms:noteOrigin="developer" ms:notePriority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
  <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
</notes>

It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have, but the ones I can immediately think of are the kind of information in the above sample plus a date:

* origin / author – indicates the source of the note.
* priority – indicates the relative importance of a note. Must have a strict, simple definition: for example, an integer where lower is more important than higher.
* type / category – indicates what type / aspect of the data or process the note applies to or annotates.
* date – creation or modification date. Which of these it is should be specified.

The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that it adds complexity to the standard, which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third-party extensions, but with the added value of a fully interoperable path for those who want that in this area.
If we hypothetically assume we add origin and priority to the core, the above example could look like the one below, assuming the same mapping for ITS is used as the one proposed for mapping to XLIFF 1.2 (‘alert’ => 1, ‘description’ => 2+) and stored in “priority”.

<notes>
  <note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
  <note id="n2" priority="1">Make sure to adapt date format when localizing</note>
</notes>

Regarding the naming of potential core / module attributes, I would prefer to use “category” instead of “type”, as the former does not convey the level of functional meaning that the latter does for me. It is more ‘just metadata’.

Regards,
Fredrik Estreen

From: Ryan King [mailto:ryanki@microsoft.com]
Sent: 28 November 2012 21:55
To: Dr. David Filip; Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

David or Fredrik, can you give us an XLIFF example of how that would look?

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Wednesday, November 28, 2012 12:50 PM
To: Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

Fredrik, all,

Same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way. Annotations are perfect extension points in general, and besides, we need the extensibility here for the ITS mapping.

Cheers
dF

Dr. David Filip
=======================
LRC CNGL LT-Web CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie

On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:

Hi Rodolfo, Ryan,

I think the intent of <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document, where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places, such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.

I think the initial model is much easier to work with and cleaner, as it contains all note-related information in one subtree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. Whether we should have more standard attributes, or whether a processor is free to use the third-party namespace extension mechanism to add them, is another question. Depending on how simple we want to keep the basic notes feature, it could be either or a mix of the two methods.

Although I’m not a fan of third-party extensions, I think this is a case where they could make sense. And if used for process-specific metadata only, I don’t see an issue. Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.

Regards,
Fredrik Estreen

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Rodolfo M. Raya
Sent: 28 November 2012 09:32
To: xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

Still a bad use case that doesn’t justify ruining a good design.

Regards,
Rodolfo
--
Rodolfo M. Raya       rmraya@maxprograms.com
Maxprograms       http://www.maxprograms.com

From: Ryan King [mailto:ryanki@microsoft.com]
Sent: Wednesday, November 28, 2012 5:32 AM
To: Rodolfo M. Raya; <xliff@lists.oasis-open.org>; Yves Savourel
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

So that our original reason for proposing more than one <notes> at the extension point does not get obscured in all of the replies and “see inlines”, here once again is the use case for adding more than one <notes> per extension point:

Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module.
We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together, such as in an editor UI. Here are some examples:

<notes name="comments">
  <note id="comment">This string cannot be longer than 100 characters</note>
  <note id="origin">developer</note>
  <note id="priority">1</note>
</notes>

<notes name="instructions">
  <note id="instruction">Do not localize the product name</note>
  <note id="origin">loc-engineer</note>
  <note id="priority">2</note>
</notes>

As opposed to something less structured and more difficult to process:

<notes>
  <note id="instruction">Do not localize the product name</note>
  <note id="instruction-origin">loc-engineer</note>
  <note id="instructions-priority">1</note>
  <note id="comment">This string cannot be longer than 100 characters</note>
  <note id="comment-priority">2</note>
</notes>

Thanks,
Ryan

From: Rodolfo M. Raya [mailto:rmraya@maxprograms.com]
Sent: Tuesday, November 27, 2012 5:41 PM
To: Ryan King
Cc: Yves Savourel; <xliff@lists.oasis-open.org>
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

Please don't ruin the design for <notes>. Only one should be allowed per insertion point.

Regards,
Rodolfo

Sent from my iPad

On Nov 27, 2012, at 9:45 PM, "Ryan King" <ryanki@microsoft.com> wrote:

Hi Yves, in last week’s TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren’t deemed to be features. I believe you are the owner for the matches module and notes. Can you please let me know what we need to do to move forward with getting these implemented?

·  Proposal 2: Be able to specify optional custom values for match type in <mtc:matches>
·  Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)

Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? Since the module is already defined in the 2.0 spec, can I just suggest the method and, if you agree, you can fold it into the current module definition? I would propose:

1. That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data.
2. Add an optional attribute reference="yes no" with no as default. Additionally, the PR for a “reference match” would be to allow an xml:lang on the target different from the document and allow the <source> not to be present, as it would be redundant information with the core <source>, e.g. a Spanish reference for Quechua might look like this:

<mtc:matches>
  <mtc:match reference="yes">
    <segment>
      <target xml:lang="es-es">hola mundo</target>
    </segment>
  </mtc:match>
</mtc:matches>

I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that if false.

Please let me know how I can work with you on these.
Ryan

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Friday, November 16, 2012 5:02 PM
To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the wiki.

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Thursday, November 15, 2012 5:24 PM
To: Yves Savourel
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

Yves, Ryan et al.

Commenting inline..

Cheers
dF

On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel <ysavourel@enlaso.com> wrote:

Hi Ryan, all,

> Proposal 1: Add an optional build attribute to 2.0 <file> element in core.
> ..
> <file id="1" original="mainUI.resx" build="2011-11-23-133615307_windc.win8.beta.b01">

I don't see anything wrong with this.

> Proposal 2: Be able to specify optional custom values for match type
> attribute in the <mtc:matches> module.
> Content providers and Localization Suppliers base their cost and billing
> models on match similarity and match types. Localization suppliers charge
> us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we
> might even want to get more granular than that as our cost and billing models
> evolve with the business.
> In 2.0, the match type doesn’t support the values exact-match and fuzzy-match,
> which were defined in the state-qualifier attribute in 1.2. Instead of supporting
> these two, or any others that may not have migrated from 1.2 to 2.0,
> as a separate attribute, the request is, that like the discussion on state
> and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type.
> This will allow us to add extra business logic to types, such as "tm" or "mt",
> which are already defined in the spec.
> <match id="1" similarity="100.0" type="tm/xlf:exact">
> <match id="1" similarity="75.0" type="tm/xlf:fuzzy">
> <match id="1" similarity="99.0" type="tm/custom:near-exact">

I understand the need for the information, but to me, it seems the similarity tells you whether a match is exact or not. The example, however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example, in one project a near-match corresponds to one range and in another to a different range, and you want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case, I wonder if XLIFF should define any default like xlf:exact, etc. I believe there is value in decoupling the "percentage" from the "business" type of the match. The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I guess we should not open this particular can of worms..

So I wouldn't see a problem with a sub-type there. A side comment on the match type: especially if we allow sub-type, I'm still not sure about the values currently listed.

[Microsoft] We definitely advocate decoupling the “percentage” from the “business” type of match, as David puts it. And we should not prescribe meaning to the percentage, either.
Costing models built on top of these values will necessarily change from one provider/supplier to the next and, as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):

  <match id="1" similarity="100.0" type="tm/xlf:exact">
  <match id="1" similarity="100.0" type="ice">

In the first case, we’ve recycled a candidate which is a 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ICE match, in our case, has the requirement of being 100% and signed off or final.

> Proposal 3: Add an optional Reference Language to core.
> This is a crucial feature for Microsoft and other large companies that localize
> minority languages. For example, it is typical that when we localize from
> English into Quechua, localizers are more efficient and provide much higher
> quality translation, when along with English source, we provide them with
> Spanish target. In 1.2, Reference Languages could be defined in
> an <alt-trans> element:

I see the use case and I've seen other cases like this, with Chinese (Simplified/Traditional). Could that be part of the match module? Possibly with a new attribute (e.g. reference='yes no' defaulting to no). Adding something along with <source>/<target> is bound to cause additional PR issues. If it's part of the Match module, it just uses whatever the module PRs are.

I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria.

[Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for as opposed to recycling.

> Proposal 4: Add an optional name attribute on <notes> in core
> and <mds:metadata> module.
> We believe it will be typical for content providers to want to
> ...
> <notes name="comments">
>  <note id="comment">This string cannot be longer than 100 characters</note>
>  <note id="user">Developer@microsoft.com</note>
>  <note id="date">10/21/2012 5:28:13 PM</note>
> </notes>

Sounds reasonable. We'll have to allow several <notes> and <m:metadata> (I think, but I may be wrong, that only one is allowed) on the extension point. The example makes me wonder about the long-term life of XLIFF though: likely this type of info (author, timestamp) will be needed by others. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and time stamp? That would obviously work only if those two pieces of info are the only examples you have in mind.

I agree with Yves that a couple of standard attributes should be added to increase interoperability; still, I believe that note should be fully extensible, as it is part of the general annotation mechanism and should be able to carry attributes from other namespaces.

[Microsoft] Capturing an author and timestamp on a comment is specific to our needs, and thus that example. However, we do see value in being able to apply an author and timestamp to potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.

> Proposal 5: Add optional change tracking attributes to <segment>.
> ...
> <segment id="1" modifiedBy="translator@loc.com"
>   modifiedDate="10/21/2012 5:28:13 PM">
>    <source>hello world</source>
>    <target>hola món</target>
> </segment>

Here again I'm wondering if a "change track" module may be better? You could use it not just on segments but on other elements: notes. The issue then would be how this gets updated if it's not a core component? Actually, if it's a core attribute, does it mean it's not optional? I'm not sure there is a way, even with a PR, to guarantee these data will be up-to-date. But maybe that's ok?
Optional attributes in core are tricky, IMHO. It means you do not need to introduce it yourself, if you do not feel so.. But if present it would need to be processed by agents who modify the segment. If it is thinkable that change agents do not update it, it feels more like a module...

[Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a “change track” module, then user agents would need to preserve it if present, but as for any other processing requirements, such as updating it, that could be specified as part of the module’s processing requirements. For example: The module MUST be preserved and SHOULD be updated by user agents.

cheers,
-yves

---------------------------------------------------------------------
To unsubscribe, e-mail: xliff-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: xliff-help@lists.oasis-open.org
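
As a rough illustration of the note-priority idea from the top of this message (not part of any specification), the ITS mapping quoted there (‘alert’ => 1, ‘description’ => 2+) could be applied by a consumer roughly as sketched below. This is Python using only the standard library. The `priority` attribute is the hypothetical core attribute from the example in this message, the third sample note is invented for illustration, and picking exactly 2 for ‘description’ is an assumption, since the thread only says "2+".

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # W3C ITS namespace

# Mapping quoted in the thread: 'alert' => 1, 'description' => 2+.
# Using exactly 2 for 'description' is an assumption of this sketch.
ITS_TO_PRIORITY = {"alert": 1, "description": 2}

# Sample data: n1 and n2 follow the thread's examples; n3 is made up.
SAMPLE = f"""
<notes xmlns:its="{ITS_NS}">
  <note id="n1" priority="1">This string cannot be longer than 100 characters</note>
  <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
  <note id="n3" its:locNoteType="description">Shown on the settings page</note>
</notes>
"""


def note_priority(note):
    """Return the explicit priority if present, else one derived from ITS, else None."""
    explicit = note.get("priority")
    if explicit is not None:
        return int(explicit)
    # ElementTree exposes namespaced attributes as '{namespace-uri}localname'.
    its_type = note.get(f"{{{ITS_NS}}}locNoteType")
    return ITS_TO_PRIORITY.get(its_type)


def priorities(xml_text):
    """Map each note id to its effective priority."""
    root = ET.fromstring(xml_text)
    return {n.get("id"): note_priority(n) for n in root.findall("note")}


if __name__ == "__main__":
    print(priorities(SAMPLE))
```

An explicit `priority` wins over the ITS-derived value here; that precedence is a design choice of the sketch, not something the thread settles.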


  • 2.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-01-2012 00:17
    Hi Fredrik, David, ALL   We think this is a good compromise for <notes>. It does make the standard between the use of notes and metadata a bit clearer. We also suggest to keep the list of attributes to origin, category, datetime.   <notes>   <note category=”instruction” origin=”developer” datetime=”2012-11-30T07:43:05Z”>Don’t localize Windows</note> </note>   Thanks, ryan   From: Estreen, Fredrik [mailto:Fredrik.Estreen@lionbridge.com] Sent: Thursday, November 29, 2012 2:39 AM To: Ryan King; Dr. David Filip Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Hi Ryan, David,   How it would look is dependent on if we add one or more standard attributes to the <note> element or rely solely on third party extensions. First an examples of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.   <notes>   <note id=”n1” ms:noteOrigin=”developer” ms:notePriority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>   <note id=”n2” its:locNoteType=”alert”>Make sure to adapt date format when localizing</note> </notes>   It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date: * origin / author – Indicate source of the note * priority – indicate relative importance of a note. Must have strict simple definition. Integer lower is more important than higher for example. * type / category – indicate what type / aspect of the data or process the note applies to or annotates. * date – creation or modification date. Which of these it is should be specified.   The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. 
The negative side is that it adds complexity to the standard which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that in this area.   If we hypothetically assume we add origin and priority to the core the above example could look like the bellow. Assuming the same mapping for ITS is used as the one proposed for mapping to XLIFF 1.2 (‘alert’=>1, ‘description’=> 2+) and stored in “priority”.   <notes>   <note id=”n1” author=”developer” priority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>   <note id=”n2” priority=”1” >Make sure to adapt date format when localizing</note> </notes>   Regarding the naming of potential core / module attributes I would prefer to use “category” instead of “type” as the former does not convey the level of functional meaning that the later does for me. It is more ‘just metadata’.   Regards, Fredrik Estreen   From: Ryan King [ mailto:ryanki@microsoft.com ] Sent: den 28 november 2012 21:55 To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   David or Frederick, can you give us an XLIFF example of how that would look?   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Wednesday, November 28, 2012 12:50 PM To: Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Fredrik, all, same as Fredrik, I think that extensibility makes sense here. 
I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way. Annotations are perfect extension points in general, and besides we need the extensibility here for the its mapping.   Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone: +353-6120-2781 cellphone: +353-86-0222-158 facsimile: +353-6120-2734 mailto: david.filip@ul.ie On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Rodolfo, Ryan,   I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document. Where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.   I think the initial model is much easier to work with and more clean as it contain all note related information in one sub tree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. If we should have more standard attributes or if a processor is free to use the third party namespace extension mechanism to add them is another question. Depending on how simple we want to keep the basic notes feature it could be either or a mix of the two methods.   Although I’m not a fan of the third party extensions I think this is a case where they could make sense. And if used for process specific metadata only I don’t see an issue. Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.   
Regards, Fredrik Estreen   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Rodolfo M. Raya Sent: den 28 november 2012 09:32 To: xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Still a bad use case that doesn’t justify ruining a good design.   Regards, Rodolfo -- Rodolfo M. Raya       rmraya@maxprograms.com Maxprograms       http://www.maxprograms.com   From: Ryan King [ mailto:ryanki@microsoft.com ] Sent: Wednesday, November 28, 2012 5:32 AM To: Rodolfo M. Raya; < xliff@lists.oasis-open.org >; Yves Savourel Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   So that our original reason for proposing having more than one <notes> at the extension point does not get obfuscated in all of the replies and “see inlines”, here once again, is the use case for adding more than one <notes> per extension:   Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module. We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together, such as in an editor UI. 
Here are some examples:   <notes name="comments" >   <note id=“comment">This string cannot be longer than 100 characters</note>   <note id=“origin">developer</note>   <note id=”priority”>1</note> </notes>   <notes name="instructions" >   <note id=“instruction">Do not localize the product name</note>   <note id=“origin">loc-engineer</note>   <note id=”priority”>2</note> </notes>   As opposed to something less structured and more difficult to process:   <notes>   <note id=“instruction">Do not localize the product name</note>   <note id=“instruction-origin">loc-engineer</note>   <note id=”instructions-priority”>1</note>   <note id=“comment">This string cannot be longer than 100 characters</note>   <note id=”comment-priority”>2</note> </notes>   Thanks, Ryan   From: Rodolfo M. Raya [ mailto:rmraya@maxprograms.com ] Sent: Tuesday, November 27, 2012 5:41 PM To: Ryan King Cc: Yves Savourel; < xliff@lists.oasis-open.org > Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Please don't ruin te design for <notes>. Only one should be allowed per insertion point.   Regards, Rodolfo Sent from my iPad On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com > wrote: Hi Yves, in last week’s TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren’t deemed as features. I believe you are the owner for the matches module and notes. Can you please let me know what we need to do to move forward with getting these implemented?   ·          Proposal 2: Be able to specify optional custom values for match type in <mtc:matches> ·          Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)   Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? 
Since the module is already defined in the 2.0 spec, can I just suggest the method and if you agree, you can fold it into the current module definition? I would propose:   1.       That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data. 2.       Add an optional attribute reference=”yes no” with no as default. Additionally, PR for a “reference match” would be to allow an xml:lang on the target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. Spanish reference for Quechua might look like this:   <mtc:matches>   <mtc:match reference=”yes”>    <segment>     <target xml:lang=”es-es”>hola mundo</target>    </segment>   </mtc:match> </match>       I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that if false.   Please let me know how I can work with you on these. Ryan   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Friday, November 16, 2012 5:02 PM To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the wiki.   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Thursday, November 15, 2012 5:24 PM To: Yves Savourel Cc: xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Yves, Ryan et. al.   Commenting inline.. Cheers dF On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel < ysavourel@enlaso.com > wrote: Hi Ryan, all, > Proposal 1: Add an optional build attribute to 2.0 <file> element in core. > .. 
> <file id=”1” original=”mainUI.resx” build="2011-11-23-133615307_windc.win8.beta.b01"> I don't see anything wrong with this.   > Proposal 2: Be able to specify optional custom values for match type > attribute in the <mtc:matches> module. > Content providers and Localization Suppliers base their cost and billing > models on match similarity and match types. Localization suppliers charge > us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we > might even want to get more granular than that as our cost and billing models > evolve with the business. > In 2.0, the match type doesn’t support the values exact-match and fuzzy-match, > which were defined in the state-qualifier attribute in 1.2. Instead of supporting > these two, or any others that may not have migrated from 1.2 to 2.0, > as a separate attribute, the request is, that like the discussion on state > and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type. > This will allow us to add extra business logic to types, such as "tm" or "mt", > which are already defined in the spec. > <match id=”1” similarity=”100.0” type=”tm/xlf:exact”> > <match id=”1” similarity=”75.0” type=”tm/xlf:fuzzy”> > <match id=”1” similarity=”99.0” type=”tm/custom:near-exact”> I understand the need for the information, but to me, it seems the similarity give you whether a match is exact or not. The example however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example in one project a near-match corresponds to one range and in another to a different range, and you want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case I wonder if XLIFF should define any default like xlf:exact, etc. I believe there is value in decoupling the "percentage" from the "business" type of the match. 
The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I i guess we should not open this particular can of worms..   So I wouldn't see a problem with a sub-type there. A side comment on the match type: especially, if we allow sub-type, I'm still not sure about the values currently listed.   [Microsoft] we definitely advocate decoupling the “percentage” from the “business” type of match as David puts it. And we should not prescribe meaning to the percentage, either. Costing models built on top of these values will necessarily change from one provider/supplier to the next and as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):   <match id=”1” similarity=”100.0” type=”tm/xlf:exact”>   <match id=”1” similarity=”100.0” type=”ice”> In the first case, we’ve recycled a candidate which is 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ice match, in our case, has the requirement of being 100% and signed off or final. > Proposal 3: Add an optional Reference Language to core. > This is a crucial feature for Microsoft and other large companies that localize > minority languages. For example, it is typical that when we localize from > English into Quechua, localizers are more efficient and provide much higher > quality translation, when along with English source, we provide them with > Spanish target. In 1.2, Reference Languages could be defined in > an <alt-trans> element: I see the use case and I've seen other cases like this, with Chinese (simplified/Traditional). Could that be part of the match module? Possibly with a new attribute (e.g. reference='yes no' defaulting to no) Adding something along with <source>/<target> is bound to cause additional PR issues. If it's part of the Match module, it just uses whatever the module PRs are.   
I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria.

[Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for as opposed to recycling.

> Proposal 4: Add an optional name attribute on <notes> in core
> and <mds:metadata> module.
> We believe it will be typical for content providers to want to
> ...
> <notes name="comments">
>  <note id="comment">This string cannot be longer than 100 characters</note>
>  <note id="user">Developer@microsoft.com</note>
>  <note id="date">10/21/2012 5:28:13 PM</note>
> </notes>

Sounds reasonable. We'll have to allow several <notes> and <m:metadata> (I think, but I may be wrong, that only one is allowed) on the extension point. The example makes me wonder about the long-term life of XLIFF though: likely this type of info (author, timestamp) will be needed by others. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and time stamp? That would obviously work only if those two pieces of info are the only examples you have in mind.

I agree with Yves that a couple of standard attributes should be added to increase interoperability; still, I believe that note should be fully extensible, as it is part of the general annotation mechanism and should be able to carry attributes from other namespaces.

[Microsoft] Capturing an author and timestamp on a comment is specific to our needs and thus that example. However, we do see value in being able to apply an author and timestamp on potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.

> Proposal 5: Add optional change tracking attributes to <segment>.
> ...
> <segment id="1" modifiedBy="translator@loc.com"
> modifiedDate="10/21/2012 5:28:13 PM">
>    <source>hello world</source>
>    <target>hola món</target>
> </segment>

Here again I'm wondering if a "change track" module may be better? You could use it not just on segments but on other elements too, such as notes. The issue then would be how this gets updated if it's not a core component. Actually, if it's a core attribute, does it mean it's not optional? I'm not sure there is a way, even with a PR, to guarantee these data will be up to date. But maybe that's ok?

Optional attributes in core are tricky, IMHO. It means you do not need to introduce it yourself if you do not feel like it. But if present, it would need to be processed by agents who modify the segment. If it is thinkable that change agents do not update it, it feels more like a module...

[Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a "change track" module, then user agents would need to preserve it if present, but as for any other processing requirements, such as updating it, that could be specified as part of the module’s processing requirements. For example: The module MUST be preserved and SHOULD be updated by user agents.

cheers,
-yves

---------------------------------------------------------------------
To unsubscribe, e-mail: xliff-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: xliff-help@lists.oasis-open.org
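As an illustration of the "MUST be preserved and SHOULD be updated" idea from the Proposal 5 discussion, a user agent that edits a target could refresh the tracking data in the same step. This is only a sketch under assumptions: modifiedBy and modifiedDate are the attribute names from the proposal's example and are not confirmed spec attributes, the ISO 8601 UTC timestamp format is my own choice rather than the format shown in the proposal, and update_target and reviewer@loc.com are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def update_target(segment, new_text, agent, now=None):
    """Change the target text and refresh the (hypothetical) tracking attributes.

    Returns True if the segment was modified, False if the text was unchanged.
    """
    # Normalise to UTC so the trailing "Z" in the timestamp is accurate.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    target = segment.find("target")
    if target is None:
        target = ET.SubElement(segment, "target")
    if target.text == new_text:
        return False  # nothing changed, so existing tracking data is preserved as-is
    target.text = new_text
    segment.set("modifiedBy", agent)
    segment.set("modifiedDate", now.strftime("%Y-%m-%dT%H:%M:%SZ"))
    return True


segment = ET.fromstring(
    '<segment id="1" modifiedBy="translator@loc.com" '
    'modifiedDate="2012-10-21T17:28:13Z">'
    "<source>hello world</source><target>hola món</target></segment>"
)

changed = update_target(
    segment,
    "hola, món",
    "reviewer@loc.com",
    now=datetime(2012, 12, 3, 20, 7, tzinfo=timezone.utc),
)
print(changed, segment.attrib)
```

An agent that does not understand the tracking data at all would simply leave the attributes untouched, which is the "preserve" half of the requirement; the sketch only covers the "update" half.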


  • 3.  Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-01-2012 01:00
    Thanks for outlining the options, Fredrik,

    I would personally be OK with note being just extensible. The ITS categories would allow specifying pretty much everything that you would need: first as an extension, which should later turn into a module using the same mechanism. On the other hand, having a minimum set for interoperability with ITS-unaware tools sounds good. And as Fredrik pointed out, an ITS note can easily be mapped onto these, so it is not an issue from here. Even with the minimum set of core attributes, I still think it should be extensible, to allow for unforeseen types of annotations.

    The only danger is of creating unnecessary clutter if the adoption is minimal; it is hard to say what the adoption will be.

    Cheers
    dF

    Dr. David Filip
    =======================
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland
    telephone: +353-6120-2781
    cellphone: +353-86-0222-158
    facsimile: +353-6120-2734
    mailto: david.filip@ul.ie
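To make the ITS mapping mentioned in this exchange concrete, here is a small sketch that folds its:locNoteType into a priority attribute using the mapping Fredrik describes ('alert' => 1, 'description' => 2+). Assumptions: priority is only a hypothetically proposed attribute, 'description' is pinned to exactly 2 for simplicity, the ITS attribute is dropped once it has been mapped (as in Fredrik's second sample), and its_to_priority is an illustrative name.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"      # W3C ITS namespace
LOC_NOTE_TYPE = f"{{{ITS_NS}}}locNoteType"    # ElementTree's {uri}local form

# 'description' is "2+" in the thread; 2 is an assumed concrete value.
PRIORITY_FOR = {"alert": 1, "description": 2}


def its_to_priority(notes):
    """Replace its:locNoteType with a priority attribute; return how many notes changed."""
    mapped = 0
    for note in notes.iter("note"):
        kind = note.get(LOC_NOTE_TYPE)
        # An explicit priority already on the note wins over the ITS value.
        if kind in PRIORITY_FOR and note.get("priority") is None:
            note.set("priority", str(PRIORITY_FOR[kind]))
            del note.attrib[LOC_NOTE_TYPE]
            mapped += 1
    return mapped


SAMPLE = f"""
<notes xmlns:its="{ITS_NS}">
  <note id="n1" priority="1">This string cannot be longer than 100 characters</note>
  <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
</notes>
"""

notes = ET.fromstring(SAMPLE)
mapped = its_to_priority(notes)
print(mapped, [(n.get("id"), n.get("priority")) for n in notes.iter("note")])
```

Because the ITS attribute lives in its own namespace, a tool that skips this step can still carry it through unchanged, which is the extensibility argument made above.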


  • 4.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-03-2012 20:07
    >> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good. Agreed. And as stated on another thread…w e suggest the list of additional and optional attributes to be origin, category, datetime.   <notes>   <note category=”instruction” origin=”developer” datetime=”2012-11-30T07:43:05Z”>Don’t localize Windows</note> </note>   Thanks, ryan .   From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip Sent: Friday, November 30, 2012 5:00 PM To: Estreen, Fredrik Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Thanks for outlining the options, Fredrik,   I would be personally OK with note being just extensible. The ITS categories would allow to specify pretty much everything that you would need. First as extension, that should later turn into a module using the same mechanism. On the other hand having a miniumum set for interoprability for ITS unaware tools sounds good. And as Fredrik pointed out ITS note can be easily mapped on these, so not an issue from here. Even with the minimum set of core attributes, I still think it should be extensible.. to allow for unforeseen types of annotations..   The only danger is of creating unnecessary clutter if the adoption is minimal.. hard to say what the adoption will be..   Cheers dF     Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone: +353-6120-2781 cellphone: +353-86-0222-158 facsimile: +353-6120-2734 mailto: david.filip@ul.ie On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Ryan, David,   How it would look is dependent on if we add one or more standard attributes to the <note> element or rely solely on third party extensions. First an examples of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.   
<notes>   <note id=”n1” ms:noteOrigin=”developer” ms:notePriority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>   <note id=”n2” its:locNoteType=”alert”>Make sure to adapt date format when localizing</note> </notes>   It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date: * origin / author – Indicate source of the note * priority – indicate relative importance of a note. Must have strict simple definition. Integer lower is more important than higher for example. * type / category – indicate what type / aspect of the data or process the note applies to or annotates. * date – creation or modification date. Which of these it is should be specified.   The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that it adds complexity to the standard which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that in this area.   If we hypothetically assume we add origin and priority to the core the above example could look like the bellow. Assuming the same mapping for ITS is used as the one proposed for mapping to XLIFF 1.2 (‘alert’=>1, ‘description’=> 2+) and stored in “priority”.   
<notes>   <note id=”n1” author=”developer” priority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>   <note id=”n2” priority=”1” >Make sure to adapt date format when localizing</note> </notes>   Regarding the naming of potential core / module attributes I would prefer to use “category” instead of “type” as the former does not convey the level of functional meaning that the later does for me. It is more ‘just metadata’.   Regards, Fredrik Estreen   From: Ryan King [mailto: ryanki@microsoft.com ] Sent: den 28 november 2012 21:55 To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   David or Frederick, can you give us an XLIFF example of how that would look?   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Wednesday, November 28, 2012 12:50 PM To: Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Fredrik, all, same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way. Annotations are perfect extension points in general, and besides we need the extensibility here for the its mapping.   Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Rodolfo, Ryan,   I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document. Where each <note> is one annotation / comment in itself. 
The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.   I think the initial model is much easier to work with and more clean as it contain all note related information in one sub tree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. If we should have more standard attributes or if a processor is free to use the third party namespace extension mechanism to add them is another question. Depending on how simple we want to keep the basic notes feature it could be either or a mix of the two methods.   Although I’m not a fan of the third party extensions I think this is a case where they could make sense. And if used for process specific metadata only I don’t see an issue. Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.   Regards, Fredrik Estreen   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Rodolfo M. Raya Sent: den 28 november 2012 09:32 To: xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Still a bad use case that doesn’t justify ruining a good design.   Regards, Rodolfo -- Rodolfo M. Raya       rmraya@maxprograms.com Maxprograms       http://www.maxprograms.com   From: Ryan King [ mailto:ryanki@microsoft.com ] Sent: Wednesday, November 28, 2012 5:32 AM To: Rodolfo M. 
Raya; < xliff@lists.oasis-open.org >; Yves Savourel Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   So that our original reason for proposing having more than one <notes> at the extension point does not get obfuscated in all of the replies and “see inlines”, here once again, is the use case for adding more than one <notes> per extension:   Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module. We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together, such as in an editor UI. Here are some examples:   <notes name="comments" >   <note id=“comment">This string cannot be longer than 100 characters</note>   <note id=“origin">developer</note>   <note id=”priority”>1</note> </notes>   <notes name="instructions" >   <note id=“instruction">Do not localize the product name</note>   <note id=“origin">loc-engineer</note>   <note id=”priority”>2</note> </notes>   As opposed to something less structured and more difficult to process:   <notes>   <note id=“instruction">Do not localize the product name</note>   <note id=“instruction-origin">loc-engineer</note>   <note id=”instructions-priority”>1</note>   <note id=“comment">This string cannot be longer than 100 characters</note>   <note id=”comment-priority”>2</note> </notes>   Thanks, Ryan   From: Rodolfo M. Raya [ mailto:rmraya@maxprograms.com ] Sent: Tuesday, November 27, 2012 5:41 PM To: Ryan King Cc: Yves Savourel; < xliff@lists.oasis-open.org > Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Please don't ruin te design for <notes>. Only one should be allowed per insertion point.   
Regards, Rodolfo Sent from my iPad On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com > wrote: Hi Yves, in last week’s TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren’t deemed as features. I believe you are the owner for the matches module and notes. Can you please let me know what we need to do to move forward with getting these implemented?   ·          Proposal 2: Be able to specify optional custom values for match type in <mtc:matches> ·          Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)   Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? Since the module is already defined in the 2.0 spec, can I just suggest the method and if you agree, you can fold it into the current module definition? I would propose:   1.       That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data. 2.       Add an optional attribute reference=”yes no” with no as default. Additionally, PR for a “reference match” would be to allow an xml:lang on the target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. Spanish reference for Quechua might look like this:   <mtc:matches>   <mtc:match reference=”yes”>    <segment>     <target xml:lang=”es-es”>hola mundo</target>    </segment>   </mtc:match> </match>       I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that if false.   Please let me know how I can work with you on these. 
Ryan

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Friday, November 16, 2012 5:02 PM
To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the wiki.

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Thursday, November 15, 2012 5:24 PM
To: Yves Savourel
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

Yves, Ryan et al.

Commenting inline..

Cheers
dF

On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel <ysavourel@enlaso.com> wrote:

Hi Ryan, all,

> Proposal 1: Add an optional build attribute to 2.0 <file> element in core.
> ..
> <file id="1" original="mainUI.resx" build="2011-11-23-133615307_windc.win8.beta.b01">

I don't see anything wrong with this.

> Proposal 2: Be able to specify optional custom values for match type
> attribute in the <mtc:matches> module.
> Content providers and Localization Suppliers base their cost and billing
> models on match similarity and match types. Localization suppliers charge
> us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we
> might even want to get more granular than that as our cost and billing models
> evolve with the business.
> In 2.0, the match type doesn’t support the values exact-match and fuzzy-match,
> which were defined in the state-qualifier attribute in 1.2. Instead of supporting
> these two, or any others that may not have migrated from 1.2 to 2.0,
> as a separate attribute, the request is, that like the discussion on state
> and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type.
> This will allow us to add extra business logic to types, such as "tm" or "mt",
> which are already defined in the spec.
> <match id="1" similarity="100.0" type="tm/xlf:exact">
> <match id="1" similarity="75.0" type="tm/xlf:fuzzy">
> <match id="1" similarity="99.0" type="tm/custom:near-exact">

I understand the need for the information, but to me, it seems the similarity gives you whether a match is exact or not. The example, however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example, in one project a near-match corresponds to one range and in another to a different range, and you want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case I wonder if XLIFF should define any default like xlf:exact, etc.

I believe there is value in decoupling the "percentage" from the "business" type of the match. The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I guess we should not open this particular can of worms..

So I wouldn't see a problem with a sub-type there. A side comment on the match type: especially if we allow sub-type, I'm still not sure about the values currently listed.

[Microsoft] We definitely advocate decoupling the “percentage” from the “business” type of match, as David puts it. And we should not prescribe meaning to the percentage, either. Costing models built on top of these values will necessarily change from one provider/supplier to the next and, as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):

<match id="1" similarity="100.0" type="tm/xlf:exact">
<match id="1" similarity="100.0" type="ice">

In the first case, we’ve recycled a candidate which is a 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ice match, in our case, has the requirement of being 100% and signed off or final.

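As a rough illustration of the sub-type idea discussed for Proposal 2, the sketch below splits a match type into a base type and an optional sub-type. The `base/prefix:value` notation is only the one used in this thread's examples (such as tm/xlf:exact); it is an assumption here, not confirmed XLIFF 2.0 syntax, and the names `MatchType` and `parse_match_type` are hypothetical.

```python
from typing import NamedTuple, Optional


class MatchType(NamedTuple):
    """Parts of a match type written as base[/prefix:value]."""
    base: str                # e.g. "tm", "mt", "ice"
    prefix: Optional[str]    # e.g. "xlf" or a custom authority
    value: Optional[str]     # e.g. "exact", "fuzzy", "near-exact"


def parse_match_type(raw: str) -> MatchType:
    """Split 'tm/xlf:exact' into ('tm', 'xlf', 'exact').

    A bare type such as 'ice' has no sub-type. The syntax is an
    assumption taken from the examples in this thread.
    """
    base, slash, sub = raw.strip().partition("/")
    if not base:
        raise ValueError(f"missing base type in {raw!r}")
    if not slash:
        return MatchType(base, None, None)
    prefix, colon, value = sub.partition(":")
    if not (colon and prefix and value):
        raise ValueError(f"sub-type must look like prefix:value in {raw!r}")
    return MatchType(base, prefix, value)


# Two matches with the same similarity but different business categories.
matches = [("100.0", "tm/xlf:exact"), ("100.0", "ice")]
categories = [parse_match_type(match_type) for _, match_type in matches]
```

The point of the sketch is that the billing category comes from the type string alone, so the similarity value can stay a plain number with no prescribed meaning.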
> Proposal 3: Add an optional Reference Language to core.
> This is a crucial feature for Microsoft and other large companies that localize
> minority languages. For example, it is typical that when we localize from
> English into Quechua, localizers are more efficient and provide much higher
> quality translation, when along with English source, we provide them with
> Spanish target. In 1.2, Reference Languages could be defined in
> an <alt-trans> element:

I see the use case and I've seen other cases like this, with Chinese (Simplified/Traditional). Could that be part of the match module? Possibly with a new attribute (e.g. reference='yes no' defaulting to no). Adding something along with <source>/<target> is bound to cause additional PR issues. If it's part of the Match module, it just uses whatever the module PRs are.

I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria.

[Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for as opposed to recycling.

> Proposal 4: Add an optional name attribute on <notes> in core
> and <mds:metadata> module.
> We believe it will be typical for content providers to want to
> ...
> <notes name="comments">
>  <note id="comment">This string cannot be longer than 100 characters</note>
>  <note id="user">Developer@microsoft.com</note>
>  <note id="date">10/21/2012 5:28:13 PM</note>
> </notes>

Sounds reasonable. We'll have to allow several <notes> and <m:metadata> (I think, but I may be wrong, that only one is allowed) on the extension point. The example makes me wonder about the long term life of XLIFF though: likely this type of info (author, timestamp) will be needed by others. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and time stamp?
That would obviously work only if those two pieces of info are the only examples you have in mind.

I agree with Yves that a couple of standard attributes should be added to increase interoperability; still, I believe that note should be fully extensible, as it is part of the general annotation mechanism and should be able to carry attributes from other namespaces.

[Microsoft] Capturing an author and timestamp on a comment is specific to our needs and thus that example. However, we do see value in being able to apply an author and timestamp on potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.

> Proposal 5: Add optional change tracking attributes to <segment>.
> ...
> <segment id="1" modifiedBy="translator@loc.com"
> modifiedDate="10/21/2012 5:28:13 PM">
>    <source>hello world</source>
>    <target>hola món</target>
> </segment>

Here again I'm wondering if a "change track" module may be better? You could use it not just on segments but on other elements: notes. The issue then would be how this gets updated if it's not a core component? Actually if it's a core attribute, does it mean it's not optional? I'm not sure there is a way, even with a PR, to guarantee these data will be up-to-date. But maybe that's ok?

Optional attributes in core are tricky, IMHO. It means you do not need to introduce it yourself, if you do not feel so.. But if present it would need to be processed by agents who modify the segment. If it is thinkable that change agents do not update it, it feels more like a module...

[Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a “change track” module, then user agents would need to preserve it if present, but as for any other processing requirements, such as updating it, that could be specified as part of the module’s processing requirements.
For example: The module MUST be preserved and SHOULD be updated by user agents.

cheers,
-yves

---------------------------------------------------------------------
To unsubscribe, e-mail: xliff-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: xliff-help@lists.oasis-open.org
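To make the "MUST be preserved and SHOULD be updated" idea from the Proposal 5 discussion concrete, here is a minimal sketch of an agent that edits a target and refreshes the change-tracking data. The attribute names modifiedBy and modifiedDate come from the example in the thread; the un-namespaced elements, the ISO 8601 UTC timestamp format, and the function name `update_target` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Optional


def update_target(segment: ET.Element, new_text: str, agent: str,
                  when: Optional[datetime] = None) -> ET.Element:
    """Change the target text and refresh the change-tracking attributes.

    Everything else on the segment is left untouched (the "MUST
    preserve" half); the two attributes are rewritten on every edit
    (the "SHOULD update" half). `when` must be timezone-aware.
    """
    target = segment.find("target")
    if target is None:
        target = ET.SubElement(segment, "target")
    target.text = new_text

    # Normalise to UTC so the trailing "Z" is accurate.
    stamp = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    segment.set("modifiedBy", agent)
    segment.set("modifiedDate", stamp.strftime("%Y-%m-%dT%H:%M:%SZ"))
    return segment


# Hypothetical segment that already carries older tracking data.
segment = ET.fromstring(
    '<segment id="1" modifiedBy="someone@example.com" '
    'modifiedDate="2012-10-21T17:28:13Z">'
    "<source>hello world</source><target>hola</target></segment>"
)
update_target(segment, "hola món", "translator@loc.com",
              datetime(2012, 11, 16, 17, 2, tzinfo=timezone.utc))
```

An agent that does not understand the tracking data would skip the two `set` calls but still write the segment back unchanged otherwise, which is the situation the thread worries about: the data is preserved but may be stale.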


  • 5.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-11-2012 18:27
    Do we have consensus on this proposal? E.g. adding category, origin, and datetime (or timestamp) attributes to <note>?

Thanks,
ryan

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Monday, December 3, 2012 11:59 AM
To: Dr. David Filip; Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

>> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good.

Agreed. And as stated on another thread… we suggest the list of additional and optional attributes to be origin, category, datetime.

<notes>
  <note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don’t localize Windows</note>
</notes>

Thanks,
ryan

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Friday, November 30, 2012 5:00 PM
To: Estreen, Fredrik
Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

Thanks for outlining the options, Fredrik,

I would be personally OK with note being just extensible. The ITS categories would allow specifying pretty much everything that you would need. First as an extension, that should later turn into a module using the same mechanism. On the other hand, having a minimum set for interoperability for ITS-unaware tools sounds good. And as Fredrik pointed out, ITS notes can be easily mapped onto these, so not an issue from here. Even with the minimum set of core attributes, I still think it should be extensible.. to allow for unforeseen types of annotations..

The only danger is of creating unnecessary clutter if the adoption is minimal.. hard to say what the adoption will be..

Cheers
dF

Dr. David Filip
=======================
LRC CNGL LT-Web CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie

On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:

Hi Ryan, David,

How it would look depends on whether we add one or more standard attributes to the <note> element or rely solely on third party extensions. First, an example of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.

<notes>
  <note id="n1" ms:noteOrigin="developer" ms:notePriority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
  <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
</notes>

It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have, but the ones I can immediately think of are the kind of information in the above sample plus a date:

* origin / author – indicate source of the note
* priority – indicate relative importance of a note. Must have strict simple definition. Integer lower is more important than higher, for example.
* type / category – indicate what type / aspect of the data or process the note applies to or annotates.
* date – creation or modification date. Which of these it is should be specified.

The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that it adds complexity to the standard, which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model.
A solution that has been discussed before is to have a more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that in this area.

If we hypothetically assume we add origin and priority to the core, the above example could look like the one below, assuming the same mapping for ITS is used as the one proposed for mapping to XLIFF 1.2 (‘alert’ => 1, ‘description’ => 2+) and stored in “priority”.

<notes>
  <note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
  <note id="n2" priority="1">Make sure to adapt date format when localizing</note>
</notes>

Regarding the naming of potential core / module attributes, I would prefer to use “category” instead of “type” as the former does not convey the level of functional meaning that the latter does for me. It is more ‘just metadata’.

Regards,
Fredrik Estreen

From: Ryan King [mailto:ryanki@microsoft.com]
Sent: 28 November 2012 21:55
To: Dr. David Filip; Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

David or Fredrik, can you give us an XLIFF example of how that would look?

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Wednesday, November 28, 2012 12:50 PM
To: Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

Fredrik, all,

same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way. Annotations are perfect extension points in general, and besides we need the extensibility here for the ITS mapping.

Cheers
dF

Dr. David Filip
=======================
LRC CNGL LT-Web CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie

On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:

Hi Rodolfo, Ryan,

I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document, where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places, such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.

I think the initial model is much easier to work with and cleaner, as it contains all note-related information in one subtree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. Whether we should have more standard attributes, or whether a processor is free to use the third party namespace extension mechanism to add them, is another question. Depending on how simple we want to keep the basic notes feature, it could be either or a mix of the two methods.

Although I’m not a fan of the third party extensions, I think this is a case where they could make sense. And if used for process specific metadata only, I don’t see an issue. Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.

Regards,
Fredrik Estreen

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Rodolfo M.
Raya
Sent: 28 November 2012 09:32
To: xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

Still a bad use case that doesn’t justify ruining a good design.

Regards,
Rodolfo
--
Rodolfo M. Raya       rmraya@maxprograms.com
Maxprograms       http://www.maxprograms.com
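The grouping use case behind Proposal 4 can also be served by attributes on <note> itself, which is the direction this post asks about. The sketch below is only an illustration: the category, origin, and datetime attributes are the ones proposed in this thread (not confirmed here as adopted), and the sample ids, the "uncategorized" bucket, and the function name `group_notes` are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

# A single <notes> container; grouping is derived from attributes.
SAMPLE = """
<notes>
  <note id="n1" category="comment" origin="developer"
        datetime="2012-11-30T07:43:05Z">This string cannot be longer than 100 characters</note>
  <note id="n2" category="instruction" origin="loc-engineer">Do not localize the product name</note>
  <note id="n3" category="instruction" origin="developer">Don't localize Windows</note>
  <note id="n4">No metadata at all</note>
</notes>
"""


def group_notes(xml_text: str) -> Dict[str, List[dict]]:
    """Group <note> elements by their (optional) category attribute."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for note in ET.fromstring(xml_text.strip()).findall("note"):
        groups[note.get("category", "uncategorized")].append({
            "id": note.get("id"),
            "origin": note.get("origin"),      # None when absent
            "datetime": note.get("datetime"),  # None when absent
            "text": (note.text or "").strip(),
        })
    return dict(groups)


grouped = group_notes(SAMPLE)
```

With this shape an editor UI can still display "comments" and "instructions" separately, while each <note> keeps its own id for references from inline markup.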


  • 6.  Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-11-2012 22:08
    Ryan, I support adding the core attributes as proposed, plus perhaps the priority [1-10] from Fredriks example But I also reiterate the request for note to be extensible. This seemed to have Fredrik's support on this thread. I think that note and inline markers should be extensible as part of the generic annotations design that will allow development of annotation modules, such as ITS mapping.. Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone: +353-6120-2781 cellphone: +353-86-0222-158 facsimile: +353-6120-2734 mailto: david.filip@ul.ie On Tue, Dec 11, 2012 at 6:24 PM, Ryan King < ryanki@microsoft.com > wrote: Do we have consensus on this proposal? E.g. adding category, origin, and datetime (or timestamp) attributes to <note>?   Thanks, ryan   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Monday, December 3, 2012 11:59 AM To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   >> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good. Agreed. And as stated on another thread…w e suggest the list of additional and optional attributes to be origin, category, datetime.   <notes>   <note category=”instruction” origin=”developer” datetime=”2012-11-30T07:43:05Z”>Don’t localize Windows</note> </note>   Thanks, ryan .   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Friday, November 30, 2012 5:00 PM To: Estreen, Fredrik Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Thanks for outlining the options, Fredrik,   I would be personally OK with note being just extensible. The ITS categories would allow to specify pretty much everything that you would need. 
First as extension, that should later turn into a module using the same mechanism. On the other hand having a miniumum set for interoprability for ITS unaware tools sounds good. And as Fredrik pointed out ITS note can be easily mapped on these, so not an issue from here. Even with the minimum set of core attributes, I still think it should be extensible.. to allow for unforeseen types of annotations..   The only danger is of creating unnecessary clutter if the adoption is minimal.. hard to say what the adoption will be..   Cheers dF     Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Ryan, David,   How it would look is dependent on if we add one or more standard attributes to the <note> element or rely solely on third party extensions. First an examples of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.   <notes>   <note id=”n1” ms:noteOrigin=”developer” ms:notePriority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>   <note id=”n2” its:locNoteType=”alert”>Make sure to adapt date format when localizing</note> </notes>   It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date: * origin / author – Indicate source of the note * priority – indicate relative importance of a note. Must have strict simple definition. Integer lower is more important than higher for example. * type / category – indicate what type / aspect of the data or process the note applies to or annotates. 
* date – creation or modification date. Which of these it is should be specified.

The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that it adds complexity to the standard, which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third-party extensions, but with the added value of a fully interoperable path for those that want that in this area.

If we hypothetically assume we add origin and priority to the core, the above example could look like the one below. This assumes the same ITS mapping as the one proposed for mapping to XLIFF 1.2 ('alert' => 1, 'description' => 2+), stored in "priority".

<notes>
  <note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
  <note id="n2" priority="1">Make sure to adapt date format when localizing</note>
</notes>

Regarding the naming of potential core / module attributes, I would prefer to use "category" instead of "type", as the former does not convey the level of functional meaning that the latter does for me. It is more 'just metadata'.

Regards,
Fredrik Estreen

From: Ryan King [mailto:ryanki@microsoft.com]
Sent: 28 November 2012 21:55
To: Dr. David Filip; Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

David or Fredrik, can you give us an XLIFF example of how that would look?

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr.
David Filip
Sent: Wednesday, November 28, 2012 12:50 PM
To: Estreen, Fredrik
Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

Fredrik, all,

Same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way. Annotations are perfect extension points in general, and besides we need the extensibility here for the ITS mapping.

Cheers
dF

On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:

Hi Rodolfo, Ryan,

I think the intent of <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document, where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places, such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.

I think the initial model is much easier to work with and cleaner, as it contains all note-related information in one subtree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. Whether we should have more standard attributes, or whether a processor is free to use the third-party namespace extension mechanism to add them, is another question. Depending on how simple we want to keep the basic notes feature, it could be either, or a mix of the two methods.
Although I'm not a fan of the third-party extensions, I think this is a case where they could make sense. And if used for process-specific metadata only, I don't see an issue. Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.

Regards,
Fredrik Estreen

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Rodolfo M. Raya
Sent: 28 November 2012 09:32
To: xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

Still a bad use case that doesn't justify ruining a good design.

Regards,
Rodolfo
--
Rodolfo M. Raya       rmraya@maxprograms.com
Maxprograms       http://www.maxprograms.com

From: Ryan King [mailto:ryanki@microsoft.com]
Sent: Wednesday, November 28, 2012 5:32 AM
To: Rodolfo M. Raya; <xliff@lists.oasis-open.org>; Yves Savourel
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

So that our original reason for proposing more than one <notes> at the extension point does not get obscured in all of the replies and "see inlines", here once again is the use case for adding more than one <notes> per extension point:

Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module.

We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together, such as in an editor UI.
Here are some examples:

<notes name="comments">
  <note id="comment">This string cannot be longer than 100 characters</note>
  <note id="origin">developer</note>
  <note id="priority">1</note>
</notes>

<notes name="instructions">
  <note id="instruction">Do not localize the product name</note>
  <note id="origin">loc-engineer</note>
  <note id="priority">2</note>
</notes>

As opposed to something less structured and more difficult to process:

<notes>
  <note id="instruction">Do not localize the product name</note>
  <note id="instruction-origin">loc-engineer</note>
  <note id="instructions-priority">1</note>
  <note id="comment">This string cannot be longer than 100 characters</note>
  <note id="comment-priority">2</note>
</notes>

Thanks,
Ryan

From: Rodolfo M. Raya [mailto:rmraya@maxprograms.com]
Sent: Tuesday, November 27, 2012 5:41 PM
To: Ryan King
Cc: Yves Savourel; <xliff@lists.oasis-open.org>
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

Please don't ruin the design for <notes>. Only one should be allowed per insertion point.

Regards,
Rodolfo

On Nov 27, 2012, at 9:45 PM, "Ryan King" <ryanki@microsoft.com> wrote:

Hi Yves,

In last week's TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren't deemed to be features. I believe you are the owner for the matches module and notes. Can you please let me know what we need to do to move forward with getting these implemented?

* Proposal 2: Be able to specify optional custom values for match type in <mtc:matches>
* Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)

Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that?
Since the module is already defined in the 2.0 spec, can I just suggest the method and, if you agree, you can fold it into the current module definition? I would propose:

1. That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data.
2. Add an optional attribute reference="yes|no" with no as default. Additionally, the PR for a "reference match" would be to allow an xml:lang on the target different from the document's, and to allow the <source> not to be present, as it would be redundant with the core <source>. For example, a Spanish reference for Quechua might look like this:

<mtc:matches>
  <mtc:match reference="yes">
    <segment>
      <target xml:lang="es-es">hola mundo</target>
    </segment>
  </mtc:match>
</mtc:matches>

I'm not sure if any of these require an electronic ballot. I got the impression from the call that they don't, but hopefully Bryan or David or someone else from the call will correct that if false.

Please let me know how I can work with you on these.

Ryan

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Friday, November 16, 2012 5:02 PM
To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the wiki.

From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Thursday, November 15, 2012 5:24 PM
To: Yves Savourel
Cc: xliff@lists.oasis-open.org
Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

Yves, Ryan et al.,

Commenting inline.

Cheers
dF

On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel <ysavourel@enlaso.com> wrote:

Hi Ryan, all,

> Proposal 1: Add an optional build attribute to 2.0 <file> element in core.
> ..
> <file id="1" original="mainUI.resx" build="2011-11-23-133615307_windc.win8.beta.b01">

I don't see anything wrong with this.

> Proposal 2: Be able to specify optional custom values for match type
> attribute in the <mtc:matches> module.
> Content providers and Localization Suppliers base their cost and billing
> models on match similarity and match types. Localization suppliers charge
> us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we
> might even want to get more granular than that as our cost and billing models
> evolve with the business.
> In 2.0, the match type doesn't support the values exact-match and fuzzy-match,
> which were defined in the state-qualifier attribute in 1.2. Instead of supporting
> these two, or any others that may not have migrated from 1.2 to 2.0,
> as a separate attribute, the request is, that like the discussion on state
> and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type.
> This will allow us to add extra business logic to types, such as "tm" or "mt",
> which are already defined in the spec.
> <match id="1" similarity="100.0" type="tm/xlf:exact">
> <match id="1" similarity="75.0" type="tm/xlf:fuzzy">
> <match id="1" similarity="99.0" type="tm/custom:near-exact">

I understand the need for the information, but to me it seems the similarity tells you whether a match is exact or not. The example, however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example, in one project a near-match corresponds to one range and in another to a different range, and you want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case, I wonder if XLIFF should define any default like xlf:exact, etc.

I believe there is value in decoupling the "percentage" from the "business" type of the match.
The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I guess we should not open this particular can of worms.

So I wouldn't see a problem with a sub-type there. A side comment on the match type: especially if we allow sub-type, I'm still not sure about the values currently listed.

[Microsoft] We definitely advocate decoupling the "percentage" from the "business" type of match, as David puts it. And we should not prescribe meaning to the percentage, either. Costing models built on top of these values will necessarily change from one provider/supplier to the next and, as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):

<match id="1" similarity="100.0" type="tm/xlf:exact">
<match id="1" similarity="100.0" type="ice">

In the first case, we've recycled a candidate which is a 100% match, but came from a segment whose state isn't signed off or final yet, whereas the ice match, in our case, has the requirement of being 100% and signed off or final.

> Proposal 3: Add an optional Reference Language to core.
> This is a crucial feature for Microsoft and other large companies that localize
> minority languages. For example, it is typical that when we localize from
> English into Quechua, localizers are more efficient and provide much higher
> quality translation, when along with English source, we provide them with
> Spanish target. In 1.2, Reference Languages could be defined in
> an <alt-trans> element:

I see the use case and I've seen other cases like this, with Chinese (Simplified/Traditional). Could that be part of the match module? Possibly with a new attribute (e.g. reference="yes|no", defaulting to no). Adding something along with <source>/<target> is bound to cause additional PR issues. If it's part of the Match module, it just uses whatever the module PRs are.
I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria.

[Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for, as opposed to recycling.

> Proposal 4: Add an optional name attribute on <notes> in core
> and <mds:metadata> module.
> We believe it will be typical for content providers to want to
> ...
> <notes name="comments">
>  <note id="comment">This string cannot be longer than 100 characters</note>
>  <note id="user">Developer@microsoft.com</note>
>  <note id="date">10/21/2012 5:28:13 PM</note>
> </notes>

Sounds reasonable. We'll have to allow several <notes> and <mds:metadata> (I think, but I may be wrong, that only one is allowed) on the extension point. The example makes me wonder about the long-term life of XLIFF though: likely this type of info (author, timestamp) will be needed by others. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and timestamp? That would obviously work only if those two pieces of information are the only examples you have in mind.

I agree with Yves that a couple of standard attributes should be added to increase interoperability. Still, I believe that note should be fully extensible, as it is part of the general annotation mechanism and should be able to carry attributes from other namespaces.

[Microsoft] Capturing an author and timestamp on a comment is specific to our needs, and thus that example. However, we do see value in being able to apply an author and timestamp on potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.

> Proposal 5: Add optional change tracking attributes to <segment>.
> ...
> <segment id="1" modifiedBy="translator@loc.com"
>   modifiedDate="10/21/2012 5:28:13 PM">
>    <source>hello world</source>
>    <target>hola món</target>
> </segment>

Here again I'm wondering if a "change track" module may be better. You could use it not just on segments but on other elements, such as notes. The issue then would be how this gets updated if it's not a core component. Actually, if it's a core attribute, does it mean it's not optional? I'm not sure there is a way, even with a PR, to guarantee these data will be up to date. But maybe that's OK?

Optional attributes in core are tricky, IMHO. It means you do not need to introduce the attribute yourself if you do not want to. But if present, it would need to be processed by agents who modify the segment. If it is thinkable that change agents do not update it, it feels more like a module.

[Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a "change track" module, then user agents would need to preserve it if present. Any other processing requirements, such as updating it, could be specified as part of the module's processing requirements. For example: the module MUST be preserved and SHOULD be updated by user agents.

cheers,
-yves
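The "MUST be preserved and SHOULD be updated" behaviour discussed for change tracking can be illustrated with a small sketch. It is not taken from any XLIFF specification text: it assumes the attribute names from Proposal 5 (modifiedBy, modifiedDate), assumes an ISO 8601 timestamp rather than the locale-specific date in the example above, and the function name update_target is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical segment using the attribute names from Proposal 5.
# The ISO 8601 timestamp format is an assumption made for this sketch.
SEGMENT = """<segment id="1" modifiedBy="translator@loc.com" modifiedDate="2012-10-21T17:28:13Z">
  <source>hello world</source>
  <target>hola món</target>
</segment>"""


def update_target(segment, new_text, user, now=None):
    """Change the target text and refresh existing tracking attributes.

    Tracking attributes that are already present stay in the output
    (the "MUST be preserved" part) and are refreshed (the "SHOULD be
    updated" part). Segments without them are left without them.
    """
    now = now or datetime.now(timezone.utc)
    segment.find("target").text = new_text
    if "modifiedBy" in segment.attrib or "modifiedDate" in segment.attrib:
        segment.set("modifiedBy", user)
        segment.set("modifiedDate", now.strftime("%Y-%m-%dT%H:%M:%SZ"))
    return segment


seg = ET.fromstring(SEGMENT)
update_target(seg, "hola, món", "reviewer@loc.com",
              now=datetime(2012, 12, 11, 23, 17, tzinfo=timezone.utc))
print(ET.tostring(seg, encoding="unicode"))
```

A tool that does not understand the attributes would simply write them back unchanged, which is the preservation half of the requirement on its own.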


  • 7.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-11-2012 23:17
    Thanks David, then I think we (you, Fredrik, and I) are in agreement (adding priority is OK) and I support the extensibility of note as well. Given that, is there anyone on the list who disagrees with this proposal?
cheers, -yves --------------------------------------------------------------------- To unsubscribe, e-mail: xliff-unsubscribe@lists.oasis-open.org For additional commands, e-mail: xliff-help@lists.oasis-open.org        
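For readers following Proposal 2, the sub-type notation in the quoted examples (`tm/xlf:exact`, `tm/custom:near-exact`, plain `ice`) could be split into its parts roughly as follows. This is only a sketch in Python: the `base/prefix:name` grammar is inferred from the sample values in the thread, and `MatchType` and `parse_match_type` are hypothetical names, not anything defined by XLIFF.

```python
from typing import NamedTuple, Optional


class MatchType(NamedTuple):
    """Parts of a match type value such as 'tm/xlf:exact' (assumed grammar)."""
    base: str               # e.g. "tm", "mt", "ice"
    prefix: Optional[str]   # authority of the sub-type, e.g. "xlf" or "custom"
    name: Optional[str]     # business category, e.g. "exact", "near-exact"


def parse_match_type(value: str) -> MatchType:
    """Split 'base' or 'base/prefix:name' into its components.

    Raises ValueError when a sub-type is present but is not 'prefix:name'.
    """
    base, slash, qualifier = value.partition("/")
    if not base:
        raise ValueError(f"missing base type in {value!r}")
    if not slash:
        # No sub-type at all, e.g. "ice".
        return MatchType(base, None, None)
    prefix, colon, name = qualifier.partition(":")
    if not (colon and prefix and name):
        raise ValueError(f"sub-type must look like 'prefix:name' in {value!r}")
    return MatchType(base, prefix, name)


if __name__ == "__main__":
    for sample in ("tm/xlf:exact", "tm/custom:near-exact", "ice"):
        print(sample, "->", parse_match_type(sample))
```

The similarity score is deliberately left out of the parsed value. That mirrors the point made in the thread: a 100.0 match can be either `tm/xlf:exact` or `ice` depending on sign-off state, so the business category has to travel separately from the percentage.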


  • 8.  Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-12-2012 00:01
    +1 as proposed now Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone: +353-6120-2781 cellphone: +353-86-0222-158 facsimile: +353-6120-2734 mailto: david.filip@ul.ie On Tue, Dec 11, 2012 at 11:16 PM, Ryan King < ryanki@microsoft.com > wrote: Thanks David, then I think we (you, Fredrik, and I) are in agreement (adding priority is OK) and I support the extensibility of note as well. Given that, is there anyone on the list who disagrees with this proposal?   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Tuesday, December 11, 2012 2:07 PM To: Ryan King Cc: Estreen, Fredrik; Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Ryan, I support adding the core attributes as proposed, plus perhaps the priority [1-10] from Fredriks example But I also reiterate the request for note to be extensible. This seemed to have Fredrik's support on this thread. I think that note and inline markers should be extensible as part of the generic annotations design that will allow development of annotation modules, such as ITS mapping.. Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Tue, Dec 11, 2012 at 6:24 PM, Ryan King < ryanki@microsoft.com > wrote: Do we have consensus on this proposal? E.g. adding category, origin, and datetime (or timestamp) attributes to <note>?   Thanks, ryan   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Monday, December 3, 2012 11:59 AM To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   >> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good. 
Agreed. And as stated on another thread… we suggest the list of additional and optional attributes to be origin, category, datetime.   <notes>   <note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don’t localize Windows</note> </notes>   Thanks, ryan   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Friday, November 30, 2012 5:00 PM To: Estreen, Fredrik Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Thanks for outlining the options, Fredrik,   I would be personally OK with note being just extensible. The ITS categories would allow one to specify pretty much everything that you would need. First as extension, that should later turn into a module using the same mechanism. On the other hand having a minimum set for interoperability for ITS-unaware tools sounds good. And as Fredrik pointed out ITS note can be easily mapped on these, so not an issue from here. Even with the minimum set of core attributes, I still think it should be extensible.. to allow for unforeseen types of annotations..   The only danger is of creating unnecessary clutter if the adoption is minimal.. hard to say what the adoption will be..   Cheers dF     Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Ryan, David,   How it would look is dependent on if we add one or more standard attributes to the <note> element or rely solely on third party extensions. First, an example of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.   
<notes>   <note id=”n1” ms:noteOrigin=”developer” ms:notePriority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>   <note id=”n2” its:locNoteType=”alert”>Make sure to adapt date format when localizing</note> </notes>   It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date: * origin / author – Indicate source of the note * priority – indicate relative importance of a note. Must have strict simple definition. Integer lower is more important than higher for example. * type / category – indicate what type / aspect of the data or process the note applies to or annotates. * date – creation or modification date. Which of these it is should be specified.   The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that it adds complexity to the standard which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that in this area.   If we hypothetically assume we add origin and priority to the core the above example could look like the bellow. Assuming the same mapping for ITS is used as the one proposed for mapping to XLIFF 1.2 (‘alert’=>1, ‘description’=> 2+) and stored in “priority”.   
<notes>   <note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>   <note id="n2" priority="1">Make sure to adapt date format when localizing</note> </notes>   Regarding the naming of potential core / module attributes I would prefer to use “category” instead of “type” as the former does not convey the level of functional meaning that the latter does for me. It is more ‘just metadata’.   Regards, Fredrik Estreen   From: Ryan King [mailto: ryanki@microsoft.com ] Sent: 28 November 2012 21:55 To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   David or Fredrik, can you give us an XLIFF example of how that would look?   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Wednesday, November 28, 2012 12:50 PM To: Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Fredrik, all, same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way. Annotations are perfect extension points in general, and besides we need the extensibility here for the ITS mapping.   Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Rodolfo, Ryan,   I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document, where each <note> is one annotation / comment in itself. 
The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.   I think the initial model is much easier to work with and more clean as it contain all note related information in one sub tree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. If we should have more standard attributes or if a processor is free to use the third party namespace extension mechanism to add them is another question. Depending on how simple we want to keep the basic notes feature it could be either or a mix of the two methods.   Although I’m not a fan of the third party extensions I think this is a case where they could make sense. And if used for process specific metadata only I don’t see an issue. Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.   Regards, Fredrik Estreen   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Rodolfo M. Raya Sent: den 28 november 2012 09:32 To: xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Still a bad use case that doesn’t justify ruining a good design.   Regards, Rodolfo -- Rodolfo M. Raya       rmraya@maxprograms.com Maxprograms       http://www.maxprograms.com   From: Ryan King [ mailto:ryanki@microsoft.com ] Sent: Wednesday, November 28, 2012 5:32 AM To: Rodolfo M. 
Raya; < xliff@lists.oasis-open.org >; Yves Savourel Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   So that our original reason for proposing having more than one <notes> at the extension point does not get obfuscated in all of the replies and “see inlines”, here once again is the use case for adding more than one <notes> per extension:   Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module. We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together, such as in an editor UI. Here are some examples:   <notes name="comments">   <note id="comment">This string cannot be longer than 100 characters</note>   <note id="origin">developer</note>   <note id="priority">1</note> </notes>   <notes name="instructions">   <note id="instruction">Do not localize the product name</note>   <note id="origin">loc-engineer</note>   <note id="priority">2</note> </notes>   As opposed to something less structured and more difficult to process:   <notes>   <note id="instruction">Do not localize the product name</note>   <note id="instruction-origin">loc-engineer</note>   <note id="instructions-priority">1</note>   <note id="comment">This string cannot be longer than 100 characters</note>   <note id="comment-priority">2</note> </notes>   Thanks, Ryan   From: Rodolfo M. Raya [ mailto:rmraya@maxprograms.com ] Sent: Tuesday, November 27, 2012 5:41 PM To: Ryan King Cc: Yves Savourel; < xliff@lists.oasis-open.org > Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Please don't ruin the design for <notes>. Only one should be allowed per insertion point.   
Regards, Rodolfo Sent from my iPad On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com > wrote: Hi Yves, in last week’s TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren’t deemed as features. I believe you are the owner for the matches module and notes. Can you please let me know what we need to do to move forward with getting these implemented?   * Proposal 2: Be able to specify optional custom values for match type in <mtc:matches> * Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)   Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? Since the module is already defined in the 2.0 spec, can I just suggest the method and if you agree, you can fold it into the current module definition? I would propose:   1. That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data. 2. Add an optional attribute reference (yes or no) with no as the default. Additionally, the processing requirement (PR) for a “reference match” would be to allow an xml:lang on the target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. Spanish reference for Quechua might look like this:   <mtc:matches>   <mtc:match reference="yes">    <segment>     <target xml:lang="es-es">hola mundo</target>    </segment>   </mtc:match> </mtc:matches>   I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that if false.   Please let me know how I can work with you on these. 
Ryan   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Friday, November 16, 2012 5:02 PM To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the wiki.
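To make the outcome of this message concrete, here is a small Python sketch that builds a `<notes>` container whose `<note>` children carry the attributes under discussion (category, origin, datetime, plus the optional priority where a lower integer is more important). The attribute names are the ones proposed in the thread and may differ from what the final specification adopts. The helpers `build_notes` and `by_priority`, and the default of 10 for a missing priority, are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Attribute names as proposed in the thread; not confirmed against the final spec.
NOTE_ATTRIBUTES = ("id", "category", "origin", "datetime", "priority")


def build_notes(entries):
    """Build one <notes> container with one <note> child per entry.

    Each entry is a dict with a required "text" key and any of NOTE_ATTRIBUTES.
    """
    notes = ET.Element("notes")
    for entry in entries:
        note = ET.SubElement(notes, "note")
        for name in NOTE_ATTRIBUTES:
            if name in entry:
                note.set(name, str(entry[name]))
        note.text = entry["text"]
    return notes


def by_priority(notes, default=10):
    """Return the <note> children ordered by priority, lowest number first.

    Notes without a priority are treated as least important (assumed default).
    """
    return sorted(
        notes.findall("note"),
        key=lambda note: int(note.get("priority", default)),
    )


if __name__ == "__main__":
    notes = build_notes([
        {"id": "n1", "category": "comment", "origin": "developer", "priority": 2,
         "text": "This string cannot be longer than 100 characters"},
        {"id": "n2", "category": "instruction", "origin": "developer",
         "datetime": "2012-11-30T07:43:05Z", "priority": 1,
         "text": "Don't localize Windows"},
    ])
    print(ET.tostring(notes, encoding="unicode"))
    for note in by_priority(notes):
        print(note.get("priority"), note.text)
```

Each `<note>` stays a self-contained annotation, which is the design Fredrik and Rodolfo argued for: the metadata lives in attributes on the note itself rather than in sibling `<note>` elements grouped under a named `<notes>`.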


  • 9.  Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-16-2012 13:47
    Hi all, as there were no further comments, can we assume consensus and change the spec along the lines discussed in this thread? The summary is that note remains structurally the same, but receives new core attributes as discussed, and will also allow for attributes from any namespaces. It won't be extensible by foreign child elements. Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone: +353-6120-2781 cellphone: +353-86-0222-158 facsimile: +353-6120-2734 mailto: david.filip@ul.ie On Wed, Dec 12, 2012 at 12:00 AM, Dr. David Filip < David.Filip@ul.ie > wrote: +1 as proposed now Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Tue, Dec 11, 2012 at 11:16 PM, Ryan King < ryanki@microsoft.com > wrote: Thanks David, then I think we (you, Fredrik, and I) are in agreement (adding priority is OK) and I support the extensibility of note as well. Given that, is there anyone on the list who disagrees with this proposal?   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Tuesday, December 11, 2012 2:07 PM To: Ryan King Cc: Estreen, Fredrik; Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Ryan, I support adding the core attributes as proposed, plus perhaps the priority [1-10] from Fredriks example But I also reiterate the request for note to be extensible. This seemed to have Fredrik's support on this thread. I think that note and inline markers should be extensible as part of the generic annotations design that will allow development of annotation modules, such as ITS mapping.. Cheers dF Dr. 
David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Tue, Dec 11, 2012 at 6:24 PM, Ryan King < ryanki@microsoft.com > wrote: Do we have consensus on this proposal? E.g. adding category, origin, and datetime (or timestamp) attributes to <note>?   Thanks, ryan   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Monday, December 3, 2012 11:59 AM To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   >> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good. Agreed. And as stated on another thread…w e suggest the list of additional and optional attributes to be origin, category, datetime.   <notes>   <note category=”instruction” origin=”developer” datetime=”2012-11-30T07:43:05Z”>Don’t localize Windows</note> </note>   Thanks, ryan .   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Friday, November 30, 2012 5:00 PM To: Estreen, Fredrik Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Thanks for outlining the options, Fredrik,   I would be personally OK with note being just extensible. The ITS categories would allow to specify pretty much everything that you would need. First as extension, that should later turn into a module using the same mechanism. On the other hand having a miniumum set for interoprability for ITS unaware tools sounds good. And as Fredrik pointed out ITS note can be easily mapped on these, so not an issue from here. Even with the minimum set of core attributes, I still think it should be extensible.. to allow for unforeseen types of annotations..   
The only danger is of creating unnecessary clutter if the adoption is minimal.. hard to say what the adoption will be..   Cheers dF     Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Ryan, David,   How it would look is dependent on if we add one or more standard attributes to the <note> element or rely solely on third party extensions. First an examples of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.   <notes>   <note id=”n1” ms:noteOrigin=”developer” ms:notePriority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>   <note id=”n2” its:locNoteType=”alert”>Make sure to adapt date format when localizing</note> </notes>   It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date: * origin / author – Indicate source of the note * priority – indicate relative importance of a note. Must have strict simple definition. Integer lower is more important than higher for example. * type / category – indicate what type / aspect of the data or process the note applies to or annotates. * date – creation or modification date. Which of these it is should be specified.   The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that it adds complexity to the standard which is against one of the goals of the 2.0 work. 
One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that in this area.   If we hypothetically assume we add origin and priority to the core the above example could look like the bellow. Assuming the same mapping for ITS is used as the one proposed for mapping to XLIFF 1.2 (‘alert’=>1, ‘description’=> 2+) and stored in “priority”.   <notes>   <note id=”n1” author=”developer” priority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>   <note id=”n2” priority=”1” >Make sure to adapt date format when localizing</note> </notes>   Regarding the naming of potential core / module attributes I would prefer to use “category” instead of “type” as the former does not convey the level of functional meaning that the later does for me. It is more ‘just metadata’.   Regards, Fredrik Estreen   From: Ryan King [mailto: ryanki@microsoft.com ] Sent: den 28 november 2012 21:55 To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   David or Frederick, can you give us an XLIFF example of how that would look?   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Wednesday, November 28, 2012 12:50 PM To: Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Fredrik, all, same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way. 
Annotations are perfect extension points in general, and besides we need the extensibility here for the its mapping.   Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Rodolfo, Ryan,   I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document. Where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.   I think the initial model is much easier to work with and more clean as it contain all note related information in one sub tree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. If we should have more standard attributes or if a processor is free to use the third party namespace extension mechanism to add them is another question. Depending on how simple we want to keep the basic notes feature it could be either or a mix of the two methods.   Although I’m not a fan of the third party extensions I think this is a case where they could make sense. And if used for process specific metadata only I don’t see an issue. Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.   Regards, Fredrik Estreen   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Rodolfo M. 
Raya Sent: den 28 november 2012 09:32 To: xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Still a bad use case that doesn’t justify ruining a good design.   Regards, Rodolfo -- Rodolfo M. Raya       rmraya@maxprograms.com Maxprograms       http://www.maxprograms.com   From: Ryan King [ mailto:ryanki@microsoft.com ] Sent: Wednesday, November 28, 2012 5:32 AM To: Rodolfo M. Raya; < xliff@lists.oasis-open.org >; Yves Savourel Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   So that our original reason for proposing having more than one <notes> at the extension point does not get obfuscated in all of the replies and “see inlines”, here once again, is the use case for adding more than one <notes> per extension:   Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module. We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together, such as in an editor UI. Here are some examples:   <notes name="comments" >   <note id=“comment">This string cannot be longer than 100 characters</note>   <note id=“origin">developer</note>   <note id=”priority”>1</note> </notes>   <notes name="instructions" >   <note id=“instruction">Do not localize the product name</note>   <note id=“origin">loc-engineer</note>   <note id=”priority”>2</note> </notes>   As opposed to something less structured and more difficult to process:   <notes>   <note id=“instruction">Do not localize the product name</note>   <note id=“instruction-origin">loc-engineer</note>   <note id=”instructions-priority”>1</note>   <note id=“comment">This string cannot be longer than 100 characters</note>   <note id=”comment-priority”>2</note> </notes>   Thanks, Ryan   From: Rodolfo M. 
Raya [ mailto:rmraya@maxprograms.com ] Sent: Tuesday, November 27, 2012 5:41 PM To: Ryan King Cc: Yves Savourel; < xliff@lists.oasis-open.org > Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Please don't ruin te design for <notes>. Only one should be allowed per insertion point.   Regards, Rodolfo Sent from my iPad On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com > wrote: Hi Yves, in last week’s TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren’t deemed as features. I believe you are the owner for the matches module and notes. Can you please let me know what we need to do to move forward with getting these implemented?   ·          Proposal 2: Be able to specify optional custom values for match type in <mtc:matches> ·          Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)   Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? Since the module is already defined in the 2.0 spec, can I just suggest the method and if you agree, you can fold it into the current module definition? I would propose:   1.       That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data. 2.       Add an optional attribute reference=”yes no” with no as default. Additionally, PR for a “reference match” would be to allow an xml:lang on the target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. 
Spanish reference for Quechua might look like this:   <mtc:matches>   <mtc:match reference=”yes”>    <segment>     <target xml:lang=”es-es”>hola mundo</target>    </segment>   </mtc:match> </match>       I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that if false.   Please let me know how I can work with you on these. Ryan   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Friday, November 16, 2012 5:02 PM To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the wiki.   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Thursday, November 15, 2012 5:24 PM To: Yves Savourel Cc: xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Yves, Ryan et. al.   Commenting inline.. Cheers dF On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel < ysavourel@enlaso.com > wrote: Hi Ryan, all, > Proposal 1: Add an optional build attribute to 2.0 <file> element in core. > .. > <file id=”1” original=”mainUI.resx” build="2011-11-23-133615307_windc.win8.beta.b01"> I don't see anything wrong with this.   > Proposal 2: Be able to specify optional custom values for match type > attribute in the <mtc:matches> module. > Content providers and Localization Suppliers base their cost and billing > models on match similarity and match types. Localization suppliers charge > us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we > might even want to get more granular than that as our cost and billing models > evolve with the business. 
> In 2.0, the match type doesn’t support the values exact-match and fuzzy-match, > which were defined in the state-qualifier attribute in 1.2. Instead of supporting > these two, or any others that may not have migrated from 1.2 to 2.0, > as a separate attribute, the request is, that like the discussion on state > and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type. > This will allow us to add extra business logic to types, such as "tm" or "mt", > which are already defined in the spec. > <match id=”1” similarity=”100.0” type=”tm/xlf:exact”> > <match id=”1” similarity=”75.0” type=”tm/xlf:fuzzy”> > <match id=”1” similarity=”99.0” type=”tm/custom:near-exact”> I understand the need for the information, but to me, it seems the similarity give you whether a match is exact or not. The example however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example in one project a near-match corresponds to one range and in another to a different range, and you want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case I wonder if XLIFF should define any default like xlf:exact, etc. I believe there is value in decoupling the "percentage" from the "business" type of the match. The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I i guess we should not open this particular can of worms..   So I wouldn't see a problem with a sub-type there. A side comment on the match type: especially, if we allow sub-type, I'm still not sure about the values currently listed.   [Microsoft] we definitely advocate decoupling the “percentage” from the “business” type of match as David puts it. And we should not prescribe meaning to the percentage, either. 
Costing models built on top of these values will necessarily change from one provider/supplier to the next and as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):   <match id=”1” similarity=”100.0” type=”tm/xlf:exact”>   <match id=”1” similarity=”100.0” type=”ice”> In the first case, we’ve recycled a candidate which is 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ice match, in our case, has the requirement of being 100% and signed off or final. > Proposal 3: Add an optional Reference Language to core. > This is a crucial feature for Microsoft and other large companies that localize > minority languages. For example, it is typical that when we localize from > English into Quechua, localizers are more efficient and provide much higher > quality translation, when along with English source, we provide them with > Spanish target. In 1.2, Reference Languages could be defined in > an <alt-trans> element: I see the use case and I've seen other cases like this, with Chinese (simplified/Traditional). Could that be part of the match module? Possibly with a new attribute (e.g. reference='yes no' defaulting to no) Adding something along with <source>/<target> is bound to cause additional PR issues. If it's part of the Match module, it just uses whatever the module PRs are.   I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria   [Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for as opposed to recycling. > Proposal 4: Add an optional name attribute on <notes> in core > and <mds:metadata> module. > We believe it will be typical for content providers to want to > ... 
> <notes name="comments"> >  <note id=“comment">This string cannot be longer than 100 characters</note> >  <note id=“user"> Developer@microsoft.com </note> >  <note id=“date">10/21/2012 5:28:13 PM</note> > </notes> Sounds reasonable. We'll have to allow several <notes> and <m:metadadat> (I think (but I may be wrong) only one is allowed)) on the extension point. The example makes me wonder about the long term life of XLIFF though: likely this type of info (author, timestamp) will be needed by other. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and time stamp? That would obviously work only if those two info are the only example you have in mind.   I agree with Yves that a couple of standard attributes should be added to increase interoperability, still I believe that note should be fully extendable, as it is part of the general annotation mechanism and should be able to carry attributes from other namespaces.   [Microsoft] Capturing an author and timestamp on a comment is specific to our needs and thus that example. However, we do see value in being able to apply an author and timestamp on potentially any piece of data. So a module (as Yves suggests below) that can exists at the same extension points as metadata (and including metadata) might lend itself better to that.   > Proposal 5: Add optional change tracking attributes to <segment>. > ... > <segment id=”1” modifiedBy=” translator@loc.com ” > modifiedDate=”10/21/2012 5:28:13 PM”> >    <source>hello world</source> >    <target>hola món</target> > </segment> Here again I'm wondering if a "change track" module may be better? You could use it not just on segments but other elements: notes. The issue then would be how this gets updated if it's not a core component? Actually if it's a core attribute, does it means it's not optional? I'm not sure there is a way, even with a PR, to guarantee these data will be up-to-date. But maybe that's ok?   
Optional attributes in core are tricky, IMHO It means you do not need to introduce it yourself, if you do not feel so.. But if present it would need to be processed by agents who modify the segment. If it is thinkable that change agents do not update it, it feels more like a module...   [Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a “change track” module, then user agents would need to preserve it if present, but as for any other processing requirements, such as updating it, that could be specified as part of the module’s processing requirements. For example: The module MUST be preserved and SHOULD be updated by user agents. cheers, -yves --------------------------------------------------------------------- To unsubscribe, e-mail: xliff-unsubscribe@lists.oasis-open.org For additional commands, e-mail: xliff-help@lists.oasis-open.org        


  • 10.  Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-17-2012 16:40
    Sorry I missed this discussion quite a
    bit. Remind me again why we are putting in "non-translatable"
    or "non-localizable" information in XLIFF file for translators
    in your examples?

    Technically speaking, in Ryan's example the
    id is missing. Is that intentional or a mistake? Using a date string is
    also not advisable. Before the XLIFF file is transmitted, a system
    (any system) should be using a normalized (UTC) Date object (e.g. the Date
    object in JavaScript) to pass the information around, not a date string.
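
As a rough illustration of the normalization suggested here, the sketch below turns a locale-style date string like the one in the earlier example into a UTC timestamp. It is written in Python rather than JavaScript. The function name `normalize_timestamp` and the `utc_offset_hours` parameter are assumptions made for illustration, since the thread does not say which time zone the original string was recorded in.

```python
from datetime import datetime, timedelta, timezone


def normalize_timestamp(text, utc_offset_hours=0.0):
    """Convert a 'MM/DD/YYYY h:mm:ss AM/PM' string to an ISO 8601 UTC string.

    utc_offset_hours is a hypothetical parameter: the caller has to know
    the offset of the system that produced the string, because the string
    itself carries no time zone information.
    """
    # Parse the naive local time (12-hour clock with AM/PM marker).
    local = datetime.strptime(text, "%m/%d/%Y %I:%M:%S %p")
    # Attach the assumed offset, then convert to UTC.
    aware = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return aware.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(normalize_timestamp("10/21/2012 5:28:13 PM", utc_offset_hours=-7))
```

The point of the sketch is only that the ambiguity (which time zone, which date order) is resolved once, by the producing system, before anything is written to the file.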


    Using XLIFF <notes> to convey
    "instructions" (in the example) to me is just asking for trouble.
    In our environment, there is really no comprehensive way to cover a set
    of "build" instructions in any XLIFF files that would describe
    exactly how an "input" or "final output" can be built
    with a set of XLIFF files that was given. We do most of the interchange
    between systems as much as we can outside the XLIFF file and leave the
    complexity as outlined in the examples below to a separate metadata interchange
    process that encompasses not just the files being exchanged but potentially
    the files that are yet to come or have already arrived some time ago.

    We do and can take advantage of the
    <notes> feature in XLIFF. However, that is limited to only information
    exchange specifically about that particular translation "transaction".
    A full job can contain many more transactions. I would not recommend instructions
    to be passed along in the XLIFF file because that would make the XLIFF
    interchange unnecessarily complex.






    From: "Dr. David Filip" <David.Filip@ul.ie>
    To: "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>
    Cc: Ryan King <ryanki@microsoft.com>, "Estreen, Fredrik" <Fredrik.Estreen@lionbridge.com>
    Date: 12/16/2012 08:47 AM
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
    Sent by: <xliff@lists.oasis-open.org>




    Hi all, as there were no further comments, can we assume
    consensus and change the spec along the lines discussed in this thread?
    The summary is that note remains structurally the
    same, but receives new core attributes as discussed, and will also allow
    for attributes from any namespaces. It won't be extensible by foreign child
    elements.
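
To make the summarized rule concrete, here is a minimal sketch of a check a tool might run on a single `<note>`: core attributes are accepted, attributes from any namespace are accepted, and child elements are rejected. The set of core attribute names is an assumption collected from the proposals in this thread (id, category, origin, datetime, priority), and `check_note` is a hypothetical helper, not part of any XLIFF library.

```python
import xml.etree.ElementTree as ET

# Assumed core attribute names, taken from the proposals in this thread;
# the final specification may name or restrict them differently.
CORE_NOTE_ATTRIBUTES = {"id", "category", "origin", "datetime", "priority"}


def check_note(note):
    """Return a list of problems found on one <note> element."""
    problems = []
    for name in note.attrib:
        # ElementTree spells namespaced attributes as "{uri}local".
        if name.startswith("{"):
            continue  # attributes from any namespace are allowed
        if name not in CORE_NOTE_ATTRIBUTES:
            problems.append("unknown attribute without namespace: " + name)
    for child in note:
        # No child elements at all, so no foreign child elements either.
        problems.append("child element not allowed: " + child.tag)
    return problems


SAMPLE = """<notes xmlns:its="http://www.w3.org/2005/11/its">
  <note id="n1" category="instruction" origin="developer" priority="1">Don't localize Windows</note>
  <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
  <note id="n3" colour="red"><b>bold</b></note>
</notes>"""

for note in ET.fromstring(SAMPLE):
    print(note.get("id"), check_note(note))
```

The third note in the sample is deliberately wrong in two ways (an unknown plain attribute and a child element) to show what the check reports.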

    Cheers
    dF

    Dr. David Filip
    =======================
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland
    telephone: +353-6120-2781
    cellphone: +353-86-0222-158
    facsimile: +353-6120-2734
    mailto: david.filip@ul.ie



    On Wed, Dec 12, 2012 at 12:00 AM, Dr. David Filip < David.Filip@ul.ie >
    wrote:
    +1 as proposed now




    On Tue, Dec 11, 2012 at 11:16 PM, Ryan King < ryanki@microsoft.com >
    wrote:
    Thanks David, then I think
    we (you, Fredrik, and I) are in agreement (adding priority is OK) and I
    support the extensibility of note as well. Given that, is there anyone
    on the list who disagrees with this proposal?
     
    From: xliff@lists.oasis-open.org
    [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Tuesday, December 11, 2012 2:07 PM
    To: Ryan King
    Cc: Estreen, Fredrik; Rodolfo M. Raya; xliff@lists.oasis-open.org

    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Ryan, I support adding the core attributes as proposed,
    plus perhaps the priority [1-10] from Fredrik's example.
    But I also reiterate the request for note to be extensible.
    This seemed to have Fredrik's support on this thread.
    I think that note and inline markers should be extensible
    as part of the generic annotations design that will allow development of
    annotation modules, such as ITS mapping..
    Cheers
    dF



    On Tue, Dec 11, 2012 at 6:24 PM, Ryan King < ryanki@microsoft.com >
    wrote:
    Do we have consensus on this
    proposal? E.g. adding category, origin, and datetime (or timestamp) attributes
    to <note>?
     
    Thanks,
    ryan
     
    From: xliff@lists.oasis-open.org
    [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Ryan King
    Sent: Monday, December 3, 2012 11:59 AM

    To: Dr. David Filip; Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
    (notes)
     
    >> On the other
    hand having a minimum set for interoperability for ITS unaware tools sounds
    good.
    Agreed. And as stated on another thread… we
    suggest the list of additional and optional attributes to be origin, category,
    datetime.
     
    <notes>
      <note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don’t localize Windows</note>
    </notes>
     
    Thanks,
    ryan
     
    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Friday, November 30, 2012 5:00 PM
    To: Estreen, Fredrik
    Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Thanks for outlining the options, Fredrik,
     
    I would personally be OK with note being just extensible.
    The ITS categories would allow specifying pretty much everything
    that you would need, first as an extension that should later turn into a
    module using the same mechanism.
    On the other hand, having a minimum set for interoperability
    for ITS-unaware tools sounds good. And as Fredrik pointed out, ITS notes
    can be easily mapped onto these, so not an issue from here.
    Even with the minimum set of core attributes, I still think
    it should be extensible, to allow for unforeseen types of annotations.
     
    The only danger is of creating unnecessary clutter if the
    adoption is minimal.. hard to say what the adoption will be..
     
    Cheers
    dF
     
     



    On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com >
    wrote:
    Hi Ryan, David,
     
    How it would look depends
    on whether we add one or more standard attributes to the <note> element
    or rely solely on third party extensions. First, here are examples of one of the
    notes in your original sample and one showing a potential use of David’s
    ITS mapping case.
     
    <notes>
      <note id="n1" ms:noteOrigin="developer" ms:notePriority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
      <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
    </notes>
     
    It could be argued that there
    is a set of very common metadata associated with notes and that we should
    provide standard attributes in these cases. I’m not sure exactly which,
    if any, we should have but the ones I can immediately think of are the
    kind of information in the above sample plus a date:
    * origin / author – Indicate
    source of the note
    * priority – indicate relative
    importance of a note. Must have strict simple definition. Integer lower
    is more important than higher for example.
    * type / category – indicate
    what type / aspect of the data or process the note applies to or annotates.
    * date – creation or modification
    date. Which of these it is should be specified.
     
    The good thing about using
    standard attributes instead of extensions for common properties is of course
    better interoperability for the data contained. The negative side is that
    it adds complexity to the standard which is against one of the goals of
    the 2.0 work. One part of that is the attempt to reduce the number of seldom
    or never used constructs to get a leaner core model. A solution that has
    been discussed before is to have a more complex comment / annotation module
    in addition to or extending the core feature. This way we get the same
    complexity in the core as we would with just third party extensions but
    with the added value of a fully interoperable path for those that want
    that in this area.
     
    If we hypothetically assume
    we add origin and priority to the core, the above example could look like
    the one below, assuming the same mapping for ITS is used as the one proposed
    for mapping to XLIFF 1.2 (‘alert’=>1, ‘description’=> 2+) and
    stored in “priority”.
     
    <notes>
      <note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
      <note id="n2" priority="1">Make sure to adapt date format when localizing</note>
    </notes>
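
The ITS-to-priority mapping mentioned above (‘alert’ to 1, ‘description’ to 2 or higher) can be sketched as a small function. This is only an illustration under assumptions: the function name and the `description_priority` parameter are hypothetical, and the thread does not fix which value at or above 2 a ‘description’ note should receive.

```python
def priority_from_loc_note_type(loc_note_type, description_priority=2):
    """Map an ITS locNoteType value to a note priority.

    'alert' maps to 1 (most important). 'description' maps to a value of
    2 or more; which one is left to the caller via the hypothetical
    description_priority parameter.
    """
    if loc_note_type == "alert":
        return 1
    if loc_note_type == "description":
        if description_priority < 2:
            raise ValueError("description notes need a priority of 2 or more")
        return description_priority
    raise ValueError("unexpected locNoteType: " + repr(loc_note_type))


print(priority_from_loc_note_type("alert"))
print(priority_from_loc_note_type("description", description_priority=5))
```

Because lower numbers mean higher importance, an ITS-unaware tool can sort notes by the resulting priority without knowing anything about ITS.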
     
    Regarding the naming of potential
    core / module attributes I would prefer to use “category” instead of
    “type” as the former does not convey the level of functional meaning
    that the latter does for me. It is more ‘just metadata’.
     
    Regards,
    Fredrik Estreen
     
    From: Ryan King [mailto: ryanki@microsoft.com ]

    Sent: den 28 november 2012 21:55
    To: Dr. David Filip; Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    David or Fredrik, can you
    give us an XLIFF example of how that would look?
     
    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Wednesday, November 28, 2012 12:50 PM
    To: Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Fredrik, all, same as Fredrik, I think that extensibility
    makes sense here. I agree that the grouping mechanism in the style of mda
    is not appropriate here and would change the semantics in an undesired
    way.
    Annotations are perfect extension points in general, and
    besides we need the extensibility here for the ITS mapping.
     
    Cheers
    dF



    On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com >
    wrote:
    Hi Rodolfo, Ryan,
     
    I think the intent of the
    <notes> is lost with the current proposal. The feature is designed
    so that <notes> is a container for a group of <note>s at a
    specific level in the document, where each <note> is one annotation
    / comment in itself. The suggested change transforms that so that the <notes>
    element becomes the entity describing one note, with <note> describing
    specific pieces of metadata related to that note. The ID is intended to
    be used to refer to the note from other places such as from <mrk>
    elements in the inline content, so overloading it to be the type of data
    would cause additional problems.
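
One way to see the problem with overloading the ID: if ids name the kind of data, the same id shows up once per group, so a reference to it is no longer unambiguous. The sketch below counts repeated note ids in the grouped example from the proposal being discussed. The wrapping `<unit>` element and the helper name `duplicate_note_ids` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_note_ids(parent):
    """Return the note ids that occur more than once below parent."""
    counts = Counter(
        note.get("id")
        for note in parent.iter("note")
        if note.get("id") is not None
    )
    return sorted(note_id for note_id, n in counts.items() if n > 1)


# The two <notes> groups from the proposal, placed in a hypothetical
# <unit> wrapper so that they share one insertion point.
GROUPED = """<unit>
  <notes name="comments">
    <note id="comment">This string cannot be longer than 100 characters</note>
    <note id="origin">developer</note>
    <note id="priority">1</note>
  </notes>
  <notes name="instructions">
    <note id="instruction">Do not localize the product name</note>
    <note id="origin">loc-engineer</note>
    <note id="priority">2</note>
  </notes>
</unit>"""

print(duplicate_note_ids(ET.fromstring(GROUPED)))
```

With ids used as identifiers and the metadata carried in attributes instead, each note keeps a unique id that a reference from inline content can target.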
     
    I think the initial model
    is much easier to work with and cleaner, as it contains all note related
    information in one sub tree per document level where notes are allowed.
    Adding attributes to the <note> element is in my opinion the best
    way to go. If we should have more standard attributes or if a processor
    is free to use the third party namespace extension mechanism to add them
    is another question. Depending on how simple we want to keep the basic
    notes feature it could be either or a mix of the two methods.
     
    Although I’m not a fan of
    the third party extensions I think this is a case where they could make
    sense. And if used for process specific metadata only I don’t see an issue.
    Of course there will be no standard way to display them in a UI or report
    if they are not specified in the standard.
     
    Regards,
    Fredrik Estreen
     
    From: xliff@lists.oasis-open.org
    [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Rodolfo M. Raya
    Sent: den 28 november 2012 09:32
    To: xliff@lists.oasis-open.org

    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Still a bad use case that
    doesn’t justify ruining a good design.
     
    Regards,
    Rodolfo
    --
    Rodolfo M. Raya       rmraya@maxprograms.com
    Maxprograms       http://www.maxprograms.com
     
    From: Ryan King [ mailto:ryanki@microsoft.com ]

    Sent: Wednesday, November 28, 2012 5:32 AM
    To: Rodolfo M. Raya; < xliff@lists.oasis-open.org >;
    Yves Savourel
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    So that our original reason
    for proposing having more than one <notes> at the extension point
    does not get obfuscated in all of the replies and “see inlines”, here
    once again, is the use case for adding more than one <notes> per
    extension:
     
    Proposal 4: Add an optional name attribute on <notes>
    in core and <mds:metadata> module.
    We believe it will be typical for content providers to
    want to group their notes or metadata in meaningful ways. This might be
    done so that a certain number of notes or bits of metadata can be processed
    in the same way, or simply grouped and displayed together, such as in an
    editor UI. Here are some examples:
     
    <notes name="comments">
      <note id="comment">This string cannot be longer than 100 characters</note>
      <note id="origin">developer</note>
      <note id="priority">1</note>
    </notes>
     
    <notes name="instructions">
      <note id="instruction">Do not localize the product name</note>
      <note id="origin">loc-engineer</note>
      <note id="priority">2</note>
    </notes>
     
    As opposed to something less structured and more difficult
    to process:
     
    <notes>
      <note id="instruction">Do not localize the product name</note>
      <note id="instruction-origin">loc-engineer</note>
      <note id="instructions-priority">1</note>
      <note id="comment">This string cannot be longer than 100 characters</note>
      <note id="comment-priority">2</note>
    </notes>
     
    Thanks,
    Ryan
     
    From: Rodolfo M. Raya [ mailto:rmraya@maxprograms.com ]

    Sent: Tuesday, November 27, 2012 5:41 PM
    To: Ryan King
    Cc: Yves Savourel; < xliff@lists.oasis-open.org >
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Please don't ruin the design for <notes>. Only one
    should be allowed per insertion point.
     
    Regards,
    Rodolfo

    Sent from my iPad

    On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com >
    wrote:
    Hi Yves, in last week’s TC call it was
    mentioned that I should work with the owners of the current features to
    get our requirements implemented for proposals that weren’t deemed as
    features. I believe you are the owner for the matches module and notes.
    Can you please let me know what we need to do to move forward with getting
    these implemented?
     
    * Proposal 2: Be able to specify optional
    custom values for match type in <mtc:matches>
    * Proposal 4: Add an optional name attribute
    on <notes> in core (which also means that we need to allow zero,
    one or more <notes> in each position in the tree structure)
     
    Additionally, it was deemed that we should
    add Reference Language to the <mtc:matches> module. How do you want
    to move forward with that? Since the module is already defined in the 2.0
    spec, can I just suggest the method and if you agree, you can fold it into
    the current module definition? I would propose:
     
    1. That we allow zero, one or more <mtc:matches> at each extension
    point, because you might have both recycling and reference language data.
    2. Add an optional attribute reference=”yes no” with no as default. Additionally,
    PR for a “reference match” would be to allow an xml:lang on the target
    different from the document and allow the <source> not to be present
    as it would be redundant information with the core <source>, e.g.
    Spanish reference for Quechua might look like this:
     
    <mtc:matches>
      <mtc:match reference="yes">
       <segment>
        <target xml:lang="es-es">hola mundo</target>
       </segment>
      </mtc:match>
    </mtc:matches>
     
    I’m not sure if any of these require an
    electronic ballot. I got the impression from the call that they don’t,
    but hopefully Bryan or David or someone else from the call will correct
    that if false.
     
    Please let me know how I can work with you
    on these.
    Ryan
     
    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Ryan King
    Sent: Friday, November 16, 2012 5:02 PM
    To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Thanks Yves and David for
    the valuable feedback. See our comments inline below prefixed with [Microsoft].
    As David suggested on another thread, we will add these soon to the wiki.
     
    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Thursday, November 15, 2012 5:24 PM
    To: Yves Savourel
    Cc: xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Yves, Ryan et al.
     
    Commenting inline..
    Cheers
    dF
    On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel < ysavourel@enlaso.com >
    wrote:
    Hi Ryan, all,


    > Proposal 1: Add an optional build attribute to 2.0 <file> element in core.
    > ..
    > <file id="1" original="mainUI.resx" build="2011-11-23-133615307_windc.win8.beta.b01">
    I don't see anything wrong with this.
     

    > Proposal 2: Be able to specify optional custom values for match type
    > attribute in the <mtc:matches> module.
    > Content providers and Localization Suppliers base their cost and billing
    > models on match similarity and match types. Localization suppliers charge
    > us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we
    > might even want to get more granular than that as our cost and billing models
    > evolve with the business.
    > In 2.0, the match type doesn’t support the values exact-match and fuzzy-match,
    > which were defined in the state-qualifier attribute in 1.2. Instead of supporting
    > these two, or any others that may not have migrated from 1.2 to 2.0,
    > as a separate attribute, the request is, that like the discussion on state
    > and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type.
    > This will allow us to add extra business logic to types, such as "tm" or "mt",
    > which are already defined in the spec.
    > <match id="1" similarity="100.0" type="tm/xlf:exact">
    > <match id="1" similarity="75.0" type="tm/xlf:fuzzy">
    > <match id="1" similarity="99.0" type="tm/custom:near-exact">
    I understand the need for the information, but to me, it
    seems the similarity gives you whether a match is exact or not.

    The example however, shows (I think) that you are thinking about categories
    that could be mapped differently to the similarity depending on projects.
    For example in one project a near-match corresponds to one range and in
    another to a different range, and you want to simply map that info to something
    common across your process, without having to carry the ranges around.
    If that's the case I wonder if XLIFF should define any default like xlf:exact,
    etc.
    I believe there is value in decoupling the "percentage"
    from the "business" type of the match. The number means nothing
    unless we opt to prescribe a specific variety of (modified) Levenshtein,
    and I guess we should not open this particular can of worms..
     
    So I wouldn't see a problem with a sub-type there.

    A side comment on the match type: especially, if we allow sub-type, I'm
    still not sure about the values currently listed.
     
    [Microsoft] we definitely
    advocate decoupling the “percentage” from the “business” type of match
    as David puts it. And we should not prescribe meaning to the percentage,
    either. Costing models built on top of these values will necessarily change
    from one provider/supplier to the next and as Yves states, possibly from
    one project to the next. We could very easily have the following (and we
    do in much of our recycled content):
      <match id="1" similarity="100.0" type="tm/xlf:exact">
      <match id="1" similarity="100.0" type="ice">
    In the first case, we’ve
    recycled a candidate which is 100% match, but came from a segment whose
    state isn’t signed off or final yet, whereas the ice match, in our case,
    has the requirement of being 100% and signed off or final.
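
To make the type / sub-type idea concrete, here is a small Python sketch that splits a value such as "tm/xlf:exact" into its parts and derives a business category from the type alone, ignoring the similarity number. The "base/prefix:value" syntax is the one used in the examples above; the names MatchType, parse_match_type and billing_category, and the category rule itself, are assumptions made for illustration only.

```python
from typing import NamedTuple, Optional


class MatchType(NamedTuple):
    base: str                 # e.g. "tm", "mt", "ice"
    authority: Optional[str]  # sub-type prefix, e.g. "xlf" or "custom"
    sub_type: Optional[str]   # e.g. "exact", "fuzzy", "near-exact"


def parse_match_type(value: str) -> MatchType:
    """Split 'base/prefix:sub' into its parts; the sub-type is optional."""
    base, _, rest = value.partition("/")
    if not base:
        raise ValueError(f"missing base type in {value!r}")
    if not rest:
        return MatchType(base, None, None)
    prefix, sep, sub = rest.partition(":")
    if not sep or not prefix or not sub:
        raise ValueError(f"sub-type must look like 'prefix:value': {value!r}")
    return MatchType(base, prefix, sub)


def billing_category(value: str) -> str:
    """Hypothetical rule: the category comes from the type, not the percentage."""
    parsed = parse_match_type(value)
    if parsed.base == "ice":
        return "ice"
    if parsed.sub_type is not None:
        return parsed.sub_type
    return "unclassified"
```

With this rule, two matches that both carry similarity 100.0 can still land in different categories ("exact" versus "ice"), which is the decoupling argued for above.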

    > Proposal 3: Add an optional Reference Language to core.
    > This is a crucial feature for Microsoft and other large companies that localize
    > minority languages. For example, it is typical that when we localize from
    > English into Quechua, localizers are more efficient and provide much higher
    > quality translation, when along with English source, we provide them with
    > Spanish target. In 1.2, Reference Languages could be defined in
    > an <alt-trans> element:
    I see the use case and I've seen other cases like this,
    with Chinese (simplified/Traditional).

    Could that be part of the match module?
    Possibly with a new attribute (e.g. reference='yes no' defaulting to no)

    Adding something along with <source>/<target> is bound to cause
    additional PR issues. If it's part of the Match module, it just uses whatever
    the module PRs are.
     
    I agree with Yves's reasons to have this within the match
    module, which is anyway the alt-trans successor. I guess it does not fulfill
    the core criteria
     
    [Microsoft] Adding this to
    the match module would be fine as long as the proper explanatory text and
    processing instructions make it clear what this data should be used for
    as opposed to recycling.

    > Proposal 4: Add an optional name attribute on <notes> in core
    > and <mds:metadata> module.
    > We believe it will be typical for content providers to want to
    > ...
    > <notes name="comments">
    >  <note id="comment">This string cannot be longer than 100 characters</note>
    >  <note id="user">Developer@microsoft.com</note>
    >  <note id="date">10/21/2012 5:28:13 PM</note>
    > </notes>
    Sounds reasonable. We'll have to allow several <notes>
    and <m:metadata> (I think, but I may be wrong, that only one is allowed)
    on the extension point.

    The example makes me wonder about the long term life of XLIFF though: likely
    this type of info (author, timestamp) will be needed by others. Maybe a
    better way to address it would be to add attributes to the note and meta
    that carry the author and time stamp?
    That would obviously work only if those two pieces of information are the
    only examples you have in mind.
     
    I agree with Yves that a couple of standard attributes
    should be added to increase interoperability, still I believe that note
    should be fully extendable, as it is part of the general annotation mechanism
    and should be able to carry attributes from other namespaces.
     
    [Microsoft] Capturing an author
    and timestamp on a comment is specific to our needs and thus that example.
    However, we do see value in being able to apply an author and timestamp
    on potentially any piece of data. So a module (as Yves suggests below)
    that can exist at the same extension points as metadata (and including
    metadata) might lend itself better to that.
     

    > Proposal 5: Add optional change tracking attributes to <segment>.
    > ...
    > <segment id="1" modifiedBy="translator@loc.com"
    > modifiedDate="10/21/2012 5:28:13 PM">
    >    <source>hello world</source>
    >    <target>hola món</target>
    > </segment>
    Here again I'm wondering if a "change track"
    module may be better?
    You could use it not just on segments but other elements: notes.
    The issue then would be how this gets updated if it's not a core component?
    Actually if it's a core attribute, does it mean it's not optional?
    I'm not sure there is a way, even with a PR, to guarantee these data will
    be up-to-date.
    But maybe that's ok?
     
    Optional attributes in core are tricky, IMHO. It means you
    do not need to introduce them yourself if you do not want to. But if present,
    they would need to be processed by agents who modify the segment. If it is
    conceivable that change agents do not update them, it feels more like a module...
     
    [Microsoft] Since we are heading down the
    same path to MUST preserve modules as well, if we introduce a “change
    track” module, then user agents would need to preserve it if present,
    but as for any other processing requirements, such as updating it, that
    could be specified as part of the module’s processing requirements. For
    example: The module MUST be preserved and SHOULD be updated by user agents.


    cheers,
    -yves



     
     
     
     






  • 11.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-17-2012 17:16




    Ryan’s examples demonstrate some of the many theoretical uses of the Notes element. Until now, I haven’t seen any guidance/caution regarding how the elements
    should be used, and Ryan’s suggestions were intended to provide better organization and structure when including multiple notes.
     
    I think the
    usage of Notes is a different matter; whether non-translatable information (e.g. instructions) is included seems to be a matter for the XLIFF creator. I don’t think it would be possible for us to restrict what data is included in a Note.

     
    As we identified previously, there is a risk that Notes could indeed become overloaded with information, given their similarity to metadata, but realistically,
    it’s difficult to mitigate this.
     
    Thanks,
    Kevin.
     
    From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
    On Behalf Of Helena S Chapman
    Sent: Monday, December 17, 2012 8:28 AM
    To: Dr. David Filip
    Cc: Estreen, Fredrik; Ryan King; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Sorry I missed this discussion quite a bit. Remind me again why we are putting in "non-translatable" or "non-localizable" information in XLIFF file for translators in your
    examples?

    Technically speaking, the id is missing in Ryan's example. Is that intentional or a mistake? Using a date string is also not advisable: before the XLIFF file is transmitted, a system (any system)
    should be using a normalized (UTC) Date object (e.g. a Date object in JavaScript) to pass the information around, not a date string.


    Using XLIFF <notes> to convey "instructions" (in the example) to me is just asking for trouble. In our environment, there is really no comprehensive way to cover a set of "build" instructions in
    any XLIFF files that would describe exactly how an "input" or "final output" can be built with a set of XLIFF files that was given. We do as much of the interchange between systems as we can outside the XLIFF file and leave the complexity as outlined in
    the examples below to a separate metadata interchange process that encompasses not just the files being exchanged but potentially the files that are yet to come or have already arrived some time ago.


    We do and can take advantage of the <notes> feature in XLIFF. However, that is limited to only information exchange specifically about that particular translation "transaction". A full job can
    contain many more transactions. I would not recommend instructions to be passed along in the XLIFF file because that would make the XLIFF interchange unnecessarily complex.







    From:         "Dr. David Filip" < David.Filip@ul.ie >

    To:         " xliff@lists.oasis-open.org " < xliff@lists.oasis-open.org >

    Cc:         Ryan King < ryanki@microsoft.com >, "Estreen, Fredrik" < Fredrik.Estreen@lionbridge.com >

    Date:         12/16/2012 08:47 AM

    Subject:         Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Sent by:         < xliff@lists.oasis-open.org >







    Hi all, as there were no further comments, can we assume consensus and change the spec along the lines discussed in this thread?

    The summary is that note remains structurally the same, but receives new core attributes as discussed, and will also allow for attributes from any namespaces. It won't be extensible by foreign child elements.


    Cheers
    dF

    Dr. David Filip
    =======================
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland
    telephone: +353-6120-2781
    cellphone: +353-86-0222-158
    facsimile: +353-6120-2734
    mailto: david.filip@ul.ie




    On Wed, Dec 12, 2012 at 12:00 AM, Dr. David Filip < David.Filip@ul.ie > wrote:

    +1 as proposed now

    Dr. David Filip




    On Tue, Dec 11, 2012 at 11:16 PM, Ryan King < ryanki@microsoft.com > wrote:

    Thanks David, then I think we (you, Fredrik, and I) are in agreement (adding priority is OK) and I support the extensibility of note as well. Given that, is there anyone on the
    list who disagrees with this proposal?
     

    From:
    xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Tuesday, December 11, 2012 2:07 PM
    To: Ryan King
    Cc: Estreen, Fredrik; Rodolfo M. Raya; xliff@lists.oasis-open.org


    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Ryan, I support adding the core attributes as proposed, plus perhaps the priority [1-10] from Fredriks example

    But I also reiterate the request for note to be extensible. This seemed to have Fredrik's support on this thread.

    I think that note and inline markers should be extensible as part of the generic annotations design that will allow development of annotation modules, such as ITS mapping..

    Cheers
    dF
    Dr. David Filip

     
    On Tue, Dec 11, 2012 at 6:24 PM, Ryan King < ryanki@microsoft.com > wrote:

    Do we have consensus on this proposal? E.g. adding category, origin, and datetime (or timestamp) attributes to <note>?

     

    Thanks,

    ryan

     

    From:
    xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Ryan King
    Sent: Monday, December 3, 2012 11:59 AM

    To: Dr. David Filip; Estreen, Fredrik
    Cc: Rodolfo M. Raya;
    xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    >> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good.

    Agreed. And as stated on another thread… we suggest the list of additional and optional attributes to be origin, category, datetime.

     

    <notes>

      <note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don’t localize Windows</note>

    </notes>
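
As an illustration of how a tool might consume these three attributes, the Python sketch below reads them from a <note> and turns the datetime value into a timezone-aware UTC object instead of passing a raw string around. The attribute names follow the proposal in this message; the function name read_notes and the assumption that timestamps always use the "YYYY-MM-DDThh:mm:ssZ" form are choices made for this sketch only.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Sample note using the attribute names proposed above.
SAMPLE = """
<notes>
  <note category="instruction" origin="developer"
        datetime="2012-11-30T07:43:05Z">Don't localize Windows</note>
</notes>
"""


def read_notes(notes_xml):
    """Return one dict per <note>, with the timestamp parsed as UTC."""
    result = []
    for note in ET.fromstring(notes_xml).iter("note"):
        stamp = note.get("datetime")
        when = None
        if stamp is not None:
            # Assumes the 'Z' (UTC) form shown in the example.
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(
                tzinfo=timezone.utc
            )
        result.append(
            {
                "category": note.get("category"),
                "origin": note.get("origin"),
                "datetime": when,
                "text": (note.text or "").strip(),
            }
        )
    return result


notes = read_notes(SAMPLE)
```

All three attributes are optional in the proposal, so the sketch returns None for any that are absent rather than failing.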

     

    Thanks,

    ryan

    .
     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Friday, November 30, 2012 5:00 PM
    To: Estreen, Fredrik
    Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

     
    Thanks for outlining the options, Fredrik,
     
    I would be personally OK with note being just extensible.
    The ITS categories would allow you to specify pretty much everything that you would need. First as an extension, which should later turn into a module using the same mechanism.

    On the other hand, having a minimum set for interoperability for ITS-unaware tools sounds good. And as Fredrik pointed out, the ITS note can be easily mapped onto these, so not an issue from here.

    Even with the minimum set of core attributes, I still think it should be extensible.. to allow for unforeseen types of annotations..

     
    The only danger is of creating unnecessary clutter if the adoption is minimal.. hard to say what the adoption will be..

     
    Cheers
    dF
     
     
    Dr. David Filip

     
    On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote:

    Hi Ryan, David,

     

    How it would look depends on whether we add one or more standard attributes to the <note> element or rely solely on third party extensions. First, an example of one of the notes
    in your original sample and one showing a potential use of David’s ITS mapping case.

     

    <notes>

      <note id="n1" ms:noteOrigin="developer" ms:notePriority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>

      <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>

    </notes>

     

    It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which,
    if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date:

    * origin / author – Indicate source of the note

    * priority – indicate relative importance of a note. Must have strict simple definition. Integer lower is more important than higher for example.

    * type / category – indicate what type / aspect of the data or process the note applies to or annotates.

    * date – creation or modification date. Which of these it is should be specified.

     

    The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that
    it adds complexity to the standard which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a
    more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that
    in this area.
     

    If we hypothetically assume we add origin and priority to the core, the above example could look like the one below, assuming the same mapping for ITS is used as the one proposed
    for mapping to XLIFF 1.2 (‘alert’=>1, ‘description’=> 2+) and stored in “priority”.

     

    <notes>

      <note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>

      <note id="n2" priority="1">Make sure to adapt date format when localizing</note>

    </notes>
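
The mapping described above can be sketched in Python as follows: a note with an explicit priority keeps it, a note carrying only its:locNoteType gets a priority derived from the ITS value, and notes are then ordered with lower numbers first. The choice of exactly 2 for 'description' (the message says "2+"), the fallback value of 10, the sample priority on n1 and the function names are assumptions for this sketch.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # W3C ITS namespace

# 'alert' => 1 comes from the message; 'description' => 2 is one possible
# reading of "2+" and is an assumption here.
ITS_TO_PRIORITY = {"alert": 1, "description": 2}
DEFAULT_PRIORITY = 10  # assumed fallback; lower numbers are more important

# Illustrative sample: n1 has an explicit priority, n2 only an ITS type.
SAMPLE = f"""
<notes xmlns:its="{ITS_NS}">
  <note id="n1" priority="2">This string cannot be longer than 100 characters</note>
  <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
</notes>
"""


def note_priority(note):
    """Prefer an explicit priority; otherwise derive one from ITS."""
    explicit = note.get("priority")
    if explicit is not None:
        return int(explicit)
    its_type = note.get(f"{{{ITS_NS}}}locNoteType")
    return ITS_TO_PRIORITY.get(its_type, DEFAULT_PRIORITY)


def notes_by_priority(notes_xml):
    """Return note ids ordered from most to least important."""
    notes = ET.fromstring(notes_xml).iter("note")
    return [n.get("id") for n in sorted(notes, key=note_priority)]


ordered = notes_by_priority(SAMPLE)
```

An ITS-unaware tool would only need the priority attribute, while an ITS-aware one can fall back to the extension attribute, which is the interoperability trade-off discussed in this message.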

     

    Regarding the naming of potential core / module attributes I would prefer to use “category” instead of “type” as the former does not convey the level of functional meaning that
    the latter does for me. It is more ‘just metadata’.
     

    Regards,

    Fredrik Estreen

     

    From: Ryan King [mailto: ryanki@microsoft.com ]

    Sent: den 28 november 2012 21:55
    To: Dr. David Filip; Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    David or Fredrik, can you give us an XLIFF example of how that would look?

     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Wednesday, November 28, 2012 12:50 PM
    To: Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Fredrik, all, same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way.

    Annotations are perfect extension points in general, and besides we need the extensibility here for the its mapping.

     
    Cheers
    dF
    Dr. David Filip

     
    On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote:

    Hi Rodolfo, Ryan,

     

    I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the
    document, where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended
    to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.

     

    I think the initial model is much easier to work with and cleaner, as it contains all note-related information in one subtree per document level where notes are allowed. Adding
    attributes to the <note> element is in my opinion the best way to go. If we should have more standard attributes or if a processor is free to use the third party namespace extension mechanism to add them is another question. Depending on how simple we want
    to keep the basic notes feature it could be either or a mix of the two methods.

     

    Although I’m not a fan of the third party extensions I think this is a case where they could make sense. And if used for process specific metadata only I don’t see an issue.
    Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.

     

    Regards,

    Fredrik Estreen

     

    From:
    xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Rodolfo M. Raya
    Sent: den 28 november 2012 09:32
    To: xliff@lists.oasis-open.org


    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Still a bad use case that doesn’t justify ruining a good design.

     

    Regards,

    Rodolfo

    --
    Rodolfo M. Raya       rmraya@maxprograms.com
    Maxprograms       http://www.maxprograms.com

     

    From: Ryan King [ mailto:ryanki@microsoft.com ]

    Sent: Wednesday, November 28, 2012 5:32 AM
    To: Rodolfo M. Raya; < xliff@lists.oasis-open.org >;
    Yves Savourel
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    So that our original reason for proposing having more than one <notes> at the extension point does not get obfuscated in all of the replies and “see inlines”, here once again,
    is the use case for adding more than one <notes> per extension:
     

    Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module.

    We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together,
    such as in an editor UI. Here are some examples:
     
    <notes name="comments">
      <note id="comment">This string cannot be longer than 100 characters</note>
      <note id="origin">developer</note>
      <note id="priority">1</note>
    </notes>
     
    <notes name="instructions">
      <note id="instruction">Do not localize the product name</note>
      <note id="origin">loc-engineer</note>
      <note id="priority">2</note>
    </notes>
     
    As opposed to something less structured and more difficult to process:
     
    <notes>
      <note id="instruction">Do not localize the product name</note>
      <note id="instruction-origin">loc-engineer</note>
      <note id="instructions-priority">1</note>
      <note id="comment">This string cannot be longer than 100 characters</note>
      <note id="comment-priority">2</note>
    </notes>
     
    Thanks,
    Ryan
     

    From: Rodolfo M. Raya [ mailto:rmraya@maxprograms.com ]

    Sent: Tuesday, November 27, 2012 5:41 PM
    To: Ryan King
    Cc: Yves Savourel; < xliff@lists.oasis-open.org >
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Please don't ruin the design for <notes>. Only one should be allowed per insertion point.

     
    Regards,
    Rodolfo

    Sent from my iPad

    On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com > wrote:

    Hi Yves, in last week’s TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren’t deemed as features.
    I believe you are the owner for the matches module and notes. Can you please let me know what we need to do to move forward with getting these implemented?

     

    · Proposal 2: Be able to specify optional custom values for match type in <mtc:matches>

    · Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow
    zero, one or more <notes> in each position in the tree structure)

     

    Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? Since the module is already defined in the 2.0 spec,
    can I just suggest the method and if you agree, you can fold it into the current module definition? I would propose:

     

    1. That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data.

    2. Add an optional attribute reference=”yes no” with no as default. Additionally, PR for a “reference match” would be to allow an xml:lang on the
    target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. Spanish reference for Quechua might look like this:

     

    <mtc:matches>
      <mtc:match reference="yes">
       <segment>
        <target xml:lang="es-es">hola mundo</target>
       </segment>
      </mtc:match>
    </mtc:matches>

     

    I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that
    if false.
     

    Please let me know how I can work with you on these.

    Ryan

     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Ryan King
    Sent: Friday, November 16, 2012 5:02 PM
    To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the
    wiki.
     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Thursday, November 15, 2012 5:24 PM
    To: Yves Savourel
    Cc: xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Yves, Ryan et al.
     
    Commenting inline..
    Cheers
    dF
    On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel < ysavourel@enlaso.com > wrote:

    Hi Ryan, all,


    > Proposal 1: Add an optional build attribute to 2.0 <file> element in core.

    > ..
    > <file id=”1” original=”mainUI.resx” build="2011-11-23-133615307_windc.win8.beta.b01">

    I don't see anything wrong with this.
     

    > Proposal 2: Be able to specify optional custom values for match type
    > attribute in the <mtc:matches> module.
    > Content providers and Localization Suppliers base their cost and billing
    > models on match similarity and match types. Localization suppliers charge
    > us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we
    > might even want to get more granular than that as our cost and billing models
    > evolve with the business.
    > In 2.0, the match type doesn’t support the values exact-match and fuzzy-match,
    > which were defined in the state-qualifier attribute in 1.2. Instead of supporting
    > these two, or any others that may not have migrated from 1.2 to 2.0,
    > as a separate attribute, the request is, that like the discussion on state
    > and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type.
    > This will allow us to add extra business logic to types, such as "tm" or "mt",
    > which are already defined in the spec.
    > <match id=”1” similarity=”100.0” type=”tm/xlf:exact”>
    > <match id=”1” similarity=”75.0” type=”tm/xlf:fuzzy”>
    > <match id=”1” similarity=”99.0” type=”tm/custom:near-exact”>
    I understand the need for the information, but to me, it seems the similarity gives you whether a match is exact or not.

    The example however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example in one project a near-match corresponds to one range and in another to a different range, and you
    want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case I wonder if XLIFF should define any default like xlf:exact, etc.

    I believe there is value in decoupling the "percentage" from the "business" type of the match. The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I guess we should not open this particular can of worms..

     
    So I wouldn't see a problem with a sub-type there.

    A side comment on the match type: especially, if we allow sub-type, I'm still not sure about the values currently listed.

     

    [Microsoft] we definitely advocate decoupling the “percentage” from the “business” type of match as David puts it. And we should not prescribe meaning to the percentage, either.
    Costing models built on top of these values will necessarily change from one provider/supplier to the next and as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):

      <match id=”1” similarity=”100.0” type=”tm/xlf:exact”>
      <match id=”1” similarity=”100.0” type=”ice”>
    In the first case, we’ve recycled a candidate which is 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ice match, in our case, has
    the requirement of being 100% and signed off or final.

    > Proposal 3: Add an optional Reference Language to core.
    > This is a crucial feature for Microsoft and other large companies that localize
    > minority languages. For example, it is typical that when we localize from
    > English into Quechua, localizers are more efficient and provide much higher
    > quality translation, when along with English source, we provide them with
    > Spanish target. In 1.2, Reference Languages could be defined in
    > an <alt-trans> element:
    I see the use case and I've seen other cases like this, with Chinese (simplified/Traditional).

    Could that be part of the match module?
    Possibly with a new attribute (e.g. reference='yes no' defaulting to no)

    Adding something along with <source>/<target> is bound to cause additional PR issues. If it's part of the Match module, it just uses whatever the module PRs are.

     
    I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria

     

    [Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for as
    opposed to recycling.

    > Proposal 4: Add an optional name attribute on <notes> in core
    > and <mds:metadata> module.
    > We believe it will be typical for content providers to want to
    > ...
    > <notes name="comments">
    >  <note id="comment">This string cannot be longer than 100 characters</note>
    >  <note id="user"> Developer@microsoft.com </note>
    >  <note id="date">10/21/2012 5:28:13 PM</note>
    > </notes>
    Sounds reasonable. We'll have to allow several <notes> and <mds:metadata> elements at the extension point (I think, but I may be wrong, that only one is allowed at the moment).

    The example makes me wonder about the long term life of XLIFF though: likely this type of info (author, timestamp) will be needed by others. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and time stamp?
    That would obviously work only if those two pieces of information are the only examples you have in mind.

     
    I agree with Yves that a couple of standard attributes should be added to increase interoperability. Still, I believe that note should be fully extensible, as it is part of the general annotation mechanism and should be able to carry attributes from other
    namespaces.
     
    [Microsoft] Capturing an author and timestamp on a comment is specific to our needs and thus that example. However, we do see value in being able to apply an author and timestamp
    on potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.

     

    > Proposal 5: Add optional change tracking attributes to <segment>.
    > ...
    > <segment id="1" modifiedBy="translator@loc.com"
    > modifiedDate="10/21/2012 5:28:13 PM">
    >    <source>hello world</source>
    >    <target>hola món</target>
    > </segment>
    Here again I'm wondering if a "change track" module may be better?
    You could use it not just on segments but other elements: notes.
    The issue then would be how this gets updated if it's not a core component?
    Actually if it's a core attribute, does it mean it's not optional?
    I'm not sure there is a way, even with a PR, to guarantee these data will be up-to-date.
    But maybe that's ok?
     
    Optional attributes in core are tricky. IMHO, optional means you do not need to introduce the attribute yourself if you do not want to, but if it is present, it would need to be processed by agents that modify the segment. If it is conceivable that modifying agents do not update it,
    it feels more like a module...
     
    [Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a “change track” module, then user agents would need to preserve it if present, but as for any other processing requirements,
    such as updating it, that could be specified as part of the module’s processing requirements. For example: The module MUST be preserved and SHOULD be updated by user agents.
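
A minimal sketch of what "SHOULD be updated" could mean for an agent that edits a target. It assumes the `modifiedBy` and `modifiedDate` attribute names from the Proposal 5 example, un-namespaced elements, and a UTC ISO 8601 timestamp format; `update_target` is a hypothetical helper, and none of this is defined by a module specification.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def update_target(segment, new_text, agent, now=None):
    """Change the <target> text and refresh the change-tracking attributes.

    `now` must be a timezone-aware datetime; it defaults to the current UTC time.
    Returns True if the target text was actually changed.
    """
    target = segment.find("target")
    if target is None:
        target = ET.SubElement(segment, "target")
    if target.text == new_text:
        return False  # nothing changed, so the tracking data is left alone
    target.text = new_text
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    segment.set("modifiedBy", agent)
    segment.set("modifiedDate", moment.strftime("%Y-%m-%dT%H:%M:%SZ"))
    return True


segment = ET.fromstring(
    '<segment id="1"><source>hello world</source>'
    "<target>hola mundo</target></segment>"
)
changed = update_target(
    segment,
    "hola món",
    "translator@loc.com",
    now=datetime(2012, 10, 21, 17, 28, 13, tzinfo=timezone.utc),
)
print(changed, segment.get("modifiedBy"), segment.get("modifiedDate"))
```

An agent that only preserves the data would simply copy both attributes through unchanged, which is why the values cannot be guaranteed to be up to date.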


    cheers,
    -yves



    ---------------------------------------------------------------------
    To unsubscribe, e-mail:
    xliff-unsubscribe@lists.oasis-open.org
    For additional commands, e-mail:
    xliff-help@lists.oasis-open.org
     
     
     
     
     






  • 12.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-17-2012 18:37
    Kevin, I am at a loss here. Perhaps you can elaborate on "there is a risk that Notes could indeed become overloaded with information" and why adding additional attributes would help?




    From: "Kevin O'Donnell" <kevinod@microsoft.com>
    To: Helena S Chapman/San Jose/IBM@IBMUS, "Dr. David Filip" <David.Filip@ul.ie>
    Cc: "Estreen, Fredrik" <Fredrik.Estreen@lionbridge.com>, Ryan King <ryanki@microsoft.com>, "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>
    Date: 12/17/2012 12:21 PM
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
    Sent by: <xliff@lists.oasis-open.org>




    Ryan’s examples demonstrate some of the many theoretical uses of the Notes element. Until now, I haven’t seen any guidance/caution regarding how the elements should be used, and Ryan’s suggestions were intended to provide better organization and structure when including multiple notes.

    I think the usage of Notes is a different matter; whether non-translatable information (e.g. instructions) is included seems to be a matter for the XLIFF creator. I don’t think it would be possible for us to restrict what data is included in a Note.

    As we identified previously, there is a risk that Notes could indeed become overloaded with information, given their similarity to metadata, but realistically, it’s difficult to mitigate this.
     
    Thanks,
    Kevin.
     
    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Helena S Chapman
    Sent: Monday, December 17, 2012 8:28 AM
    To: Dr. David Filip
    Cc: Estreen, Fredrik; Ryan King; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Sorry, I missed quite a bit of this discussion. Remind me again why we are putting "non-translatable" or "non-localizable" information in an XLIFF file for translators in your examples?

    Technically speaking, in Ryan's example, id is missing; is that intentional or a mistake? Using a date string is also not advisable. Prior to the XLIFF file being transmitted, a system (any system) should be using a normalized (UTC) Date object (e.g. the Date object in JavaScript) to pass the information around, and not a date string.
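
To illustrate the normalization point, here is a small Python sketch that turns the locale-style string from the earlier example into a UTC ISO 8601 timestamp. It rests on two assumptions that are not stated in the thread: that the original value follows the US `month/day/year` 12-hour pattern, and that it was already expressed in UTC. `to_utc_timestamp` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_utc_timestamp(date_string, fmt="%m/%d/%Y %I:%M:%S %p"):
    """Parse a locale-style date string and emit an ISO 8601 UTC timestamp.

    Assumption: the input carries no time zone and is treated as UTC.
    """
    naive = datetime.strptime(date_string, fmt)
    aware = naive.replace(tzinfo=timezone.utc)
    return aware.strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_utc_timestamp("10/21/2012 5:28:13 PM"))  # 2012-10-21T17:28:13Z
```

A string such as `10/21/2012` is ambiguous across locales, whereas the normalized form sorts lexically and can be parsed without knowing who produced it.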

    Using XLIFF <notes> to convey "instructions" (in the example) to me is just asking for trouble. In our environment, there is really no comprehensive way to cover a set of "build" instructions in any XLIFF files that would describe exactly how an "input" or "final output" can be built with a set of XLIFF files that was given. We do as much of the interchange between systems as we can outside the XLIFF file and leave the complexity as outlined in the examples below to a separate metadata interchange process that encompasses not just the files being exchanged but potentially the files that are yet to come or have already arrived some time ago.

    We do and can take advantage of the <notes> feature in XLIFF. However, that is limited to information exchange specifically about that particular translation "transaction". A full job can contain many more transactions. I would not recommend passing instructions along in the XLIFF file because that would make the XLIFF interchange unnecessarily complex.







    From: "Dr. David Filip" <David.Filip@ul.ie>
    To: "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>
    Cc: Ryan King <ryanki@microsoft.com>, "Estreen, Fredrik" <Fredrik.Estreen@lionbridge.com>
    Date: 12/16/2012 08:47 AM
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
    Sent by: <xliff@lists.oasis-open.org>







    Hi all, as there were no further comments, can we assume consensus and
    change the spec along the lines discussed in this thread?
    The summary is that note remains structurally the same, but receives new
    core attributes as discussed, and will also allow for attributes from any
    namespaces. It won't be extensible by foreign child elements.

    Cheers
    dF

    Dr. David Filip
    =======================
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland
    telephone: +353-6120-2781
    cellphone: +353-86-0222-158
    facsimile: +353-6120-2734
    mailto: david.filip@ul.ie




    On Wed, Dec 12, 2012 at 12:00 AM, Dr. David Filip < David.Filip@ul.ie >
    wrote:
    +1 as proposed now





    On Tue, Dec 11, 2012 at 11:16 PM, Ryan King < ryanki@microsoft.com >
    wrote:
    Thanks David, then I think we (you, Fredrik, and I) are in agreement (adding
    priority is OK) and I support the extensibility of note as well. Given
    that, is there anyone on the list who disagrees with this proposal?

     

    From: xliff@lists.oasis-open.org
    [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Tuesday, December 11, 2012 2:07 PM
    To: Ryan King
    Cc: Estreen, Fredrik; Rodolfo M. Raya; xliff@lists.oasis-open.org


    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Ryan, I support adding the core attributes as proposed, plus perhaps the priority [1-10] from Fredrik's example.
    But I also reiterate the request for note to be extensible. This seemed to have Fredrik's support on this thread.
    I think that note and inline markers should be extensible as part of the generic annotations design that will allow development of annotation modules, such as ITS mapping.
    Cheers
    dF

     
    On Tue, Dec 11, 2012 at 6:24 PM,
    Ryan King < ryanki@microsoft.com >
    wrote:
    Do we have consensus on this
    proposal? E.g. adding category, origin, and datetime (or timestamp) attributes
    to <note>?
     

    Thanks,

    ryan

     

    From: xliff@lists.oasis-open.org
    [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Ryan King
    Sent: Monday, December 3, 2012 11:59 AM


    To: Dr. David Filip; Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org

    Subject: RE: [xliff] 1.2
    to 2.0 Gaps and Proposals (notes)
     
    >> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good.
    Agreed. And as stated on another thread, we suggest the list of additional and optional attributes to be origin, category, datetime.
     

    <notes>
      <note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don’t localize Windows</note>
    </notes>

     

    Thanks,

    ryan

     

    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Friday, November 30, 2012 5:00 PM
    To: Estreen, Fredrik
    Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

     
    Thanks for outlining the options,
    Fredrik,
     
    I would be personally OK with note being just extensible.
    The ITS categories would allow you to specify pretty much everything that you would need, first as an extension that should later turn into a module using the same mechanism.
    On the other hand, having a minimum set for interoperability for ITS-unaware tools sounds good. And as Fredrik pointed out, ITS notes can be easily mapped onto these, so this is not an issue from my side.
    Even with the minimum set of core attributes, I still think it should be extensible, to allow for unforeseen types of annotations.

    The only danger is of creating unnecessary clutter if the adoption is minimal; it is hard to say what the adoption will be.
     
    Cheers
    dF
     
     

     
    On Thu, Nov 29, 2012 at 10:39 AM,
    Estreen, Fredrik < Fredrik.Estreen@lionbridge.com >
    wrote:
    Hi Ryan, David,

     

    How it would look depends on whether we add one or more standard attributes to the <note> element or rely solely on third party extensions. First, an example of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.
     

    <notes>
      <note id="n1" ms:noteOrigin="developer" ms:notePriority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
      <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
    </notes>

     

    It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have, but the ones I can immediately think of are the kind of information in the above sample plus a date:

    * origin / author – indicates the source of the note.
    * priority – indicates the relative importance of a note. Must have a strict, simple definition, for example an integer where lower is more important than higher.
    * type / category – indicates what type / aspect of the data or process the note applies to or annotates.
    * date – creation or modification date. Which of these it is should be specified.
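
The "lower is more important" definition of priority can be sketched as a simple sort. This Python example assumes an un-namespaced `priority` attribute on `<note>` and treats notes without one as least important (default 10, following the [1-10] range mentioned elsewhere in the thread). The sample notes and the `notes_by_priority` helper are illustrative only.

```python
import xml.etree.ElementTree as ET

# Illustrative sample; the third note and its text are made up for the sketch.
SAMPLE = """
<notes>
  <note id="n1" priority="2">This string cannot be longer than 100 characters</note>
  <note id="n2" priority="1">Make sure to adapt date format when localizing</note>
  <note id="n3">Shown in the settings dialog</note>
</notes>
"""


def notes_by_priority(notes_element, default=10):
    """Return <note> children ordered from most to least important."""
    def key(note):
        # Lower integer means higher importance; missing priority sorts last.
        return int(note.get("priority", default))
    return sorted(notes_element.findall("note"), key=key)


ordered = [note.get("id") for note in notes_by_priority(ET.fromstring(SAMPLE))]
print(ordered)  # ['n2', 'n1', 'n3']
```

Because the definition is a plain integer comparison, any tool can order notes for display without knowing who produced them.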

     

    The good thing about using
    standard attributes instead of extensions for common properties is of course
    better interoperability for the data contained. The negative side is that
    it adds complexity to the standard which is against one of the goals of
    the 2.0 work. One part of that is the attempt to reduce the number of seldom
    or never used constructs to get a leaner core model. A solution that has
    been discussed before is to have a more complex comment / annotation module
    in addition to or extending the core feature. This way we get the same
    complexity in the core as we would with just third party extensions but
    with the added value of a fully interoperable path for those that want
    that in this area.
     

    If we hypothetically assume we add origin and priority to the core, the above example could look like the one below, assuming the same mapping for ITS is used as the one proposed for mapping to XLIFF 1.2 (‘alert’ => 1, ‘description’ => 2+) and stored in “priority”.
     

    <notes>
      <note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
      <note id="n2" priority="1">Make sure to adapt date format when localizing</note>
    </notes>
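
The ITS mapping mentioned above (‘alert’ => 1, ‘description’ => 2+) could be expressed roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and choosing 2 as the default priority for a description is one possible reading of "2+".

```python
def priority_from_loc_note_type(loc_note_type, description_priority=2):
    """Map an ITS locNoteType value to a note priority (lower is more important)."""
    if loc_note_type == "alert":
        return 1
    if loc_note_type == "description":
        if description_priority < 2:
            raise ValueError("a description must map to priority 2 or higher")
        return description_priority
    raise ValueError(f"unknown locNoteType: {loc_note_type!r}")


def loc_note_type_from_priority(priority):
    """Reverse mapping: priority 1 is an alert, anything else a description."""
    return "alert" if priority == 1 else "description"


print(priority_from_loc_note_type("alert"))  # 1
print(loc_note_type_from_priority(3))        # description
```

Note that the reverse direction loses information: every priority of 2 or more collapses to "description".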

     

    Regarding the naming of potential core / module attributes, I would prefer to use “category” instead of “type”, as the former does not convey the level of functional meaning that the latter does for me. It is more ‘just metadata’.

     

    Regards,

    Fredrik Estreen

     

    From: Ryan King [mailto: ryanki@microsoft.com ]

    Sent: den 28 november 2012 21:55
    To: Dr. David Filip; Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    David or Fredrik, can you give us an XLIFF example of how that would look?

     

    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Wednesday, November 28, 2012 12:50 PM
    To: Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Fredrik, all, same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way.
    Annotations are perfect extension points in general, and besides, we need the extensibility here for the ITS mapping.
     
    Cheers
    dF

     
    On Wed, Nov 28, 2012 at 10:10 AM,
    Estreen, Fredrik < Fredrik.Estreen@lionbridge.com >
    wrote:
    Hi Rodolfo, Ryan,

     

    I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document, where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.

     

    I think the initial model is much easier to work with and cleaner, as it contains all note related information in one sub tree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. Whether we should have more standard attributes or whether a processor is free to use the third party namespace extension mechanism to add them is another question. Depending on how simple we want to keep the basic notes feature, it could be either or a mix of the two methods.

     

    Although I’m not a fan of
    the third party extensions I think this is a case where they could make
    sense. And if used for process specific metadata only I don’t see an issue.
    Of course there will be no standard way to display them in a UI or report
    if they are not specified in the standard.

     

    Regards,

    Fredrik Estreen

     

    From: xliff@lists.oasis-open.org
    [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Rodolfo M. Raya
    Sent: den 28 november 2012 09:32
    To: xliff@lists.oasis-open.org


    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Still a bad use case that
    doesn’t justify ruining a good design.

     

    Regards,

    Rodolfo

    --
    Rodolfo M. Raya       rmraya@maxprograms.com
    Maxprograms       http://www.maxprograms.com

     

    From: Ryan King [ mailto:ryanki@microsoft.com ]

    Sent: Wednesday, November 28, 2012 5:32 AM
    To: Rodolfo M. Raya; < xliff@lists.oasis-open.org >;
    Yves Savourel
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    So that our original reason
    for proposing having more than one <notes> at the extension point
    does not get obfuscated in all of the replies and “see inlines”, here
    once again, is the use case for adding more than one <notes> per
    extension:
     

    Proposal 4: Add an optional name
    attribute on <notes> in core and <mds:metadata> module.

    We believe it will be typical for
    content providers to want to group their notes or metadata in meaningful
    ways. This might be done so that a certain number of notes or bits of metadata
    can be processed in the same way, or simply grouped and displayed together,
    such as in an editor UI. Here are some examples:
     
    <notes name="comments">
     <note id="comment">This string cannot be longer than 100 characters</note>
     <note id="origin">developer</note>
     <note id="priority">1</note>
    </notes>

    <notes name="instructions">
     <note id="instruction">Do not localize the product name</note>
     <note id="origin">loc-engineer</note>
     <note id="priority">2</note>
    </notes>
     
    As opposed to something less structured
    and more difficult to process:
     
    <notes>
     <note id="instruction">Do not localize the product name</note>
     <note id="instruction-origin">loc-engineer</note>
     <note id="instructions-priority">1</note>
     <note id="comment">This string cannot be longer than 100 characters</note>
     <note id="comment-priority">2</note>
    </notes>
     
    Thanks,
    Ryan
     

    From: Rodolfo M. Raya [ mailto:rmraya@maxprograms.com ]

    Sent: Tuesday, November 27, 2012 5:41 PM
    To: Ryan King
    Cc: Yves Savourel; < xliff@lists.oasis-open.org >
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Please don't ruin the design for <notes>. Only one should be allowed per insertion point.
     
    Regards,
    Rodolfo

    Sent from my iPad

    On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com >
    wrote:
    Hi Yves, in last week’s TC call it was mentioned that I should work with
    the owners of the current features to get our requirements implemented
    for proposals that weren’t deemed as features. I believe you are the owner
    for the matches module and notes. Can you please let me know what we need
    to do to move forward with getting these implemented?
     

    · Proposal 2: Be able to specify optional custom values for match type in <mtc:matches>
    · Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)

     

    Additionally, it was deemed that we should
    add Reference Language to the <mtc:matches> module. How do you want
    to move forward with that? Since the module is already defined in the 2.0
    spec, can I just suggest the method and if you agree, you can fold it into
    the current module definition? I would propose:

     

    1. That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data.
    2. Add an optional attribute reference="yes no" with no as default. Additionally, PR for a “reference match” would be to allow an xml:lang on the target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. Spanish reference for Quechua might look like this:

     

    <mtc:matches>
      <mtc:match reference="yes">
       <segment>
        <target xml:lang="es-es">hola mundo</target>
       </segment>
      </mtc:match>
    </mtc:matches>
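
A consuming tool would need to keep reference-language matches apart from recycling candidates. The following Python sketch shows one way to do that with the proposed `reference` attribute. The namespace URI is a placeholder assumption, the second match is made up for contrast, and `split_matches` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

MTC_NS = "urn:example:xliff:matches"  # placeholder namespace URI (assumption)
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The first match follows the proposal above; the second is illustrative only.
SAMPLE = f"""
<mtc:matches xmlns:mtc="{MTC_NS}">
  <mtc:match reference="yes">
    <segment><target xml:lang="es-es">hola mundo</target></segment>
  </mtc:match>
  <mtc:match similarity="100.0" type="tm">
    <segment><source>hello world</source><target>hola món</target></segment>
  </mtc:match>
</mtc:matches>
"""


def split_matches(matches_element):
    """Separate reference-language matches from recycling candidates."""
    reference, recycling = [], []
    for match in matches_element.findall(f"{{{MTC_NS}}}match"):
        # The proposal defaults the attribute to "no" when it is absent.
        if match.get("reference", "no") == "yes":
            reference.append(match)
        else:
            recycling.append(match)
    return reference, recycling


reference, recycling = split_matches(ET.fromstring(SAMPLE))
langs = [m.find("segment/target").get(XML_LANG) for m in reference]
print(len(reference), len(recycling), langs)
```

Keeping the two lists apart means a reference match is only ever shown to the translator and never inserted as a translation candidate.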
     

    I’m not sure if any of these require an
    electronic ballot. I got the impression from the call that they don’t,
    but hopefully Bryan or David or someone else from the call will correct
    that if false.
     

    Please let me know how I can work with you
    on these.
    Ryan

     

    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Ryan King
    Sent: Friday, November 16, 2012 5:02 PM
    To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Thanks Yves and David for
    the valuable feedback. See our comments inline below prefixed with [Microsoft].
    As David suggested on another thread, we will add these soon to the wiki.

     

    From: xliff@lists.oasis-open.org
    [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Thursday, November 15, 2012 5:24 PM
    To: Yves Savourel
    Cc: xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Yves, Ryan et. al.
     
    Commenting inline..
    Cheers
    dF
    On Thu, Nov 15, 2012 at 8:23 PM,
    Yves Savourel < ysavourel@enlaso.com >
    wrote:
    Hi Ryan, all,


    > Proposal 1: Add an optional build attribute to 2.0 <file> element in core.
    > ..
    > <file id="1" original="mainUI.resx" build="2011-11-23-133615307_windc.win8.beta.b01">

    I don't see anything wrong with
    this.
     

    > Proposal 2: Be able to specify optional custom values for match type
    > attribute in the <mtc:matches> module.
    > Content providers and Localization Suppliers base their cost and billing
    > models on match similarity and match types. Localization suppliers charge
    > us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we
    > might even want to get more granular than that as our cost and billing models
    > evolve with the business.
    > In 2.0, the match type doesn’t support the values exact-match and fuzzy-match,
    > which were defined in the state-qualifier attribute in 1.2. Instead of supporting
    > these two, or any others that may not have migrated from 1.2 to 2.0,
    > as a separate attribute, the request is, that like the discussion on state
    > and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type.
    > This will allow us to add extra business logic to types, such as "tm" or "mt",
    > which are already defined in the spec.
    > <match id="1" similarity="100.0" type="tm/xlf:exact">
    > <match id="1" similarity="75.0" type="tm/xlf:fuzzy">
    > <match id="1" similarity="99.0" type="tm/custom:near-exact">

    I understand the need for the information, but to me, it seems the similarity gives you whether a match is exact or not.

    The example however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example in one project a near-match corresponds to one range and in another to a different range, and you want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case I wonder if XLIFF should define any default like xlf:exact, etc.
    I believe there is value in decoupling the "percentage" from the "business" type of the match. The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I guess we should not open this particular can of worms.
     
    So I wouldn't see a problem with
    a sub-type there.

    A side comment on the match type: especially, if we allow sub-type, I'm
    still not sure about the values currently listed.
     

    [Microsoft] we definitely advocate decoupling the “percentage” from the “business” type of match as David puts it. And we should not prescribe meaning to the percentage, either. Costing models built on top of these values will necessarily change from one provider/supplier to the next and as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):

      <match id="1" similarity="100.0" type="tm/xlf:exact">
      <match id="1" similarity="100.0" type="ice">

    In the first case, we’ve recycled a candidate which is a 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ice match, in our case, has the requirement of being 100% and signed off or final.

    > Proposal 3: Add an optional Reference Language to core.
    > This is a crucial feature for Microsoft and other large companies that localize
    > minority languages. For example, it is typical that when we localize from
    > English into Quechua, localizers are more efficient and provide much higher
    > quality translation, when along with English source, we provide them with
    > Spanish target. In 1.2, Reference Languages could be defined in
    > an <alt-trans> element:
    I see the use case and I've seen
    other cases like this, with Chinese (simplified/Traditional).

    Could that be part of the match module?
    Possibly with a new attribute (e.g. reference='yes no' defaulting to no)

    Adding something along with <source>/<target> is bound to cause
    additional PR issues. If it's part of the Match module, it just uses whatever
    the module PRs are.
     
    I agree with Yves's reasons to have
    this within the match module, which is anyway the alt-trans successor.
    I guess it does not fulfill the core criteria
     

    [Microsoft] Adding this to
    the match module would be fine as long as the proper explanatory text and
    processing instructions make it clear what this data should be used for
    as opposed to recycling.

    > Proposal 4: Add an optional name attribute on <notes> in core
    > and <mds:metadata> module.
    > We believe it will be typical for content providers to want to
    > ...
    > <notes name="comments">
    >  <note id="comment">This string cannot be longer than 100 characters</note>
    >  <note id="user"> Developer@microsoft.com </note>
    >  <note id="date">10/21/2012 5:28:13 PM</note>
    > </notes>
    Sounds reasonable. We'll have to allow several <notes> and <mds:metadata> elements at the extension point (I think, but I may be wrong, that only one is allowed at the moment).

    The example makes me wonder about the long term life of XLIFF though: likely this type of info (author, timestamp) will be needed by others. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and time stamp?
    That would obviously work only if those two pieces of information are the only examples you have in mind.
     
    I agree with Yves that a couple of standard attributes should be added to increase interoperability. Still, I believe that note should be fully extensible, as it is part of the general annotation mechanism and should be able to carry attributes from other namespaces.
     
    [Microsoft] Capturing an author and timestamp on a comment is specific to our needs and thus that example. However, we do see value in being able to apply an author and timestamp on potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.

     

    > Proposal 5: Add optional change tracking attributes to <segment>.

    > ...
    > <segment id="1" modifiedBy="translator@loc.com"
    > modifiedDate="10/21/2012 5:28:13 PM">
    >    <source>hello world</source>
    >    <target>hola món</target>
    > </segment>
    Here again I'm wondering if a "change
    track" module may be better?
    You could use it not just on segments but other elements: notes.
    The issue then would be how this gets updated if it's not a core component?
    Actually if it's a core attribute, does it mean it's not optional?
    I'm not sure there is a way, even with a PR, to guarantee these data will
    be up-to-date.
    But maybe that's ok?
     
    Optional attributes in core are tricky. IMHO, optional means you do not need to introduce the attribute yourself if you do not want to, but if it is present, it would need to be processed by agents that modify the segment. If it is conceivable that modifying agents do not update it, it feels more like a module...
     
    [Microsoft] Since
    we are heading down the same path to MUST preserve modules as well, if
    we introduce a “change track” module, then user agents would need to
    preserve it if present, but as for any other processing requirements, such
    as updating it, that could be specified as part of the module’s processing
    requirements. For example: The module MUST be preserved and SHOULD be updated
    by user agents.

    cheers,
    -yves



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: xliff-unsubscribe@lists.oasis-open.org
    For additional commands, e-mail: xliff-help@lists.oasis-open.org

     
     
     
     
     




  • 13.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 12-17-2012 19:45




    I didn’t mean to conflate both issues; I was referring to the previous discussion (attached) that considers the possibility of overloading the Notes element
    – your comment on usage of Notes reminded me of this.
     
    I am not suggesting that the additional attributes may help this situation. They were proposed to aid the organization and processing of notes.
     
    Thanks,
    Kevin.
     
    From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
    On Behalf Of Helena S Chapman
    Sent: Monday, December 17, 2012 10:37 AM
    To: Kevin O'Donnell
    Cc: Dr. David Filip; Estreen, Fredrik; Ryan King; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Kevin, I am at a loss here. Perhaps you can elaborate on "there is a risk that Notes
    could indeed become overloaded with information" and why adding additional attributes would help?




    From:         "Kevin O'Donnell" < kevinod@microsoft.com >

    To:         Helena S Chapman/San Jose/IBM@IBMUS, "Dr. David Filip" < David.Filip@ul.ie >

    Cc:         "Estreen, Fredrik" < Fredrik.Estreen@lionbridge.com >,
    Ryan King < ryanki@microsoft.com >, " xliff@lists.oasis-open.org " < xliff@lists.oasis-open.org >

    Date:         12/17/2012 12:21 PM

    Subject:         RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Sent by:         < xliff@lists.oasis-open.org >







    Ryan’s examples demonstrate some of the many theoretical uses of the Notes element. Until now, I haven’t seen any guidance/caution regarding how the elements should be used, and
    Ryan’s suggestions were intended to provide better organization and structure when including multiple notes.

     

    I think the
    usage of Notes is a different matter; whether non-translatable information (e.g. instructions) is included seems to be a matter for the XLIFF creator. I don’t think it would be possible for us to restrict what data is included in a Note.

     

    As we identified previously, there is a risk that Notes could indeed become overloaded with information, given their similarity to metadata, but realistically, it’s difficult to
    mitigate for this.
     

    Thanks,

    Kevin.

     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Helena S Chapman
    Sent: Monday, December 17, 2012 8:28 AM
    To: Dr. David Filip
    Cc: Estreen, Fredrik; Ryan King;
    xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Sorry, I missed quite a bit of this discussion. Remind me again why we are putting "non-translatable" or "non-localizable" information for translators in an XLIFF file in your examples?


    Technically speaking, in Ryan's example, the id is missing; is that intentional or a mistake? Using a date string is also not advisable: before the XLIFF file is transmitted, a system (any system) should pass the information around as a normalized (UTC) Date object
    (e.g. a Date object in JavaScript), not as a date string.
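
The normalization point above can be illustrated with a minimal Python sketch. It assumes the "MM/DD/YYYY h:mm:ss AM/PM" layout seen in the thread's examples and an English (C) locale for the AM/PM marker; the function name `to_utc_iso` and the `utc_offset_hours` parameter are hypothetical, since the original strings carry no time zone and the offset has to be supplied by whoever knows where the string was produced.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(local_text: str, utc_offset_hours: float = 0.0) -> str:
    """Convert a 'MM/DD/YYYY h:mm:ss AM/PM' string to an ISO 8601 UTC timestamp.

    utc_offset_hours is an assumption supplied by the caller: the source
    string itself does not say which time zone it was written in.
    """
    naive = datetime.strptime(local_text, "%m/%d/%Y %I:%M:%S %p")
    aware = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return aware.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Treated as already being UTC:
print(to_utc_iso("10/21/2012 5:28:13 PM"))
# Treated as written at UTC-7:
print(to_utc_iso("10/21/2012 5:28:13 PM", utc_offset_hours=-7))
```

The same string yields two different instants depending on the assumed offset, which is the ambiguity an unqualified date string introduces.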

    Using XLIFF <notes> to convey "instructions" (in the example) to me is just asking for trouble. In our environment, there is really no comprehensive way to cover a set of "build" instructions in any XLIFF files that would describe exactly how an "input" or "final
    output" can be built with a set of XLIFF files that was given. We do most of the interchange between systems as much as we can outside the XLIFF file and leave the complexity as outlined in the examples below to a separate metadata interchange process that
    encompasses not just the files being exchanged but potentially the files that are yet to come or have already arrived some time ago.


    We do and can take advantage of the <notes> feature in XLIFF. However, that is limited to only information exchange specifically about that particular translation "transaction". A full job can contain many more transactions. I would not recommend instructions
    to be passed along in the XLIFF file because that would make the XLIFF interchange unnecessarily complex.







    From:         "Dr. David Filip" < David.Filip@ul.ie >

    To:         " xliff@lists.oasis-open.org "
    < xliff@lists.oasis-open.org >

    Cc:         Ryan King < ryanki@microsoft.com >,
    "Estreen, Fredrik" < Fredrik.Estreen@lionbridge.com >

    Date:         12/16/2012 08:47 AM

    Subject:         Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Sent by:         < xliff@lists.oasis-open.org >

     







    Hi all, as there were no further comments, can we assume consensus and change the spec along the lines discussed in this thread?

    The summary is that note remains structurally the same, but receives new core attributes as discussed, and will also allow for attributes from any namespaces. It won't be extensible by foreign child elements.


    Cheers
    dF

    Dr. David Filip
    =======================
    LRC CNGL LT-Web CSIS
    University of Limerick, Ireland
    telephone: +353-6120-2781
    cellphone: +353-86-0222-158
    facsimile: +353-6120-2734
    mailto: david.filip@ul.ie




    On Wed, Dec 12, 2012 at 12:00 AM, Dr. David Filip < David.Filip@ul.ie > wrote:

    +1 as proposed now





    On Tue, Dec 11, 2012 at 11:16 PM, Ryan King < ryanki@microsoft.com > wrote:

    Thanks David, then I think we (you, Fredrik, and I) are in agreement (adding priority is OK) and I support the extensibility of note as well. Given that, is there anyone on the list who disagrees with this proposal?

     

    From:
    xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Tuesday, December 11, 2012 2:07 PM
    To: Ryan King
    Cc: Estreen, Fredrik; Rodolfo M. Raya; xliff@lists.oasis-open.org


    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    Ryan, I support adding the core attributes as proposed, plus perhaps the priority [1-10] from Fredrik's example.

    But I also reiterate the request for note to be extensible. This seemed to have Fredrik's support on this thread.

    I think that note and inline markers should be extensible as part of the generic annotations design that will allow development of annotation modules, such as ITS mapping..

    Cheers
    dF

     
    On Tue, Dec 11, 2012 at 6:24 PM, Ryan King < ryanki@microsoft.com > wrote:

    Do we have consensus on this proposal? E.g. adding category, origin, and datetime (or timestamp) attributes to <note>?

     

    Thanks,

    ryan

     

    From:
    xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Ryan King
    Sent: Monday, December 3, 2012 11:59 AM

    To: Dr. David Filip; Estreen, Fredrik
    Cc: Rodolfo M. Raya;
    xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
     
    >> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good.

    Agreed. And as stated on another thread… we suggest the list of additional and optional attributes to be origin, category, datetime.

     

    <notes>

      <note category=”instruction” origin=”developer” datetime=”2012-11-30T07:43:05Z”>Don’t localize Windows</note>

    </notes>
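
As a rough illustration of how a tool could use the proposed attributes to group notes without a separate container per group, here is a minimal Python sketch. It assumes the attribute names suggested above (category, origin, datetime), plain ASCII quotes, and no namespace on the elements; the second note, the "uncategorized" fallback, and the helper name `group_notes_by_category` are illustrative assumptions, not anything defined by the spec.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample data modeled on the example in this thread (second note added for illustration).
SAMPLE = """
<notes>
  <note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don't localize Windows</note>
  <note category="comment" origin="developer">This string cannot be longer than 100 characters</note>
</notes>
""".strip()


def group_notes_by_category(notes_xml: str) -> dict:
    """Return {category: [note text, ...]} for the <note> children of <notes>."""
    root = ET.fromstring(notes_xml)
    groups = defaultdict(list)
    for note in root.findall("note"):
        # Notes without the (optional) attribute land in a fallback bucket.
        category = note.get("category", "uncategorized")
        groups[category].append((note.text or "").strip())
    return dict(groups)


grouped = group_notes_by_category(SAMPLE)
for category, texts in grouped.items():
    print(category, texts)
```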

     

    Thanks,

    ryan

     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Friday, November 30, 2012 5:00 PM
    To: Estreen, Fredrik
    Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

     
    Thanks for outlining the options, Fredrik,
     
    I would be personally OK with note being just extensible.
    The ITS categories would allow specifying pretty much everything that you would need. First as an extension, which should later turn into a module using the same mechanism.

    On the other hand, having a minimum set for interoperability for ITS-unaware tools sounds good. And as Fredrik pointed out, ITS notes can be easily mapped onto these, so not an issue from here.

    Even with the minimum set of core attributes, I still think it should be extensible.. to allow for unforeseen types of annotations..

     
    The only danger is of creating unnecessary clutter if the adoption is minimal.. hard to say what the adoption will be..

     
    Cheers
    dF
     
     

     
    On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote:

    Hi Ryan, David,

     

    How it would look depends on whether we add one or more standard attributes to the <note> element or rely solely on third party extensions. First, an example of one of the notes
    in your original sample and one showing a potential use of David’s ITS mapping case.

     

    <notes>

      <note id=”n1” ms:noteOrigin=”developer” ms:notePriority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>

      <note id=”n2” its:locNoteType=”alert”>Make sure to adapt date format when localizing</note>

    </notes>

     

    It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which,
    if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date:

    * origin / author – Indicate source of the note

    * priority – indicate relative importance of a note. Must have strict simple definition. Integer lower is more important than higher for example.

    * type / category – indicate what type / aspect of the data or process the note applies to or annotates.

    * date – creation or modification date. Which of these it is should be specified.

     

    The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that
    it adds complexity to the standard which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a
    more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that
    in this area.
     

    If we hypothetically assume we add origin and priority to the core, the above example could look like the one below, assuming the same mapping for ITS is used as the one proposed
    for mapping to XLIFF 1.2 (‘alert’=>1, ‘description’=> 2+) and stored in “priority”.

     

    <notes>

      <note id=”n1” author=”developer” priority=”1” ms:noteType=”comment”> This string cannot be longer than 100 characters</note>

      <note id=”n2” priority=”1” >Make sure to adapt date format when localizing</note>

    </notes>
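
The ITS-to-priority mapping described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the core attribute is called "priority" as in the hypothetical example, 'description' (and any unrecognized value) is mapped to 2 even though the thread only says "2+", and the ITS attribute is dropped after mapping to mirror the example. The helper name `its_to_core_priority` is made up for this sketch.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # W3C ITS namespace URI

# 'alert' => 1 and 'description' => 2+ per the thread; 2 is an assumed concrete value.
PRIORITY_BY_LOC_NOTE_TYPE = {"alert": 1, "description": 2}


def its_to_core_priority(note: ET.Element) -> ET.Element:
    """Replace its:locNoteType with a hypothetical core 'priority' attribute."""
    key = f"{{{ITS_NS}}}locNoteType"
    loc_type = note.attrib.pop(key, None)
    # Never overwrite a priority that the producer set explicitly.
    if loc_type is not None and "priority" not in note.attrib:
        note.set("priority", str(PRIORITY_BY_LOC_NOTE_TYPE.get(loc_type, 2)))
    return note


note = ET.fromstring(
    f'<note xmlns:its="{ITS_NS}" id="n2" its:locNoteType="alert">'
    "Make sure to adapt date format when localizing</note>"
)
its_to_core_priority(note)
print(note.attrib)
```

Whether a real converter should drop or keep the original ITS attribute is a design choice left open in the discussion.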

     

    Regarding the naming of potential core / module attributes, I would prefer to use “category” instead of “type”, as the former does not convey the level of functional meaning that
    the latter does for me. It is more ‘just metadata’.
     

    Regards,

    Fredrik Estreen

     

    From: Ryan King [mailto: ryanki@microsoft.com ]

    Sent: den 28 november 2012 21:55
    To: Dr. David Filip; Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    David or Fredrik, can you give us an XLIFF example of how that would look?

     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Wednesday, November 28, 2012 12:50 PM
    To: Estreen, Fredrik
    Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Fredrik, all, same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way.

    Annotations are perfect extension points in general, and besides we need the extensibility here for the its mapping.

     
    Cheers
    dF

     
    On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote:

    Hi Rodolfo, Ryan,

     

    I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the
    document, where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended
    to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.

     

    I think the initial model is much easier to work with and cleaner, as it contains all note-related information in one subtree per document level where notes are allowed. Adding
    attributes to the <note> element is in my opinion the best way to go. Whether we should have more standard attributes, or whether a processor is free to use the third party namespace extension mechanism to add them, is another question. Depending on how simple we want
    to keep the basic notes feature it could be either or a mix of the two methods.

     

    Although I’m not a fan of the third party extensions I think this is a case where they could make sense. And if used for process specific metadata only I don’t see an issue.
    Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.

     

    Regards,

    Fredrik Estreen

     

    From:
    xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ]
    On Behalf Of Rodolfo M. Raya
    Sent: den 28 november 2012 09:32
    To: xliff@lists.oasis-open.org


    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Still a bad use case that doesn’t justify ruining a good design.

     

    Regards,

    Rodolfo

    --
    Rodolfo M. Raya       rmraya@maxprograms.com
    Maxprograms       http://www.maxprograms.com

     

    From: Ryan King [ mailto:ryanki@microsoft.com ]

    Sent: Wednesday, November 28, 2012 5:32 AM
    To: Rodolfo M. Raya; < xliff@lists.oasis-open.org >;
    Yves Savourel
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    So that our original reason for proposing having more than one <notes> at the extension point does not get obfuscated in all of the replies and “see inlines”, here once again,
    is the use case for adding more than one <notes> per extension:
     

    Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module.

    We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together,
    such as in an editor UI. Here are some examples:
     
    <notes name="comments">
      <note id="comment">This string cannot be longer than 100 characters</note>
      <note id="origin">developer</note>
      <note id="priority">1</note>
    </notes>
     
    <notes name="instructions">
      <note id="instruction">Do not localize the product name</note>
      <note id="origin">loc-engineer</note>
      <note id="priority">2</note>
    </notes>
     
    As opposed to something less structured and more difficult to process:
     
    <notes>
      <note id="instruction">Do not localize the product name</note>
      <note id="instruction-origin">loc-engineer</note>
      <note id="instructions-priority">1</note>
      <note id="comment">This string cannot be longer than 100 characters</note>
      <note id="comment-priority">2</note>
    </notes>
     
    Thanks,
    Ryan
     

    From: Rodolfo M. Raya [ mailto:rmraya@maxprograms.com ]

    Sent: Tuesday, November 27, 2012 5:41 PM
    To: Ryan King
    Cc: Yves Savourel; < xliff@lists.oasis-open.org >
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals
     
    Please don't ruin the design for <notes>. Only one should be allowed per insertion point.

     
    Regards,
    Rodolfo


    On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com > wrote:

    Hi Yves, in last week’s TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren’t deemed as features. I believe you are the owner for the matches module and notes. Can
    you please let me know what we need to do to move forward with getting these implemented?

     

    ·        
    Proposal 2: Be able to specify optional custom values for match type in <mtc:matches>

    ·        
    Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow
    zero, one or more <notes> in each position in the tree structure)

     

    Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? Since the module is already defined in the 2.0 spec,
    can I just suggest the method and if you agree, you can fold it into the current module definition? I would propose:

     

    1.       That we allow
    zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data.

    2.       Add an optional attribute reference=”yes no” with no as default. Additionally, PR for a “reference match” would be to allow an xml:lang on the
    target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. Spanish reference for Quechua might look like this:

     

    <mtc:matches>

      <mtc:match reference=”yes”>

       <segment>

        <target xml:lang=”es-es”>hola mundo</target>

       </segment>

      </mtc:match>

    </mtc:matches>

     

    I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that
    if false.
     

    Please let me know how I can work with you on these.

    Ryan

     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Ryan King
    Sent: Friday, November 16, 2012 5:02 PM
    To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org
    Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the
    wiki.
     

    From:
    xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ]
    On Behalf Of Dr. David Filip
    Sent: Thursday, November 15, 2012 5:24 PM
    To: Yves Savourel
    Cc: xliff@lists.oasis-open.org
    Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals

     
    Yves, Ryan et al.
     
    Commenting inline..
    Cheers
    dF
    On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel < ysavourel@enlaso.com > wrote:

    Hi Ryan, all,


    > Proposal 1: Add an optional build attribute to 2.0 <file> element in core.

    > ..
    > <file id=”1” original=”mainUI.resx” build="2011-11-23-133615307_windc.win8.beta.b01">

    I don't see anything wrong with this.
     

    > Proposal 2: Be able to specify optional custom values for match type
    > attribute in the <mtc:matches> module.
    > Content providers and Localization Suppliers base their cost and billing
    > models on match similarity and match types. Localization suppliers charge
    > us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we
    > might even want to get more granular than that as our cost and billing models
    > evolve with the business.
    > In 2.0, the match type doesn’t support the values exact-match and fuzzy-match,
    > which were defined in the state-qualifier attribute in 1.2. Instead of supporting
    > these two, or any others that may not have migrated from 1.2 to 2.0,
    > as a separate attribute, the request is, that like the discussion on state
    > and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type.
    > This will allow us to add extra business logic to types, such as "tm" or "mt",
    > which are already defined in the spec.
    > <match id=”1” similarity=”100.0” type=”tm/xlf:exact”>
    > <match id=”1” similarity=”75.0” type=”tm/xlf:fuzzy”>
    > <match id=”1” similarity=”99.0” type=”tm/custom:near-exact”>
    I understand the need for the information, but to me, it seems the similarity tells you whether a match is exact or not.

    The example however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example in one project a near-match corresponds to one range and in another to a different range, and you
    want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case I wonder if XLIFF should define any default like xlf:exact, etc.

    I believe there is value in decoupling the "percentage" from the "business" type of the match. The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I guess we should not open this particular can of worms..

     
    So I wouldn't see a problem with a sub-type there.

    A side comment on the match type: especially, if we allow sub-type, I'm still not sure about the values currently listed.

     

    [Microsoft] we definitely advocate decoupling the “percentage” from the “business” type of match as David puts it. And we should not prescribe meaning to the percentage, either.
    Costing models built on top of these values will necessarily change from one provider/supplier to the next and as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):

      <match id=”1” similarity=”100.0” type=”tm/xlf:exact”>
     <match id=”1” similarity=”100.0” type=”ice”>
    In the first case, we’ve recycled a candidate which is 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ice match, in our case, has
    the requirement of being 100% and signed off or final.
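
To make the proposed type/sub-type notation concrete, here is a small Python sketch that splits values such as "tm/xlf:exact" into a base type, an authority prefix, and a sub-type. The "base/authority:sub-type" layout is inferred from the examples in this thread only; the names `MatchType` and `parse_match_type` are hypothetical and nothing here reflects a finalized XLIFF 2.0 syntax.

```python
from typing import NamedTuple, Optional


class MatchType(NamedTuple):
    base: str                 # e.g. "tm", "mt", "ice"
    authority: Optional[str]  # e.g. "xlf" or a custom prefix
    sub_type: Optional[str]   # e.g. "exact", "fuzzy", "near-exact"


def parse_match_type(value: str) -> MatchType:
    """Split 'base/authority:sub-type'; both qualifier parts are optional."""
    base, _, qualifier = value.partition("/")
    if not qualifier:
        return MatchType(base, None, None)
    authority, sep, sub_type = qualifier.partition(":")
    if not sep:
        # A qualifier without a prefix is treated as a bare sub-type.
        return MatchType(base, None, authority)
    return MatchType(base, authority, sub_type)


for raw in ("tm/xlf:exact", "tm/custom:near-exact", "ice"):
    print(raw, "->", parse_match_type(raw))
```

Keeping the similarity percentage out of this structure mirrors the decoupling argued for above: the business category travels in the type, and the number is interpreted separately.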

    > Proposal 3: Add an optional Reference Language to core.
    > This is a crucial feature for Microsoft and other large companies that localize
    > minority languages. For example, it is typical that when we localize from
    > English into Quechua, localizers are more efficient and provide much higher
    > quality translation, when along with English source, we provide them with
    > Spanish target. In 1.2, Reference Languages could be defined in
    > an <alt-trans> element:
    I see the use case and I've seen other cases like this, with Chinese (simplified/Traditional).

    Could that be part of the match module?
    Possibly with a new attribute (e.g. reference='yes no' defaulting to no)

    Adding something along with <source>/<target> is bound to cause additional PR issues. If it's part of the Match module, it just uses whatever the module PRs are.

     
    I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria

     

    [Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for as
    opposed to recycling.

    > Proposal 4: Add an optional name attribute on <notes> in core
    > and <mds:metadata> module.
    > We believe it will be typical for content providers to want to
    > ...
    > <notes name="comments">
    >   <note id="comment">This string cannot be longer than 100 characters</note>
    >   <note id="user">Developer@microsoft.com</note>
    >   <note id="date">10/21/2012 5:28:13 PM</note>
    > </notes>
    Sounds reasonable. We'll have to allow several <notes> and <m:metadata> on the extension point (I think, but I may be wrong, that only one is currently allowed).

    The example makes me wonder about the long term life of XLIFF though: likely this type of info (author, timestamp) will be needed by others. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and time stamp?
    That would obviously work only if those two pieces of information are the only examples you have in mind.

     
    I agree with Yves that a couple of standard attributes should be added to increase interoperability, still I believe that note should be fully extendable, as it is part of the general annotation mechanism and should be able to carry attributes from other
    namespaces.
     
    [Microsoft] Capturing an author and timestamp on a comment is specific to our needs and thus that example. However, we do see value in being able to apply an author and timestamp
    on potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.

     

    > Proposal 5: Add optional change tracking attributes to <segment>.
    > ...
    > <segment id=”1” modifiedBy=”translator@loc.com”
    > modifiedDate=”10/21/2012 5:28:13 PM”>
    >    <source>hello world</source>
    >    <target>hola món</target>
    > </segment>
    Here again I'm wondering if a "change track" module may be better?
    You could use it not just on segments but other elements: notes.
    The issue then would be how this gets updated if it's not a core component?
    Actually, if it's a core attribute, does it mean it's not optional?
    I'm not sure there is a way, even with a PR, to guarantee these data will be up-to-date.
    But maybe that's ok?
     
    Optional attributes in core are tricky, IMHO. It means you do not need to introduce the attribute yourself if you do not want to. But if present, it would need to be processed by agents who modify the segment. If it is conceivable that change agents do not update it,
    it feels more like a module...
     
    [Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a “change track” module, then user agents would need to preserve it if present, but as for any other processing requirements,
    such as updating it, that could be specified as part of the module’s processing requirements. For example: The module MUST be preserved and SHOULD be updated by user agents.


    cheers,
    -yves



     
     
     
     
     



    ---  Begin Message  --- From : "Dr. David Filip" <David.Filip@ul.ie> To : Ryan King <ryanki@microsoft.com> Date : Fri, 16 Nov 2012 10:33:58 +0000 Ryan, Yves, you are right that there is need for definitions and feature/element specific PRs. We must not rely on the tribal knowledge.. As the spec becomes mainstream it must be accessible to "naive implementers" Ryan, please feel free start proposing them, this is elaboration of features, and should not be affected by the feature freeze IMHO. I mean MSFT is one of the supporters of the mda module, so should have an idea what to do with that, and of course also the feature owner Bryan :-). But same as Yves, I think that metadata is by design broad and abstract, so that best practices or PEs rather than PRs would do the trick here. The caveat is however is that a PE cannot override the general PR that says MUST preserve. Ther merit of mda is that people can stack key value pairs without toying with namespaces. A kind of kitchen sink approach, but at least you know where to find them and can be sure that they survive. If you have an idea of specific metadata type with specific handling, it should be pushed as namespace based module at this or rather a later stage. The general PRs are a principle and fallback in case of lacking specific PRs, this fallback is generally intended for non-core elements. Core features are underspcified if they need to fall back on the general principle. Support means to read, understand, process, and write (including delete if this is the result of the intended processing) To read you must know the syntax, to understand there must be definitions (and not a mere tribal knowledge), to process, there must be PRs, and maybe PEs, in case we are not sure if we can normatively prescribe the behavior. But PEs can never define change/update behavior because they have not the normative power to override the general PRs. 
Anyway, according the general PRs, nothing can be changed, unless there is a specific PR that tells you when and how you are allowed to change. I see the danger of overloading note as it is a core element. But there will be little value in preservation that is also guaranteed for module. In case of note, I think the no-change behavior seems valid, as comment seems to imply read-only behavior, except for producer and maybe stripping for merge, unless it carries something you need for merge (:-))), right? People should go for gls and mda, as these should allow for more varied handling. gls module should be enforced for native terminology transfer.. And we should explicitly forbid (via definitions and PRs) misuse of mda and notes for transferring actual resources (definition of metadta vs. payload, vs. resource, as a type of payload). For that we have matches and gls. Although matches and gls are not in core, it should be absolutely clear that no other native method for transfering termonology and segment level matches is allowed, INCLUDING copying matches or source content into target. XLIFF enrichers of all sorts should play with matches, and gls, as target is for translation editors and must not be overloaded. We saw how target only is not enough for managing translation suggestions lifecycle in so many cases, prominently po files AND GTT. Rgds dF Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone: +353-6120-2781 cellphone: +353-86-0222-158 facsimile: +353-6120-2734 mailto: david.filip@ul.ie On Thu, Nov 15, 2012 at 8:12 PM, Ryan King < ryanki@microsoft.com > wrote: Thanks Yves, I agree with all of your comments. I take your point about the use of “editor”. That was used as an illustrative example only, and the concerns raised are valid for any user agent. I also agree with what you said about “displaying” metadata, and that’s why the wording was “The most we can hope for” J . 
In the case of displaying notes, however, in the context of an editor, my natural reaction was to assume that they would be displayed, but this is exactly part of the problem: I’m not sure what “supported in core” means for elements, like notes, for which there isn’t already loc industry tribal knowledge about how to handle them, as there is for “translation unit source and target” and “Match candidates”, etc.   Ryan     From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Yves Savourel Sent: Thursday, November 15, 2012 6:13 AM To: xliff@lists.oasis-open.org Subject: RE: [xliff] [xlif] 2.0 Processing instructions for differentiation   Hi Ryan, all,   In addition to adding processing requirements there may be a need for additional definitions. For example, what does the specification call “note”, “metadata”, etc.? As you noted, the descriptions are often short and assume some known terms.   More examples could also help in illustrating the intended usage of those elements. That cannot replace something normative obviously, but it may help the users in the expected direction.   Some of the issues you raise seem also related to the lack of expectations on what an editor (as opposed to any user agent) has to do with the elements. For example that notes should be displayed. I’m not sure that is something we can always address. Things like metadata, by nature, are rather generic and it would be difficult to expect tools to have an expected behavior (e.g. displaying them) that works with all of them. In addition, we can’t really have requirements not directly related to the file format (input/output) as it’s not enforceable.   This said, we indeed have no explicit information about what one can do with something like a note. Technically it means we have to fall back to the general PR, which is “preserve” the element. So one could read the specification in a way that says: once something is in the file it cannot be changed.   
In general, I agree with your advice to go through the specification and comparing it to a tool’s use of 1.2. It would also be really nice to have test implementations that actually verify that all those processing requirements are doable. This is the one thing I have a hard time to go along with: How could the TC stamp a specification without having several working implementations that demonstrate it works. It’s like delivering a software without running any test on it.   cheers, -yves     From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Wednesday, November 14, 2012 9:17 PM To: xliff@lists.oasis-open.org Subject: [xliff] [xlif] 2.0 Processing instructions for differentiation   In anticipation of closing down on 2.0, we recently went through an exercise where we mapped our 1.2 implementation (and challenges) to 2.0 to see if there were any gaps or issues. This exercise led to some usage observations and questions on the current spec, some new proposals for core and existing modules, and some proposals for new modules. We’ll tackle the usage observations and questions in this mail and send follow-up emails for the new proposals.   First of all, we would highly recommend that folks in the TC go through the same exercise to see if there are gaps or issues you identify in your tools and processes. And now on to our observations and questions…   With the exception of inline content, the spec seems to be slim on processing instructions for core elements/attributes and modules. Without a good set of processing instructions to differentiate usage between and among core elements/attributes and modules, an OM or an editor could treat them very differently from one implementation to another.   As an example, what is the fundamental difference between <notes> and <mda:metadata>? Because <notes> are core, they must be supported by an Editor and preserved. 
Because <mda:metadata> is a module, an editor can choose not to support it, but if it does, it must preserve it. 2.2.2.8 states that <notes> contains a “Collection of Comments”. So, if we have metadata that we want supported for sure, say the string category to display to the user, chances are we won’t do this:   <mda:metadata>   <meta type="string-category">Text Box</meta> </mda:metadata>   Instead, we’ll consider string category to be just a comment, which perhaps our extractor has derived from metadata in the original file and automatically added to our XLIFF, and do this:   <notes>   <note id="string-category">Text Box</note> </notes>   Which raises the question: what is to stop a user from never using <mda:metadata> and overloading <notes> with their definition of what a “comment” is to make sure their metadata gets supported?   For another example, what is the fundamental difference between <mda:metadata>, <mtc:match>, and <gls:glossary>? Without processing instructions to give my editor some idea of what to do with each of these, <mtc:match> and <gls:glossary> could just be alternate methods for storing specific types of metadata. Now, you could argue that there are inherent processing instructions in <mtc:matches> because by its nature, it is data that everyone knows should be used as translation candidates and similarity and type provide some clues to judge its reuse (so hopefully tool developers will support it and get it right unlike <alt-trans>). 
But how do different editors consistently treat this…   <gls:glossary>   <glossentry>     <term>hello</term>     <definition>A nice greeting.</definition>     <translation>Hola</translation>   </glossentry> </gls:glossary>   …differently from this…   <mda:metadata>   <meta type="term">hello</meta>   <meta type="definition">A nice greeting.</meta>   <meta type="translation">Hola</meta> </mda:metadata>   …or even this…   <notes>   <note id="term">hello</note>   <note id="definition">A nice greeting.</note>   <note id="translation">Hola</note> </notes>   At least in the case of <notes>, we know editors will support it. The most we can hope for in the other two is that editors will display them. And in fact, we’re not even sure what “supported” means with <notes> other than display it because there are no processing instructions defined for it in the spec. Can we edit a <note>? Can we delete a <note>? Can we add a <note>? Can we reorder <note> elements?   So once again, without a good set of processing instructions to differentiate usage, it seems it will be wholly up to an OM or an Editor to decide on their own how to treat them, which may make the specification too broad.   Let us know what your opinions on the matter are.   Thanks, Microsoft Corporation (Ryan King, Kevin O'Donnell, Uwe Stahlschmidt, Alan Michael)   ---  End Message  ---
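
The ambiguity raised in the message above can be made concrete. The following is a minimal Python sketch, assuming well-formed versions of the three snippets: it parses each encoding and shows that all three reduce to the same term/definition/translation data, so nothing in the markup alone tells a tool to treat them differently. The namespace URIs (`urn:example:gls`, `urn:example:mda`) are placeholders invented for the example, not the real XLIFF 2.0 module namespaces, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption), not the real XLIFF 2.0 ones.
GLOSSARY = """<gls:glossary xmlns:gls="urn:example:gls">
  <glossentry>
    <term>hello</term>
    <definition>A nice greeting.</definition>
    <translation>Hola</translation>
  </glossentry>
</gls:glossary>"""

METADATA = """<mda:metadata xmlns:mda="urn:example:mda">
  <meta type="term">hello</meta>
  <meta type="definition">A nice greeting.</meta>
  <meta type="translation">Hola</meta>
</mda:metadata>"""

NOTES = """<notes>
  <note id="term">hello</note>
  <note id="definition">A nice greeting.</note>
  <note id="translation">Hola</note>
</notes>"""


def from_glossary(xml_text):
    # Child element names carry the meaning (term, definition, translation).
    entry = ET.fromstring(xml_text).find("glossentry")
    return {child.tag: child.text for child in entry}


def from_metadata(xml_text):
    # The 'type' attribute of each <meta> carries the meaning.
    return {m.get("type"): m.text for m in ET.fromstring(xml_text).iter("meta")}


def from_notes(xml_text):
    # The 'id' attribute of each <note> is overloaded to carry the meaning.
    return {n.get("id"): n.text for n in ET.fromstring(xml_text).iter("note")}


if __name__ == "__main__":
    for reader, text in ((from_glossary, GLOSSARY),
                         (from_metadata, METADATA),
                         (from_notes, NOTES)):
        print(reader.__name__, reader(text))
```

All three readers return the same dictionary, which is the point of the question: only definitions and processing requirements in the specification, not the structure of the data, can distinguish a glossary entry from generic metadata or from a comment.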


  • 14.  RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)

    Posted 01-18-2013 22:50
    Hi David, Fredrik and ALL,   Although we discussed the following changes to <notes> in this thread:   <notes>   <note id="1" priority="1" category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z" any:custom="blah">Don’t localize Windows</note> </notes>   there was an earlier discussion about proposed change track information on <source> and <target>, where Yves suggested a Change Track module could be used for those instead, as well as for notes. Therefore, I would like to suggest we remove origin and datetime from the core <note> element and move them to the Change Track module for consistency. That gives us:   <notes>   <note id="1" priority="1" category="instruction" any:custom="blah">Don’t localize Windows</note> </notes>   Please let me know your thoughts.   I will send out the Change Track module proposal separately for discussion.   ryan       From: Dr. David Filip [mailto:David.Filip@ul.ie] Sent: Tuesday, December 11, 2012 4:01 PM To: Ryan King Cc: xliff@lists.oasis-open.org; Estreen, Fredrik Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   +1 as proposed now Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone: +353-6120-2781 cellphone: +353-86-0222-158 facsimile: +353-6120-2734 mailto: david.filip@ul.ie On Tue, Dec 11, 2012 at 11:16 PM, Ryan King < ryanki@microsoft.com > wrote: Thanks David, then I think we (you, Fredrik, and I) are in agreement (adding priority is OK) and I support the extensibility of note as well. Given that, is there anyone on the list who disagrees with this proposal?   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Tuesday, December 11, 2012 2:07 PM To: Ryan King Cc: Estreen, Fredrik; Rodolfo M. 
Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Ryan, I support adding the core attributes as proposed, plus perhaps the priority [1-10] from Fredrik's example. But I also reiterate the request for note to be extensible. This seemed to have Fredrik's support on this thread. I think that note and inline markers should be extensible as part of the generic annotations design that will allow development of annotation modules, such as ITS mapping. Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie   On Tue, Dec 11, 2012 at 6:24 PM, Ryan King < ryanki@microsoft.com > wrote: Do we have consensus on this proposal? E.g. adding category, origin, and datetime (or timestamp) attributes to <note>?   Thanks, ryan   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Monday, December 3, 2012 11:59 AM To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   >> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good. Agreed. And as stated on another thread… we suggest the list of additional and optional attributes to be origin, category, datetime.   <notes>   <note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don’t localize Windows</note> </notes>   Thanks, ryan   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Friday, November 30, 2012 5:00 PM To: Estreen, Fredrik Cc: Ryan King; Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)   Thanks for outlining the options, Fredrik,   I would be personally OK with note being just extensible. 
The ITS categories would allow specifying pretty much everything that you would need. First as an extension, which should later turn into a module using the same mechanism. On the other hand having a minimum set for interoperability for ITS unaware tools sounds good. And as Fredrik pointed out ITS note can be easily mapped on these, so not an issue from here. Even with the minimum set of core attributes, I still think it should be extensible, to allow for unforeseen types of annotations.   The only danger is of creating unnecessary clutter if the adoption is minimal; it is hard to say what the adoption will be.   Cheers dF     Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie   On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Ryan, David,   How it would look depends on whether we add one or more standard attributes to the <note> element or rely solely on third party extensions. First, an example of one of the notes in your original sample and one showing a potential use of David’s ITS mapping case.   <notes>   <note id="n1" ms:noteOrigin="developer" ms:notePriority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>   <note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note> </notes>   It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which, if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date: * origin / author – indicate source of the note * priority – indicate relative importance of a note. Must have strict simple definition. For example, an integer where lower is more important than higher. 
* type / category – indicate what type / aspect of the data or process the note applies to or annotates. * date – creation or modification date. Which of these it is should be specified.   The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that it adds complexity to the standard which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that in this area.   If we hypothetically assume we add origin and priority to the core, the above example could look like the one below, assuming the same mapping for ITS is used as the one proposed for mapping to XLIFF 1.2 (‘alert’=>1, ‘description’=> 2+) and stored in “priority”.   <notes>   <note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>   <note id="n2" priority="1">Make sure to adapt date format when localizing</note> </notes>   Regarding the naming of potential core / module attributes I would prefer to use “category” instead of “type” as the former does not convey the level of functional meaning that the latter does for me. It is more ‘just metadata’.   Regards, Fredrik Estreen   From: Ryan King [mailto: ryanki@microsoft.com ] Sent: 28 November 2012 21:55 To: Dr. David Filip; Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   David or Fredrik, can you give us an XLIFF example of how that would look?   
From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Wednesday, November 28, 2012 12:50 PM To: Estreen, Fredrik Cc: Rodolfo M. Raya; xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Fredrik, all, same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way. Annotations are perfect extension points in general, and besides we need the extensibility here for the ITS mapping.   Cheers dF Dr. David Filip ======================= LRC CNGL LT-Web CSIS University of Limerick, Ireland telephone:  +353-6120-2781 cellphone: +353-86-0222-158 facsimile:  +353-6120-2734 mailto: david.filip@ul.ie   On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik < Fredrik.Estreen@lionbridge.com > wrote: Hi Rodolfo, Ryan,   I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the document, where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.   I think the initial model is much easier to work with and cleaner, as it contains all note related information in one sub tree per document level where notes are allowed. Adding attributes to the <note> element is in my opinion the best way to go. Whether we should have more standard attributes or whether a processor is free to use the third party namespace extension mechanism to add them is another question. 
Depending on how simple we want to keep the basic notes feature it could be either or a mix of the two methods.   Although I’m not a fan of the third party extensions I think this is a case where they could make sense. And if used for process specific metadata only I don’t see an issue. Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.   Regards, Fredrik Estreen   From: xliff@lists.oasis-open.org [mailto: xliff@lists.oasis-open.org ] On Behalf Of Rodolfo M. Raya Sent: 28 November 2012 09:32 To: xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Still a bad use case that doesn’t justify ruining a good design.   Regards, Rodolfo -- Rodolfo M. Raya       rmraya@maxprograms.com Maxprograms       http://www.maxprograms.com   From: Ryan King [ mailto:ryanki@microsoft.com ] Sent: Wednesday, November 28, 2012 5:32 AM To: Rodolfo M. Raya; < xliff@lists.oasis-open.org >; Yves Savourel Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   So that our original reason for proposing having more than one <notes> at the extension point does not get obfuscated in all of the replies and “see inlines”, here, once again, is the use case for adding more than one <notes> per extension:   Proposal 4: Add an optional name attribute on <notes> in core and <mda:metadata> module. We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together, such as in an editor UI. 
Here are some examples:   <notes name="comments">   <note id="comment">This string cannot be longer than 100 characters</note>   <note id="origin">developer</note>   <note id="priority">1</note> </notes>   <notes name="instructions">   <note id="instruction">Do not localize the product name</note>   <note id="origin">loc-engineer</note>   <note id="priority">2</note> </notes>   As opposed to something less structured and more difficult to process:   <notes>   <note id="instruction">Do not localize the product name</note>   <note id="instruction-origin">loc-engineer</note>   <note id="instructions-priority">1</note>   <note id="comment">This string cannot be longer than 100 characters</note>   <note id="comment-priority">2</note> </notes>   Thanks, Ryan   From: Rodolfo M. Raya [ mailto:rmraya@maxprograms.com ] Sent: Tuesday, November 27, 2012 5:41 PM To: Ryan King Cc: Yves Savourel; < xliff@lists.oasis-open.org > Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Please don't ruin the design for <notes>. Only one should be allowed per insertion point.   Regards, Rodolfo Sent from my iPad On Nov 27, 2012, at 9:45 PM, "Ryan King" < ryanki@microsoft.com > wrote: Hi Yves, in last week’s TC call it was mentioned that I should work with the owners of the current features to get our requirements implemented for proposals that weren’t deemed as features. I believe you are the owner for the matches module and notes. Can you please let me know what we need to do to move forward with getting these implemented?   ·          Proposal 2: Be able to specify optional custom values for match type in <mtc:matches> ·          Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)   Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? 
Since the module is already defined in the 2.0 spec, can I just suggest the method and if you agree, you can fold it into the current module definition? I would propose:   1.       That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data. 2.       Add an optional attribute reference="yes|no" with no as default. Additionally, the PR for a “reference match” would be to allow an xml:lang on the target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. a Spanish reference for Quechua might look like this:   <mtc:matches>   <mtc:match reference="yes">    <segment>     <target xml:lang="es-es">hola mundo</target>    </segment>   </mtc:match> </mtc:matches>       I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that if false.   Please let me know how I can work with you on these. Ryan   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Ryan King Sent: Friday, November 16, 2012 5:02 PM To: Dr. David Filip; Yves Savourel; xliff@lists.oasis-open.org Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals   Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the wiki.   From: xliff@lists.oasis-open.org [ mailto:xliff@lists.oasis-open.org ] On Behalf Of Dr. David Filip Sent: Thursday, November 15, 2012 5:24 PM To: Yves Savourel Cc: xliff@lists.oasis-open.org Subject: Re: [xliff] 1.2 to 2.0 Gaps and Proposals   Yves, Ryan et al.   Commenting inline. Cheers dF On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel < ysavourel@enlaso.com > wrote: Hi Ryan, all, > Proposal 1: Add an optional build attribute to 2.0 <file> element in core. > .. 
> <file id=”1” original=”mainUI.resx” build="2011-11-23-133615307_windc.win8.beta.b01"> I don't see anything wrong with this.   > Proposal 2: Be able to specify optional custom values for match type > attribute in the <mtc:matches> module. > Content providers and Localization Suppliers base their cost and billing > models on match similarity and match types. Localization suppliers charge > us differently for ICE Matches, Exact Matches, and Fuzzy Matches, and we > might even want to get more granular than that as our cost and billing models > evolve with the business. > In 2.0, the match type doesn’t support the values exact-match and fuzzy-match, > which were defined in the state-qualifier attribute in 1.2. Instead of supporting > these two, or any others that may not have migrated from 1.2 to 2.0, > as a separate attribute, the request is, that like the discussion on state > and sub-state in the Face-to-Face in Seattle, we add a sub-type to match type. > This will allow us to add extra business logic to types, such as "tm" or "mt", > which are already defined in the spec. > <match id=”1” similarity=”100.0” type=”tm/xlf:exact”> > <match id=”1” similarity=”75.0” type=”tm/xlf:fuzzy”> > <match id=”1” similarity=”99.0” type=”tm/custom:near-exact”> I understand the need for the information, but to me, it seems the similarity give you whether a match is exact or not. The example however, shows (I think) that you are thinking about categories that could be mapped differently to the similarity depending on projects. For example in one project a near-match corresponds to one range and in another to a different range, and you want to simply map that info to something common across your process, without having to carry the ranges around. If that's the case I wonder if XLIFF should define any default like xlf:exact, etc. I believe there is value in decoupling the "percentage" from the "business" type of the match. 
The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I guess we should not open this particular can of worms.   So I wouldn't see a problem with a sub-type there. A side comment on the match type: especially, if we allow sub-type, I'm still not sure about the values currently listed.   [Microsoft] We definitely advocate decoupling the “percentage” from the “business” type of match as David puts it. And we should not prescribe meaning to the percentage, either. Costing models built on top of these values will necessarily change from one provider/supplier to the next and as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):   <match id="1" similarity="100.0" type="tm/xlf:exact">   <match id="1" similarity="100.0" type="ice"> In the first case, we’ve recycled a candidate which is 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ice match, in our case, has the requirement of being 100% and signed off or final. > Proposal 3: Add an optional Reference Language to core. > This is a crucial feature for Microsoft and other large companies that localize > minority languages. For example, it is typical that when we localize from > English into Quechua, localizers are more efficient and provide much higher > quality translation, when along with English source, we provide them with > Spanish target. In 1.2, Reference Languages could be defined in > an <alt-trans> element: I see the use case and I've seen other cases like this, with Chinese (Simplified/Traditional). Could that be part of the match module? Possibly with a new attribute (e.g. reference='yes|no' defaulting to no). Adding something along with <source>/<target> is bound to cause additional PR issues. If it's part of the Match module, it just uses whatever the module PRs are.   
I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria.   [Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for as opposed to recycling. > Proposal 4: Add an optional name attribute on <notes> in core > and <mda:metadata> module. > We believe it will be typical for content providers to want to > ... > <notes name="comments"> >  <note id="comment">This string cannot be longer than 100 characters</note> >  <note id="user">Developer@microsoft.com</note> >  <note id="date">10/21/2012 5:28:13 PM</note> > </notes> Sounds reasonable. We'll have to allow several <notes> and <mda:metadata> (I think, but I may be wrong, that only one is allowed) on the extension point. The example makes me wonder about the long term life of XLIFF though: likely this type of info (author, timestamp) will be needed by others. Maybe a better way to address it would be to add attributes to the note and meta that carry the author and time stamp? That would obviously work only if those two pieces of info are the only examples you have in mind.   I agree with Yves that a couple of standard attributes should be added to increase interoperability, still I believe that note should be fully extendable, as it is part of the general annotation mechanism and should be able to carry attributes from other namespaces.   [Microsoft] Capturing an author and timestamp on a comment is specific to our needs and thus that example. However, we do see value in being able to apply an author and timestamp on potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.   > Proposal 5: Add optional change tracking attributes to <segment>. > ... 
> <segment id="1" modifiedBy="translator@loc.com" > modifiedDate="10/21/2012 5:28:13 PM"> >    <source>hello world</source> >    <target>hola món</target> > </segment> Here again I'm wondering if a "change track" module may be better? You could use it not just on segments but other elements: notes. The issue then would be how this gets updated if it's not a core component? Actually if it's a core attribute, does it mean it's not optional? I'm not sure there is a way, even with a PR, to guarantee these data will be up-to-date. But maybe that's ok?   Optional attributes in core are tricky, IMHO. It means you do not need to introduce it yourself, if you do not feel like it. But if present it would need to be processed by agents who modify the segment. If it is thinkable that change agents do not update it, it feels more like a module...   [Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a “change track” module, then user agents would need to preserve it if present, but as for any other processing requirements, such as updating it, that could be specified as part of the module’s processing requirements. For example: The module MUST be preserved and SHOULD be updated by user agents. cheers, -yves
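
Proposal 2 in the message above suggests sub-typed match values such as tm/xlf:exact and tm/custom:near-exact next to plain values such as ice, with the business category kept separate from the similarity percentage. The following is a minimal Python sketch of how a tool might split such a value, assuming the type/prefix:name syntax shown in the thread's examples. That syntax was only a proposal under discussion, and the function name `split_match_type` is hypothetical.

```python
def split_match_type(value):
    """Split a proposed match type value into (type, prefix, sub_type).

    'tm/xlf:exact'  -> ('tm', 'xlf', 'exact')
    'ice'           -> ('ice', None, None)
    The 'type/prefix:name' syntax is an assumption taken from the
    examples in the thread, not from a published specification.
    """
    base, slash, sub = value.partition("/")
    if not slash:
        # No sub-type at all, e.g. "tm", "mt" or "ice".
        return base, None, None
    prefix, colon, name = sub.partition(":")
    if not colon:
        # Sub-type given without an authority prefix.
        return base, None, prefix
    return base, prefix, name


if __name__ == "__main__":
    for example in ("tm/xlf:exact", "tm/custom:near-exact", "ice"):
        print(example, "->", split_match_type(example))
```

The similarity value is deliberately not an input here: as discussed above, two matches can both carry similarity 100.0 and still fall into different business categories, so a costing model would key on the parsed type and sub-type rather than derive them from the percentage.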