OASIS XML Localisation Interchange File Format (XLIFF) TC

  • 1.  Resolution of your public review comments: 030, 031, 033, 034

    Posted 08-30-2013 19:54
    Chase

    I've been scanning our comment list to be sure that we've closed the loop on all of the very useful comments we received from you. I can see that members of the TC have been in fruitful discussion about your comments, questions, and suggestions. However, there are a few items that we've resolved where I'm not sure we've communicated our resolution to you. So, to be sure we close all of the issues you raised in our XLIFF 2.0 first public review, I want to take a moment to address those for which I could not find what I would consider a complete communication back to you. Please forgive me if any of this is redundant (you may already have received a reply that I simply didn't see).

    I will use the comment numbers we've documented in our wiki tracker: https://wiki.oasis-open.org/xliff/XLIFF%202.0%20Public%20Review%20submitted%20comments%20tracker

    030: You had three comments on the basic structure. Part 1: we clarified the <file> and <group> granularity (http://markmail.org/thread/7eyc7565iqhhv3mu). Part 2: on the confusion about <ignorable>, Ryan expressed the same opinion, and the TC resolved it with a better definition (https://lists.oasis-open.org/archives/xliff-comment/201307/msg00016.html). Part 3: on the use of @fs on notes, I have already communicated the TC's consensus (https://lists.oasis-open.org/archives/xliff-comment/201308/msg00010.html).


    031: You had three questions around the SLR module. We spent a good deal of time looking at them and came to consensus on naming and on improved processing requirements (https://www.oasis-open.org/apps/org/workgroup/xliff/email/archives/201308/msg00090.html).

    033: You raised several issues around the Validation module that the TC spent a lot of time grappling with. In the end we improved this module and came to consensus about how to resolve your issues (and the similar issues, 057 and 061). They are summarized by Ryan and company (https://lists.oasis-open.org/archives/xliff/201307/msg00099.html).

    034: You asked questions about the Matches module. This led to several rounds of debate in the TC, but we ultimately agreed to make id on matches compulsory, clarify the suitability attributes, introduce pointing from inline annotations, and remove mentions of <segment> (as a result of the resegmentation changes). A high-level summary is here (http://markmail.org/thread/2mqkks4n6aftodjm).

    I know this is a bit scattered. I hope it communicates the direction the TC feels is best, based on your very much appreciated inputs.

    Thanks again for taking the time to help us to put out the best XLIFF 2.0 we can. Please continue to call on us with any questions, concerns, or feedback.

    Thanks,

    Bryan




  • 2.  RE: Resolution of your public review comments: 030, 031, 033, 034

    Posted 08-31-2013 16:45
    Hi Bryan,

    Regarding point 033 below, I've just read the Validation module as published with datestamp August 20, 2013, and I still seem to have a few doubts regarding the use cases of the validation module:

    1. As described in the spec, attributes like isPresent, isNotPresent, startsWith, and endsWith take a substring. Are these substrings case-sensitive? If so, how can they be made case-insensitive? If not, how can they be made case-sensitive?

    2. Examples of usage in the spec for isPresent show these substrings matching whole words, but that is a coincidence. I understand from the spec that if you have <rule isPresent="iphone"> and <target> contains "triphone", the target segment passes the validation rule. Is it correct to understand that, despite the examples showing whole words, the isPresent check must not treat the substring as a whole word?

    3. I am under the impression that regular expression-like rules might be needed in several production scenarios. However, it is not fully clear to me how one can use custom rules to extend the validation rules syntax with, for example, one supporting a RegEx-like grammar. For custom rules, in the list of constraints, the spec reads "a custom rule defined by attributes from any namespace is REQUIRED in any one <rule> element.". Could you please provide an example of how one could define custom rules with an alternative (RegEx-like) grammar in a way that is still compliant with the validation module?
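
    Questions 1 and 2 above can be made concrete with a minimal Python sketch. It assumes the attributes are plain substring tests, which is how the draft reads; the function names and the `case_sensitive` switch are hypothetical illustrations and are not defined by the August 20 draft.

```python
def is_present(target: str, substring: str, case_sensitive: bool = True) -> bool:
    """Plain substring test: there is no notion of word boundaries."""
    if not case_sensitive:
        # casefold() is intended for caseless comparison
        target, substring = target.casefold(), substring.casefold()
    return substring in target


def starts_with(target: str, substring: str, case_sensitive: bool = True) -> bool:
    """Same idea, anchored at the start of the target text."""
    if not case_sensitive:
        target, substring = target.casefold(), substring.casefold()
    return target.startswith(substring)


def ends_with(target: str, substring: str, case_sensitive: bool = True) -> bool:
    """Same idea, anchored at the end of the target text."""
    if not case_sensitive:
        target, substring = target.casefold(), substring.casefold()
    return target.endswith(substring)


if __name__ == "__main__":
    # Question 2: "iphone" is found inside "triphone", so the rule passes.
    print(is_present("Order the triphone kit", "iphone"))  # True
    # Question 1: a case-sensitive test misses "iPhone".
    print(is_present("Order the iPhone kit", "iphone"))  # False
    print(is_present("Order the iPhone kit", "iphone", case_sensitive=False))  # True
```

    Under these assumptions, a rule such as isPresent="iphone" cannot distinguish the intended term from an unrelated word that merely contains it.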

    Regards,
    Josep.




  • 3.  RE: Resolution of your public review comments: 030, 031, 033, 034

    Posted 09-02-2013 03:41
    Hi Josep,

    Thank you very much for providing the very clear and very useful feedback.

    We will consider these points and get back to you as soon as we finish our analysis.

    I will ask Ryan to take a lead on this, and to reply to you as we arrive at a conclusion.

    Thank you,

    Bryan




  • 4.  Re: [xliff-comment] RE: Resolution of your public review comments: 030, 031, 033, 034

    Posted 09-02-2013 10:15
    Ryan, all,

    I suggest that we address Josep's feedback by adding an optional
    "caseSensitive" flag with default "yes".


    The whole-word-vs-substring case seems a candidate for a custom extension
    regex-based rule, but is also partially addressed by the case sensitivity
    IMHO, so I would not try to introduce a "wholeWord" flag to address it. It
    would feel very ad hoc and would smuggle a need for regex support into the
    basic module implementation.

    If this seems OK with everyone and Ryan can implement this today, I can
    print it in time for the meeting tomorrow and we can OK the solution online
    before we go for the csd and csprd ballot.

    @Josep, do you think that the above would address your concerns?


    Regarding your third comment, this constraint

    Constraints
    • Exactly one of the following attributes:
        • isPresent
        • isNotPresent
        • startsWith
        • endsWith
        • a custom rule defined by attributes from any namespace
      is REQUIRED in any one <rule> element.


    simply says that any one rule must EITHER use one of the predefined rule
    attributes or contain a custom made rule.

    If you want to do regex-based validation, you can simply define a custom
    rule, including the required regex engine; it would be carried by the same
    <rule> element as the other rules.
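
    As a rough illustration of that reading of the constraint, the Python sketch below checks that each <rule> carries exactly one predefined attribute or one custom rule expressed through namespaced attributes. The `rx` namespace URI, its `matches` attribute, and the helper names are hypothetical; the `val` namespace URI and the wrapper element are assumptions about the module's schema, and Python's `re` stands in for whatever regex engine an extension would name.

```python
import re
import xml.etree.ElementTree as ET

VAL_NS = "urn:oasis:names:tc:xliff:validation:2.0"  # assumed module namespace
EXT_NS = "urn:example:regex-rules"  # hypothetical extension namespace

PREDEFINED = {"isPresent", "isNotPresent", "startsWith", "endsWith"}

SAMPLE = f"""
<val:validation xmlns:val="{VAL_NS}" xmlns:rx="{EXT_NS}">
  <val:rule isPresent="iphone"/>
  <val:rule rx:matches="\\biPhone\\b"/>
  <val:rule isPresent="a" endsWith="b"/>
</val:validation>
"""


def rule_kinds(rule: ET.Element) -> list:
    """List the rule kinds a <rule> carries; all namespaced attributes count as one custom rule."""
    kinds = [name for name in rule.attrib if name in PREDEFINED]
    # ElementTree spells namespaced attribute names as "{uri}local"
    if any(name.startswith("{") for name in rule.attrib):
        kinds.append("custom")
    return kinds


def satisfies_constraint(rule: ET.Element) -> bool:
    """Exactly one predefined attribute or one custom rule per <rule>."""
    return len(rule_kinds(rule)) == 1


def check_custom_regex(rule: ET.Element, target_text: str) -> bool:
    """Evaluate the hypothetical rx:matches attribute against the target text."""
    pattern = rule.get(f"{{{EXT_NS}}}matches")
    return pattern is not None and re.search(pattern, target_text) is not None


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for rule in root.findall(f"{{{VAL_NS}}}rule"):
        print(rule_kinds(rule), satisfies_constraint(rule))
```

    In this sketch the third rule fails the constraint because it combines two predefined attributes, while the second rule passes because its only rule kind is the custom one.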

    Rgds
    dF





    David Filip, Ph.D.
    =====================
    *cellphone: +353-86-0222-158*
    mailto:davidf@davidf.org
    www.davidf.org, http://www.linkedin.com/in/davidfatdavidf






  • 6.  RE: [xliff-comment] RE: Resolution of your public review comments: 030, 031, 033, 034

    Posted 09-02-2013 11:42
    Hi David,

    Thank you for your clarifications.

    Regarding keeping the module basic as a design goal: while it is obviously a good goal, I brought it up because the module did not seem to have enough features to support quite a few real-life scenarios. In other words, I believe that by being basic, it risks not reaching a usefulness threshold that can make it or break it.

    If it is basic but not perceived as useful enough by users, it is true that users have the option of not using it (no apparent harm), but implementors will still have to invest time supporting it if they want to be compliant with XLIFF 2.0. Granted, it's not too hard to implement support for it as currently defined, but if users are reluctant to use it because it does not reach a certain threshold of usefulness, I think it hinders implementability because the likelihood of user adoption is potentially low (from my point of view).

    Since with ApSIC Xbench (mostly seen as a QA tool and therefore related to validation) we've seen an evolution in what users have asked for in validation, I'd like to share our experience over the years with you, in case it is useful to define/decide what the basic implementation threshold for an XLIFF 2.0 validation module must be.

    Stage 1. At the very beginning, Xbench only searched for terminology in substring mode, in either source or target. Mandatory features at this very early stage of evolution were "case sensitive" and "whole word". Actually, this was before the concept of QA was introduced in Xbench. At first we were about terminology in its context and only terminology, but even then "case sensitive" and "whole word" were a must, or you did not have enough specification precision for shorter strings.

    Stage 2. When a few years later we introduced QA as a feature in Xbench, we added a conditional layer we called "PowerSearch". It simply added the "AND", "OR" and "NOT" operators to the search, with parentheses as a means to override priority. We opted to use the Google search approach for that (that is, operator "AND" is implicit, "OR" is "or", and "NOT" is a minus sign in front of a string), but any other AND/OR/NOT approach would work.

    Stage 3. When QA as an Xbench feature became more important as users got more serious about automated validation, we had to introduce regular expressions and variables. And even at this stage, there are still a few features that we will implement in the future.
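
    The features described in Stages 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the idea under simplifying assumptions, not Xbench's actual PowerSearch: the function names are hypothetical, parentheses are not supported, and "or" is assumed to bind tighter than the implicit AND.

```python
import re


def term_matches(text, term, case_sensitive=True, whole_word=False):
    """Stage 1: literal search with optional case and whole-word switches."""
    pattern = re.escape(term)
    if whole_word:
        # \b only behaves as expected when the term starts and ends with word characters
        pattern = rf"\b{pattern}\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(pattern, text, flags) is not None


def parse_query(query):
    """Split a query into AND-ed groups; each group is a list of OR-ed (negated, term) pairs."""
    groups = []
    join_next = False
    for token in query.split():
        if token.lower() == "or" and groups:
            join_next = True  # the next term joins the previous group
            continue
        negated = token.startswith("-") and len(token) > 1
        item = (negated, token[1:] if negated else token)
        if join_next:
            groups[-1].append(item)
            join_next = False
        else:
            groups.append([item])
    return groups


def power_search(text, query, **options):
    """Stage 2: implicit AND between groups, 'or' inside a group, leading '-' for NOT."""
    return all(
        any(term_matches(text, term, **options) != negated for negated, term in group)
        for group in parse_query(query)
    )


if __name__ == "__main__":
    print(term_matches("triphone", "iphone"))  # True
    print(term_matches("triphone", "iphone", whole_word=True))  # False
    print(power_search("Click the Save button", "Save -Cancel"))  # True
    print(power_search("Click Save or Cancel", "Save -Cancel"))  # False
```

    Even this small conditional layer lets a check say "Save must appear unless Cancel also appears", which a single substring attribute cannot express.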

    Based on feedback from our users, I have the impression that anything that does not have at least the features of both 1. and 2. above does not reach the threshold of being useful enough for wide implementation, so my recommendation would be to put at least all these features in the most basic approach. I believe that without a conditional layer the module will be too prone to false alarms (and false alarms are a turn-off for user adoption).

    If, after discussing it in the TC, you opt not to go beyond 1. for a basic approach, I suggest that you do not skip the "whole word" implementation, as I feel it can be important for basic precision.

    I recognize there are many open fronts in the XLIFF definition and that there are time and resource limitations, and I acknowledge that I may be prone to overrate the importance of automated QA (because we frequently get user feedback on this topic). Still, I hope it is helpful as input for your decisions on how to implement the basic form of the validation module.

    Regards,
    Josep.





  • 7.  Re: [xliff-comment] RE: Resolution of your public review comments: 030, 031, 033, 034

    Posted 09-02-2013 12:14
    Thanks Josep,

    this is very useful. I can see where you are coming from and I don't
    disagree.
    The logician and computer scientist in me would love to have the features
    of 1 and 2 in there.

    The secretary and editor wants the spec to progress, so that is why I
    suggested the one simple fix for XLIFF 2.0.

    Re regex, this is tricky, and I am convinced that this is stuff for
    extensions really, at least at the current state of the art.

    2013 has seen very robust discussion on the use of regex in localization
    standards, not only XLIFF but also ITS 2.0 and ETSI LIS (as I hear from
    its Chair Joachim Schurig).
    IMHO the minimum interoperable regex subset is yet to be defined, and I am
    afraid it can take many months if not years.

    Some sort of regex support may make it into XLIFF 2.x (hopefully 2.1) with
    the planned ITS module. For now I believe that regex and logical operators
    need to be defined as user extensions.

    This is not a bad thing per se.

    Modules are optional (by definition), and XLIFF 2.0 was designed in a
    modular way to allow for features to have a life cycle and to allow for a
    relatively quick production of 2.x versions.

    If a user extension is useful and widespread, it can initiate creation of a
    module, enhance a module, or even get into the core. And vice versa: core
    stuff that proved not so useful can be moved into optional modules, and
    modules that are not useful can be deprecated or superseded in 2.x versions
    without impacting the core interoperability.

    Finally, the basic design of the current validation module is driven
    mainly by its main proponent, Microsoft, and is designed to cater for their
    production needs. The design has been OKed by other TC members, including
    corporations, service providers and tool makers, so I would not be too
    worried that it is too basic to reach a certain usefulness threshold.

    Anyway, it would be great if you were interested in joining the TC to
    work on a 2.x version of the validation module :-)


    Cheers
    dF









    David Filip, Ph.D.
    =====================
    *cellphone: +353-86-0222-158*
    mailto:davidf@davidf.org
    www.davidf.org, http://www.linkedin.com/in/davidfatdavidf




  • 8.  Re: [xliff-comment] RE: Resolution of your public review comments: 030, 031, 033, 034

    Posted 09-02-2013 12:15
    Thanks Josep,

    this is very useful. I can see where you are coming from and I don't disagree. The logician and computer scientist in me would love to have the features from your 1. and 2. in there. The secretary and editor wants the spec to progress, which is why I suggested the one simple fix for XLIFF 2.0.

    Re regex: this is tricky, and I am convinced that this is really stuff for extensions, at least at the current state of the art. 2013 has seen very robust discussion on the use of regex in localization standards, not only XLIFF but also ITS 2.0 and ETSI LIS (as I hear from its Chair, Joachim Schurig). IMHO the minimum interoperable regex subset is yet to be defined, and I am afraid it can take many months if not years. Some sort of regex support may make it into XLIFF 2.x (hopefully 2.1) with the planned ITS module.

    For now I believe that regex and logical operators need to be defined as user extensions. This is not a bad thing per se. Modules are optional (by definition), and XLIFF 2.0 was designed in a modular way to allow features to have a life cycle and to allow for relatively quick production of 2.x versions. If a user extension is useful and widespread, it can initiate the creation of a module, enhance a module, or even get into the core. And vice versa: core features that prove not so useful can be moved into optional modules, and modules that are not useful can be deprecated or superseded in 2.x versions without impacting core interoperability.

    Finally, the current validation module, basic in design, is driven mainly by its main proponent, Microsoft, and is designed to cater for their production needs. The design has been OKed by other TC members, including corporations, service providers and tool makers, so I would not be too worried that it is too basic to reach a certain usefulness threshold.

    Anyway, it would be great if you were interested in joining the TC to work on a 2.x version of the validation module :-)

    Cheers
    dF

    David Filip, Ph.D.
    =====================
    cellphone: +353-86-0222-158
    mailto:davidf@davidf.org
    www.davidf.org, http://www.linkedin.com/in/davidfatdavidf

    On Mon, Sep 2, 2013 at 12:42 PM, Josep Condal <pcondal@apsic.com> wrote:

    > Hi David,
    >
    > Thank you for your clarifications.
    >
    > Regarding the module being basic as a design goal: while it is obviously a good goal, I brought it up because the module did not seem to have enough features to support quite a few real-life scenarios. In other words, I believe that by being this basic it risks not reaching the usefulness threshold that can make it or break it.
    >
    > If it is basic but not perceived as useful enough by users, it is true that users have the option of not using it (no apparent harm), but implementors will still have to invest time supporting it if they want to be compliant with XLIFF 2.0. Granted, it's not too hard to implement support for it as currently defined, but if users are reluctant to use it because it does not reach a certain threshold of usefulness, I think it hinders implementability because the likelihood of user adoption is potentially low (from my point of view).
    >
    > Since with ApSIC Xbench (mostly seen as a QA tool and therefore related to validation) we've seen an evolution in what users have asked of validation, I'd like to share our experience over the years in case it is useful to define/decide what the basic implementation threshold for an XLIFF 2.0 validation module must be.
    >
    > Stage 1. At the very first, Xbench only searched for terminology in substring mode, in either source or target. Mandatory features at this very early stage of evolution were "case sensitive" and "whole word". Actually, this was before the concept of QA was introduced in Xbench. At first we were about terminology in its context and only terminology, but even then "case sensitive" and "whole word" were a must, or you did not have enough specification precision for shorter strings.
    >
    > Stage 2. When a few years later we introduced QA as a feature in Xbench, we added a conditional layer we called "PowerSearch". It simply added the "AND", "OR" and "NOT" operators to the search, with parentheses as a means to override priority. We opted to use the Google search approach for that (that is, operator "AND" is implicit, "OR" is "or", and "NOT" is a minus sign in front of a string), but any other AND/OR/NOT approach would work.
    >
    > Stage 3. When QA as an Xbench feature became more important as users got more serious about automated validation, we had to introduce regular expressions and variables. And even at this stage, there are still a few features that we will implement in the future.
    >
    > Per our feedback from users, I have the impression that anything that does not have at least the features of both 1. and 2. above does not reach the threshold of being useful enough for wide implementation, so my recommendation would be to put at least all these features in the most basic approach. I believe that without a conditional layer it will be too prone to false alarms (and false alarms are a turn-off for user adoption).
    >
    > If after discussing it in the TC you opt not to go beyond 1. for a basic approach, I suggest that you do not skip the "whole word" implementation, as I feel it can be important for basic precision.
    >
    > I recognize there are many open fronts in the XLIFF definition and that there are time and resource limitations, and I acknowledge that I may be prone to overrate the importance of automated QA (because we frequently get user feedback on this topic). Still, I hope it is helpful as input for you to make decisions on how to implement the basic form of the validation module.
    >
    > Regards,
    > Josep.
    >
    > From: David Filip [mailto:davidf@davidf.org]
    > Sent: Monday, September 2, 2013 12:15
    > To: Ryan King; xliff@lists.oasis-open.org
    > Cc: Schnabel, Bryan S; xliff-comment@lists.oasis-open.org; Josep Condal
    > Subject: Re: [xliff-comment] RE: Resolution of your public review comments: 030, 031, 033, 034
    >
    >> Ryan, all,
    >>
    >> I suggest that we address Josep's feedback with adding an optional "caseSensitive" flag with default "yes"
    >>
    >> The whole-word-vs-substring case seems a candidate for a custom extension regex based rule, but is also partially addressed with the case sensitivity IMHO, so I would not try to introduce a "wholeWord" flag to address it. It would feel very ad hoc and would smuggle a need for regex support into the basic module implementation.
    >>
    >> If this seems OK with everyone and Ryan can implement this today, I can print it in time for the meeting tomorrow and we can OK the solution online before we go for the csd and csprd ballot.
    >>
    >> @Josep, do you think that the above would address your concerns?
    >>
    >> Regarding your third comment,
    >> This constraint
    >>
    >> Constraints
    >> • Exactly one of the following attributes:
    >> • isPresent
    >> • isNotPresent
    >> • startsWith
    >> • endsWith
    >> • a custom rule defined by attributes from any namespace
    >> is REQUIRED in any one <rule> element.
    >>
    >> simply says that any one rule must EITHER use one of the predefined rule attributes or contain a custom made rule.
    >>
    >> If you want to do a regex based validation, you can simply define a custom rule including the required regex engine, it would be carried by the same <rule> element as other rules.
    >>
    >> Rgds
    >> dF
    >>
    >> David Filip, Ph.D.
    >> =====================
    >> cellphone: +353-86-0222-158
    >> mailto:davidf@davidf.org
    >> www.davidf.org, http://www.linkedin.com/in/davidfatdavidf
    >>
    >> On Sat, Aug 31, 2013 at 5:45 PM, Josep Condal <pcondal@apsic.com> wrote:
    >>
    >>> Hi Bryan,
    >>>
    >>> Regarding point 033 below, I've just read the Validation module as published with datestamp August 20, 2013 and I seem to still have a couple of doubts regarding the use cases of the validation module:
    >>>
    >>> 1. As described in the spec, attributes like isPresent, isNotPresent, startsWith, and endsWith take a substring. Are these substrings case-sensitive? If so, how can they be made case-insensitive? If not, how can they be made case-sensitive?
    >>>
    >>> 2. Examples of usage in the spec for isPresent show these substrings matching whole words, but it is a coincidence. I understand by the spec that if you have <rule isPresent="iphone"> and <target> contains "triphone", you pass the validation rule for that target segment. Is it correct to understand that, in spite of examples showing usage with whole words, to check the isPresent validation one must not consider it a whole word?
    >>>
    >>> 3. I am under the impression that regular expression-like rules might be needed in several production scenarios. However, it is not fully clear to me how one can use custom rules to extend the validation rules syntax with, for example, one supporting a RegEx-like grammar. For custom rules, in the list of constraints, the spec reads "a custom rule defined by attributes from any namespace is REQUIRED in any one <rule> element.". Could you please provide an example of how one could define custom rules with an alternative (RegEx-like) grammar in a way that is still compliant with the validation module?
    >>>
    >>> Regards,
    >>> Josep.
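
    For readers following the thread, the custom-rule route described above (a regex rule carried by the same <rule> element) might look roughly like the following Python sketch. It is an illustration only: the namespace urn:example:regex-rules and the attributes rx:engine and rx:pattern are hypothetical names invented for this example, not anything defined by XLIFF or proposed by the TC, and the <rule> element is shown without its module namespace for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Predefined rule attributes named in the quoted constraint.
PREDEFINED = ("isPresent", "isNotPresent", "startsWith", "endsWith")

# Hypothetical extension namespace and attribute names; XLIFF defines neither.
EXT_NS = "urn:example:regex-rules"

RULE_XML = (
    r'<rule xmlns:rx="urn:example:regex-rules" '
    r'rx:engine="python-re" rx:pattern="\biphone\b"/>'
)


def evaluate(rule: ET.Element, target: str) -> bool:
    """Check the target text against a <rule> that carries the custom regex attributes."""
    predefined = [name for name in PREDEFINED if name in rule.attrib]
    # ElementTree stores namespaced attributes as "{namespace}local".
    has_custom = any(name.startswith("{") for name in rule.attrib)
    # Either exactly one predefined attribute or a custom rule, never both.
    if len(predefined) + int(has_custom) != 1:
        raise ValueError("a <rule> needs exactly one predefined attribute or a custom rule")
    if predefined:
        raise NotImplementedError("this sketch only evaluates the custom regex rule")
    pattern = rule.attrib.get(f"{{{EXT_NS}}}pattern")
    if pattern is None:
        raise ValueError("custom rule from a namespace this sketch does not know")
    return re.search(pattern, target) is not None


rule = ET.fromstring(RULE_XML)
print(evaluate(rule, "Buy the iphone today"))  # True: "iphone" stands as a whole word
print(evaluate(rule, "triphone"))              # False: no word boundary before "iphone"
```

    The word-boundary pattern is one way a custom rule could cover the "whole word" case from question 2 without adding a dedicated flag to the basic module; a tool that does not understand the extension namespace would simply be unable to evaluate such a rule.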


  • 9.  Re: [xliff-comment] RE: Resolution of your public review comments: 030, 031, 033, 034

    Posted 09-03-2013 09:16
    Since I did not hear back from Ryan, and since I think that having that
    flag is uncontroversial, I will go ahead and add it for today's meeting.
    I hope that it can be Okayed by the TC before we hopefully finally proceed
    to the csd, csprd vote.

    Rgds
    dF


    David Filip, Ph.D.
    =====================
    *cellphone: +353-86-0222-158*
    mailto:davidf@davidf.org
    www.davidf.org, http://www.linkedin.com/in/davidfatdavidf


    On Mon, Sep 2, 2013 at 11:14 AM, David Filip <davidf@davidf.org> wrote:

    > Ryan, all,
    >
    > I suggest that we address Josep's feedback with adding an optional
    > "caseSensitive" flag with default "yes"
    >
    >
    > The whole-word-vs-substring case seems a candidate for a custom
    > extension regex based rule, but is also partially addressed with the case
    > sensitivity IMHO, so I would not try to introduce a "wholeWord" flag to
    > address it. It would feel very ad hoc and would smuggle a need for regex
    > support into the basic module implementation..
    >
    > If this seems OK with everyone and Ryan can implement this today, I can
    > print it in time for the meeting tomorrow and we can OK the solution online
    > before we go for the csd and csprd ballot.
    >
    > @Josep, do you think that the above would address your concerns?
    >
    >
    > Regarding your third comment,
    > This constraint
    >
    > Constraints
    > • Exactly one of the following attributes:
    > • isPresent
    > • isNotPresent
    > • startsWith
    > • endsWith
    > • a custom rule defined by attributes from any namespace
    > is REQUIRED in any one <rule> element.
    >
    >
    > simply says that any one rule must EITHER use one of the predefined rule
    > attributes or contain a custom made rule.
    >
    > If you want to do a regex based validation, you can simply define a custom
    > rule including the required regex engine, it would be carried by the same
    > <rule> element as other rules..
    >
    > Rgds
    > dF
    >
    >
    >
    >
    >
    > David Filip, Ph.D.
    > =====================
    > *cellphone: +353-86-0222-158*
    > mailto:davidf@davidf.org
    > www.davidf.org, http://www.linkedin.com/in/davidfatdavidf
    >
    >
    > On Sat, Aug 31, 2013 at 5:45 PM, Josep Condal <pcondal@apsic.com> wrote:
    >
    >> Hi Bryan,
    >>
    >> Regarding point 033 below, I've just read the Validation module as
    >> published with datestamp August 20, 2013 and I seem to still have a couple
    >> of doubts regarding the use cases of the validation module:
    >>
    >> 1. As described in the spec, attributes like isPresent, isNotPresent,
    >> startsWith, and endsWith take a substring. Are these substrings
    >> case-sensitive? If so, how can they be made case-insensitive? If not, how
    >> can they be made case-sensitive?
    >>
    >> 2. Examples of usage in the spec for isPresent show these substrings
    >> matching whole words, but it is a coincidence. I understand by the spec
    >> that if you have <rule isPresent="iphone"> and <target> contains
    >> "triphone", you pass the validation rule for that target segment. Is it
    >> correct to understand that in spite of examples showing usage with whole
    >> words, to check the isPresent validation one must not consider it a whole
    >> word?
    >>
    >> 3. I am under the impression that regular expression-like rules might be
    >> needed in several production scenarios. However, it is not fully clear to
    >> me how one can use custom rules to extend the validation rules syntax with,
    >> for example, one supporting a RegEx-like grammar. For custom rules, in the
    >> list of constraints, the spec reads "a custom rule defined by attributes
    >> from any namespace is REQUIRED in any one <rule> element.". Could you
    >> please provide an example of how one could define custom rules with an
    >> alternative (RegEx-like) grammar in a way that is still compliant with the
    >> validation module?
    >>
    >> Regards,
    >> Josep.
    >>
    >>
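
    As an illustration of the questions quoted above, the plain substring reading of the four predefined attributes, together with the proposed optional "caseSensitive" flag (default "yes"), could be sketched in Python as follows. This is a reading of the thread, not the normative processing requirements; the Rule class, its field names, and the passes function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    kind: str                    # isPresent, isNotPresent, startsWith or endsWith
    value: str                   # the substring the rule refers to
    case_sensitive: bool = True  # mirrors the proposed flag with default "yes"


def passes(rule: Rule, target: str) -> bool:
    """Return True when the target text satisfies the rule under substring semantics."""
    text, needle = target, rule.value
    if not rule.case_sensitive:
        # Fold both sides so the comparison ignores case.
        text, needle = text.casefold(), needle.casefold()
    if rule.kind == "isPresent":
        return needle in text
    if rule.kind == "isNotPresent":
        return needle not in text
    if rule.kind == "startsWith":
        return text.startswith(needle)
    if rule.kind == "endsWith":
        return text.endswith(needle)
    raise ValueError(f"unknown rule kind: {rule.kind}")


# Question 2: with substring matching, "triphone" satisfies isPresent="iphone".
print(passes(Rule("isPresent", "iphone"), "triphone"))
# Question 1: the flag decides whether "iPhone" satisfies isPresent="iphone".
print(passes(Rule("isPresent", "iphone"), "iPhone"))
print(passes(Rule("isPresent", "iphone", case_sensitive=False), "iPhone"))
```

    Under these assumptions the first and third calls pass and the second fails, which is the behaviour the thread describes: case sensitivity is covered by the flag, while whole-word matching is left to custom rules.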

