OASIS XML Localisation Interchange File Format (XLIFF) TC

  • 1.  RE: [xliff-comment] Section 2.5.4 Validating Documents with Extensions - needs revision

    Posted 10-25-2006 15:19
    Hi Doug,
     
    I will check the XSD.  
     
    I don't disagree with your recommendation to use strict.  But I think it is worth stating exactly what that means, because it is different from what we discussed when we chose lax in the first place.  
     
    With lax, the parser was obligated to use the schema of the namespaced elements and attributes to validate them, IF it could find the schema.  If it could not find the schema, it checked the elements and attributes only for well-formedness.  
     
    By using strict, the parser MUST find the schema for the namespaced elements and attributes, and the elements and attributes MUST be valid per that schema.  That means that if the schema for the namespaced elements or attributes ever gets lost, the XLIFF file CANNOT be valid.  It also means that if, for example, an XLIFF document uses an extension element that is not declared as a direct child of the schema element (that is, as a global element), the document could also be invalid.
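
A minimal sketch of the distinction drawn above, written as a toy decision function in Python. It models only what this message describes, not the full XML Schema rules. The function name extension_is_valid and its parameters are invented for illustration, and "skip" is included only for comparison.

```python
def extension_is_valid(process_contents, schema_found, conforms, well_formed=True):
    """Toy model: would a conforming parser accept one namespaced extension?

    process_contents: "strict", "lax" or "skip" (the wildcard setting)
    schema_found:     the parser could locate the extension's schema
    conforms:         the extension matches that schema (ignored if not found)
    """
    if not well_formed:
        return False  # broken XML fails before any schema check
    if process_contents == "skip":
        return True  # only well-formedness is checked
    if process_contents == "lax":
        # validate against the schema only when one is available
        return conforms if schema_found else True
    if process_contents == "strict":
        # the schema must be found AND the content must conform to it
        return schema_found and conforms
    raise ValueError(f"unknown processContents value: {process_contents!r}")
```

Under this model, a lost extension schema changes nothing for "lax" but turns every document that uses the extension invalid under "strict".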
     
    This is okay by me, but we should all know that this is a much higher standard than we talked about when we implemented this in the first place.
     
    Thanks,
     
    Bryan
     
    
    ________________________________
    
    From: Doug Domeny [mailto:ddomeny@ektron.com]
    Sent: Wed 10/25/2006 7:07 AM
    To: Schnabel, Bryan S; tony.jewtushenko@productinnovator.com; asgeirf@gmail.com; xliff@lists.oasis-open.org
    Subject: RE: [xliff-comment] Section 2.5.4 Validating Documents with Extensions - needs revision
    
    
    
    Bryan, et al.,
    
    After analysis, I now recommend processContents="strict" for the strict
    schema. All modified files are attached.
    
                       strict             lax w/ xsd         lax w/o xsd
    Parser             valid    invalid   valid    invalid   valid
    StylusStudio 2006  correct  correct   correct  valid     correct
    MSXML 4 DOM        correct  correct   correct  valid     correct
    MSXML 6 DOM        correct  correct   correct  valid     correct
    Saxonica 8.7       correct  correct   correct  valid     correct
    .NET XML           correct  correct   correct  correct   error
    Xerces-J 2.5.1     correct  correct   correct  valid     correct
    XSV 2.10-1         correct  correct   correct  valid     error?

    ? = if the URI returns error 404
    
    As you can see, "lax" for most of the parsers is the same as "skip" and
    therefore useless, except in the .NET parser, where it acts like "strict".
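
To make the pattern in the table easier to query, here is a small Python sketch that transcribes the results and picks out the two behaviours. The reading of the cells is an assumption on my part: "valid" in an invalid-document column is taken to mean the parser wrongly accepted the file, and "error" or "error?" in the last column is taken to mean the parser rejected a valid file because no schema could be loaded.

```python
# Results transcribed from the table above. Column order:
# (strict/valid, strict/invalid, lax+xsd/valid, lax+xsd/invalid, lax-without-xsd/valid)
RESULTS = {
    "StylusStudio 2006": ("correct", "correct", "correct", "valid", "correct"),
    "MSXML 4 DOM": ("correct", "correct", "correct", "valid", "correct"),
    "MSXML 6 DOM": ("correct", "correct", "correct", "valid", "correct"),
    "Saxonica 8.7": ("correct", "correct", "correct", "valid", "correct"),
    ".NET XML": ("correct", "correct", "correct", "correct", "error"),
    "Xerces-J 2.5.1": ("correct", "correct", "correct", "valid", "correct"),
    "XSV 2.10-1": ("correct", "correct", "correct", "valid", "error?"),
}

# Parsers that accepted the invalid file under "lax" even with the schema
# available, i.e. "lax" behaved like "skip".
misses_errors_under_lax = [
    parser for parser, row in RESULTS.items() if row[3] == "valid"
]

# Parsers that (at least sometimes) failed when the schema was missing,
# i.e. "lax" behaved like "strict".
fails_without_schema = [
    parser for parser, row in RESULTS.items() if row[4].startswith("error")
]

print("lax acts like skip:  ", misses_errors_under_lax)
print("lax acts like strict:", fails_without_schema)
```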
    
    The most help comes from using "strict" and providing the schema.
    
    The change is to add the tek schema to documents that use the tek:
    namespace.
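
This message does not say how the schema is attached in the sample file. One common way is an xsi:schemaLocation hint on the root element, and the Python sketch below shows that approach under that assumption. The namespace URI "urn:example:tek" and the file name "tek.xsd" are placeholders, not the real values, and add_schema_location is a helper made up for this example.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def add_schema_location(root, namespace, location):
    """Append a 'namespace location' pair to the root's xsi:schemaLocation.

    Does nothing if the namespace already has a location hint.
    """
    key = f"{{{XSI_NS}}}schemaLocation"
    pairs = root.get(key, "").split()
    if namespace not in pairs[0::2]:  # even positions hold the namespaces
        pairs += [namespace, location]
    root.set(key, " ".join(pairs))
    return root


# Placeholder document, namespace URI and schema file name (all hypothetical).
root = ET.fromstring(
    '<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2"/>'
)
add_schema_location(root, "urn:example:tek", "tek.xsd")
```

A schemaLocation value is only a hint, so whether a given parser follows it is up to that parser.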
    
    Sample_AlmostEverything_1.2_strict.xlf
    
    


  • 2.  RE: [xliff] RE: [xliff-comment] Section 2.5.4 Validating Documents with Extensions - needs revision

    Posted 10-25-2006 17:01
    Bryan,
    
    You bring up excellent points. The real problem seems to be with the
    implementation of the schema standard. None of the parsers I have
    implements "lax" correctly. In principle, "lax" is definitely the best
    choice. In practice, it's up for grabs.
    
    If we choose "lax" it's not much better than "skip" until the parsers are
    fixed. In fact, they currently give a false sense of security.
    
    If we choose "strict" it provides the best protection for determining
    whether the XLIFF is truly valid or not. On the downside, it means a person
    needs to go through the trouble of making AND DISTRIBUTING a schema if they
    wish to use other namespaces and validate their XLIFF. (This applies to me
    personally; Ektron uses a proprietary namespace.) It means the strict XLIFF
    schema would be very strict.
    
    If we choose "skip" it provides the best chances of avoiding a false
    negative due to a missing proprietary schema or defective XML parser.
    
    In the final analysis, we could choose "lax" and let the parsers fall where
    they may. Perhaps the Excel table in my previous email could be in an FAQ to
    help developers work around the problems. Personally, I prefer "skip" for both
    strict and transitional schemas. (Although, for Ektron's purposes, I'll
    internally use "strict" during development and debugging because it's best
    at detecting defects. I just need to create the proprietary schema.) I
    realize this is different from the decision to use "lax" made earlier by the
    TC, but I'm concerned with the practical implications. I still agree with
    the original decision to use "lax" and would stick by it if the parsers
    functioned correctly.
    
    Having said all that, I just realized a serious drawback to using "strict".
    Because XLIFF is used as an interchange format (the 'I' in XLIFF), the file
    is processed multiple times by various applications and vendors. Using
    "strict" would require ALL the vendors to HAVE ALL the proprietary schemas
    used within the XLIFF document. Obviously, this is not practical. So I guess
    "strict" is not an option.
    
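
To illustrate the interchange concern above, here is a rough Python sketch that lists the extension namespaces used in an XLIFF 1.2 file and reports the ones a given vendor has no schema for. Under "strict", each namespace reported as missing would make the file fail validation for that vendor. The sample document and the "urn:example:..." namespaces are invented, as are the helper names.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
# Namespaces that are not vendor extensions.
IGNORED = {
    XLIFF_NS,
    "http://www.w3.org/XML/1998/namespace",
    "http://www.w3.org/2001/XMLSchema-instance",
}


def namespace_of(name):
    """Return the namespace part of an ElementTree '{ns}local' name, or None."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else None


def foreign_namespaces(xliff_text):
    """Collect every non-XLIFF namespace used by an element or attribute."""
    found = set()
    for element in ET.fromstring(xliff_text).iter():
        for name in [element.tag, *element.attrib]:
            ns = namespace_of(name)
            if ns and ns not in IGNORED:
                found.add(ns)
    return found


def missing_schemas(xliff_text, available):
    """Namespaces used in the file for which this vendor holds no schema."""
    return foreign_namespaces(xliff_text) - set(available)


# Invented sample with two hypothetical extension namespaces.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2"
       xmlns:tek="urn:example:tek" xmlns:acme="urn:example:acme">
  <file original="a.txt" source-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1" acme:state="draft">
        <source>Hello</source>
        <tek:note>internal</tek:note>
      </trans-unit>
    </body>
  </file>
</xliff>"""

# A vendor that only holds the "tek" schema cannot validate the "acme" parts.
print(missing_schemas(SAMPLE, {"urn:example:tek"}))
```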
    That leaves us with "lax" and "skip". The .NET parser (and sometimes XSV)
    treats "lax" as "strict". So I guess I'm back to (reluctantly) preferring
    "skip".
    
    Regardless of which option we pick, as a fallback, the transitional schema
    can always be used to determine an acceptable level of XLIFF validity.
    
    Regards,
     
    Doug Domeny
    Software Analyst
     
    Ektron, Inc.
    +1 603 594-0249 x212
    http://www.ektron.com
     
    
    


  • 3.  RE: [xliff] RE: [xliff-comment] Section 2.5.4 Validating Documents with Extensions - needs revision

    Posted 10-25-2006 17:16
    On Wed, 2006-10-25 at 12:54 -0400, Doug Domeny wrote:

    Hi,

    Having said all that, I just realized a serious drawback to using "strict".
    Because XLIFF is used as an interchange format (the 'I' in XLIFF), the file
    is processed multiple times by various applications and vendors. Using
    "strict" would require ALL the vendors to HAVE ALL the proprietary schemas
    used within the XLIFF document. Obviously, this is not practical. So I guess
    "strict" is not an option.
    
    
    
    I believe that "strict" is the right option for the Strict schema.

    Adding proprietary extensions to XLIFF is against the notion of easy interchange. Vendors that use custom namespaces in their XLIFF files are responsible for making their Schemas available.

    If a vendor wants to use a custom namespace but doesn't want to publish its proprietary schema, then that vendor should use the "Transitional" Schema, not the Strict one.


    That leaves us with "lax" and "skip". The .NET parser (and sometimes XSV)
    treats "lax" as "strict". So I guess I'm back to (reluctantly) preferring
    "skip".
    

    As said above, "strict" is still the right option for the Strict Schema.

    Best regards,
    Rodolfo