Bryan,
You bring up excellent points. The real problem seems to be with the
implementation of the schema standard. None of the parsers I have correctly
implement "lax". In principle, "lax" is definitely the best choice. In
practice, it's up for grabs.
If we choose "lax", it's not much better than "skip" until the parsers are
fixed. In fact, the parsers currently give a false sense of security.
If we choose "strict" it provides the best protection for determining
whether the XLIFF is truly valid or not. On the downside, it means a person
needs to go through the trouble of making AND DISTRIBUTING a schema if they
wish to use other namespaces and validate their XLIFF. (This applies to me
personally; Ektron uses a proprietary namespace.) It means the strict XLIFF
schema would be very strict.
If we choose "skip" it provides the best chances of avoiding a false
negative due to a missing proprietary schema or defective XML parser.
In the final analysis, we could choose "lax" and let the parsers fall where
they may. Perhaps the Excel table in my previous email could be in an FAQ to
help developers work around the problems. Personally I prefer "skip" for both
strict and transitional schemas. (Although, for Ektron's purposes, I'll
internally use "strict" during development and debugging because it's best
at detecting defects. I just need to create the proprietary schema.) I
realize this is different from the decision to use "lax" made earlier by the
TC, but I'm concerned with the practical implications. I still agree with
the original decision to use "lax" and would stick by it if the parsers
functioned correctly.
Having said all that, I just realized a serious drawback to using "strict".
Because XLIFF is used as an interchange format (the 'I' in XLIFF), the file
is processed multiple times by various applications and vendors. Using
"strict" would require ALL the vendors to HAVE ALL the proprietary schemas
used within the XLIFF document. Obviously, this is not practical. So I guess
"strict" is not an option.
That leaves us with "lax" and "skip". The .NET parser (and sometimes XSV)
treats "lax" as "strict". So I guess I'm back to (reluctantly) preferring
"skip".
Regardless of which option we pick, as a fallback, the transitional schema
can always be used to determine an acceptable level of XLIFF validity.
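The interchange drawback of "strict" described above can be illustrated with
a minimal Python sketch. It is only a model of the argument, not a validator:
the function name, the vendor names, and the namespace URI are hypothetical,
and it assumes extension content is valid wherever its schema is available.

```python
def first_rejecting_vendor(mode, namespaces_used, schemas_by_vendor):
    """Return the first vendor in the chain that would reject the file, or None.

    mode              -- "strict", "lax", or "skip" (the processContents value)
    namespaces_used   -- extension namespace URIs present in the XLIFF file
    schemas_by_vendor -- vendor name -> set of namespace URIs it has schemas for

    Assumption: content is valid whenever its schema is found, so only a
    missing schema under "strict" can cause a rejection in this model.
    """
    if mode != "strict":
        return None
    for vendor, available in schemas_by_vendor.items():
        if not set(namespaces_used) <= set(available):
            return vendor
    return None


if __name__ == "__main__":
    # Hypothetical processing chain; only the first tool has the schema.
    chain = {
        "authoring_tool": {"urn:example:proprietary"},
        "translation_vendor": set(),
    }
    used = ["urn:example:proprietary"]
    for mode in ("strict", "lax", "skip"):
        print(mode, "->", first_rejecting_vendor(mode, used, chain))
```

Under "strict" the file fails at the first vendor that lacks any one of the
proprietary schemas, which is why every vendor would need all of them.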
Regards,
Doug Domeny
Software Analyst
Ektron, Inc.
+1 603 594-0249 x212
http://www.ektron.com
-----Original Message-----
From: bryan.s.schnabel@exgate.tek.com
[mailto:bryan.s.schnabel@exgate.tek.com]
Sent: Wednesday, October 25, 2006 11:18 AM
To: ddomeny@ektron.com; tony.jewtushenko@productinnovator.com;
asgeirf@gmail.com; xliff@lists.oasis-open.org
Subject: [xliff] RE: [xliff-comment] Section 2.5.4 Validating Documents with
Extensions - needs revision
Hi Doug,
I will check the XSD.
I don't disagree with your recommendation for using strict. But I think it
is worth stating exactly what that means, because it is different from what
we discussed when we chose lax in the first place.
With lax, the parser was obligated to use the schema of the namespaced
elements and attributes to validate them, IF it could find the schema. If
it could not find the schema, it checked the elements and attributes for
well-formedness.
By using strict, the parser MUST find the schema for the namespaced elements
and attributes, and the elements and attributes MUST be valid per the spec.
That means if the schema for the namespaced elements or attributes ever gets
lost, the XLIFF file CANNOT be valid. It also means if, for example, an
XLIFF document uses an element that is not written as a first child of the
schema element, it could also be invalid.
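The difference between the modes, as described above, can be summarized in a
small Python sketch. It is a simplified model of the intended behavior, not
of any actual parser; the function and parameter names are hypothetical, and
the content is assumed to be well-formed in every case.

```python
from itertools import product

MODES = ("skip", "lax", "strict")


def extension_accepted(mode, schema_found, valid_per_schema):
    """Model whether namespaced extension content passes validation.

    mode             -- the processContents value on the wildcard
    schema_found     -- True if the parser can locate the extension's schema
    valid_per_schema -- True if the content conforms to that schema
                        (only meaningful when schema_found is True)
    """
    if mode == "skip":
        # Only well-formedness is checked, which is assumed here.
        return True
    if mode == "lax":
        # Validate if the schema can be found; otherwise accept.
        return valid_per_schema if schema_found else True
    if mode == "strict":
        # The schema MUST be found and the content MUST be valid.
        return schema_found and valid_per_schema
    raise ValueError("unknown processContents value: %r" % (mode,))


if __name__ == "__main__":
    for mode, found, valid in product(MODES, (True, False), (True, False)):
        print(mode, found, valid, "->", extension_accepted(mode, found, valid))
```

The only place "lax" and "strict" disagree in this model is when the schema
cannot be found: "lax" accepts the content and "strict" rejects it.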
This is okay by me, but we should all know that this is a much higher
standard than we talked about when we implemented this in the first place.
Thanks,
Bryan
________________________________
From: Doug Domeny [mailto:ddomeny@ektron.com]
Sent: Wed 10/25/2006 7:07 AM
To: Schnabel, Bryan S; tony.jewtushenko@productinnovator.com;
asgeirf@gmail.com; xliff@lists.oasis-open.org
Subject: RE: [xliff-comment] Section 2.5.4 Validating Documents with
Extensions - needs revision
Bryan, et al.,
After analysis, I now recommend processContents="strict" for the strict
schema. All modified files are attached.
                   strict             lax w/ xsd         lax w/o xsd
                   valid    invalid   valid    invalid   valid
StylusStudio 2006  correct  correct   correct  valid     correct
MSXML 4 DOM        correct  correct   correct  valid     correct
MSXML 6 DOM        correct  correct   correct  valid     correct
Saxonica 8.7       correct  correct   correct  valid     correct
.NET XML           correct  correct   correct  correct   error
Xerces-J 2.5.1     correct  correct   correct  valid     correct
XSV 2.10-1         correct  correct   correct  valid     error?

? if URI returns error 404
As you can see, "lax" behaves the same as "skip" in most of the parsers and
is therefore useless, except in the .NET parser, where it acts like "strict".
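As a rough cross-check of that reading, the "lax w/ xsd, invalid" column of
the table can be tallied in a few lines of Python. The values are copied from
the table; the interpretation that "valid" means the parser accepted an
invalid document is my assumption, and the names in the code are hypothetical.

```python
# Results from the "lax w/ xsd, invalid" column of the table above.
# Assumed meaning: "valid" = parser reported the invalid document as valid,
# "correct" = parser correctly reported it as invalid.
LAX_WITH_XSD_INVALID = {
    "StylusStudio 2006": "valid",
    "MSXML 4 DOM": "valid",
    "MSXML 6 DOM": "valid",
    "Saxonica 8.7": "valid",
    ".NET XML": "correct",
    "Xerces-J 2.5.1": "valid",
    "XSV 2.10-1": "valid",
}


def parsers_accepting_invalid(results):
    """Return the sorted names of parsers that let the invalid document pass."""
    return sorted(name for name, outcome in results.items() if outcome == "valid")


if __name__ == "__main__":
    lenient = parsers_accepting_invalid(LAX_WITH_XSD_INVALID)
    print("%d of %d parsers accepted the invalid document under lax:"
          % (len(lenient), len(LAX_WITH_XSD_INVALID)))
    for name in lenient:
        print("  " + name)
```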
The most help comes from using "strict" and providing the schema.
The change is to add the tek schema to documents that use the tek:
namespace.
Sample_AlmostEverything_1.2_strict.xlf