On 7/1/07, robert_weir@us.ibm.com <robert_weir@us.ibm.com> wrote:
I suppose I should throw in my $.02.
First, we should remember that ODF mandates
behavior at several levels. The schema itself encodes requirements
in terms of what elements or attributes are optional or mandatory, what
nesting is permitted, what restrictions there are on data types, etc.
And then the normative text of the standard, along with external normative
references, makes additional provisions by the use of "shall"
and "shall not".
But virtually all are undercut by the following sentence in the conformance section:
"There are ***no rules regarding the elements and
attributes that actually have to be supported by conforming
applications,*** except that applications should not use foreign
elements and attributes for features defined in the OpenDocument
schema."
But note that in that case, the provision
is only applicable to those who implement that feature. A "shall"
concerning the calculation of the SUM() spreadsheet function may be totally
ignored by someone who is implementing a word processor only. Finally,
we have the conformance clause, that defines which features and additional
constraints are required for conformance with the standard.
Today our conformance clause designates
requirements for conformant documents, conformant applications that read,
conformant applications that write, and conformant applications that both
read and write.
We have very few conformance *requirements,* in the sense of mandatory requirements. Here is the sum total:
>>>
Documents that conform to the OpenDocument
specification may contain
elements and attributes not specified within the OpenDocument
schema. Such elements and attributes **must not** be part of a
namespace that is defined within this specification and are called
foreign elements and attributes.
...
Conforming applications either **shall** read documents that are valid
against the OpenDocument schema if all foreign elements and
attributes are removed before validation takes place, or **shall** write documents that are valid
against the OpenDocument schema if all foreign elements and
attributes are removed before validation takes place.
...
Foreign elements may have an office:process-content attribute
attached that has the value true or false. If the attribute's value is
true, or if the attribute does not exist, the element's content should
be processed by conforming applications. Otherwise conforming
applications should not process the element's content, but may only
preserve its content. If the element's content should be processed,
the document itself ***shall*** be valid against the OpenDocument
schema if the unknown element is replaced with its content only.
Conforming applications ***shall*** read documents containing
processing instructions and should preserve them.
<<<
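To make the "remove foreign elements before validation" rule concrete, here is a rough Python sketch of how a checker might strip foreign markup before handing the result to a schema validator. It reflects one reading of the quoted text, not anything the specification itself prescribes. The function names and the `KNOWN` set are assumptions for illustration: `KNOWN` lists only two of the ODF namespaces, and a real checker would need every namespace the schema uses. Validation against the schema is not shown.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
# Illustrative subset only (assumption): a real checker needs all schema namespaces.
KNOWN = {OFFICE_NS, TEXT_NS}
PROCESS_CONTENT = f"{{{OFFICE_NS}}}process-content"


def namespace_of(name):
    """Return the namespace URI of an ElementTree tag or attribute name ('' if none)."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else ""


def _append_text(parent, kept, text):
    """Attach loose character data to the last kept sibling, or to the parent."""
    if not text:
        return
    if kept:
        kept[-1].tail = (kept[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def strip_foreign(elem, known=KNOWN):
    """Remove foreign attributes and elements below elem, in place.

    A foreign element is replaced by its content when office:process-content
    is "true" or absent, and dropped together with its content otherwise.
    """
    for name in list(elem.attrib):
        ns = namespace_of(name)
        if ns and ns not in known:  # unprefixed attributes are left alone
            del elem.attrib[name]
    kept = []
    for child in list(elem):
        if namespace_of(child.tag) in known:
            strip_foreign(child, known)
            kept.append(child)  # the child's tail text stays attached to it
            continue
        if child.get(PROCESS_CONTENT, "true") == "true":
            strip_foreign(child, known)
            _append_text(elem, kept, child.text)
            kept.extend(list(child))
        _append_text(elem, kept, child.tail)  # text after the foreign element
    elem[:] = kept
```

The tree left behind by `strip_foreign(root)` is what would then be validated; whether a given application "reads" or "writes" such documents is a separate question the sketch does not address.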
We should also realize that all of those "may" and "optional" requirements keywords changed their meaning between ODF
1.0 and 1.1. In ODF 1.0, they meant:
>>>
5. MAY This word, or the adjective "OPTIONAL", mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item. An implementation which does not include a particular option MUST be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides.)
<http://www.ietf.org/rfc/rfc2119.txt>. This is the definition used by nearly all OASIS standards.
<<<
At ISO's request, that definition changed to:
>>>
The verbal forms shown in Table G.3 shall be used to indicate a course of action permissible
within the limits of the document.
Table G.3 — Permission
Verbal form | Equivalent expressions for use in exceptional cases (see 6.6.1.3)
may         | is permitted; is allowed; is permissible
need not    | it is not required that; no … is required
Do not use "possible" or "impossible" in this context.
Do not use "can" instead of "may" in this context.
NOTE 1 "May" signifies permission expressed by the document, whereas "can"
refers to the ability of a user of the document or to a possibility open to him/her.
NOTE 2 The French verb "pouvoir" can indicate both permission and possibility.
For clarity, the use of other expressions is advisable if otherwise there is a risk of misunderstanding.
<<<
<http://72.14.253.104/search?q=cache:DxJI76h9l8QJ:www.iec.ch/tiss/iec/Directives-Part2-Ed4.pdf+nnex+H+of+%5BISO/IEC+Directives&hl=en&ct=clnk&cd=1&gl=us>, pg. 62.
So in ODF 1.0 the keywords "may" and "optional" imported a requirement of interoperability. In ODF 1.1, that requirement disappeared with the stroke of a pen. My reading of the ISO directives suggests that we do not have the option of going back to the RFC 2119 definitions. But nonetheless it is my understanding that the TC did not study the impact of the change in requirements keyword definitions before making the change.
For example, the use of the word "may" in the preservation of foreign elements and attributes section would at least arguably, under the RFC 2119 definition, **require** preservation of foreign elements and attributes needed for interoperability purposes whether or not an application supported foreign elements and attributes.
But I think it might fly with ISO to use the RFC 2119 definition of "may" and "optional" in the conformance section alone and that might put us further down the road toward interoperability.
As you may already know, OASIS has added
a new requirement for all OASIS standards:
"A specification that is approved
by the TC at the Public Review Draft, Committee Specification or OASIS
Standard level must include a separate section, listing a set of numbered
conformance clauses, to which any implementation of the specification must
adhere in order to claim conformance to the specification (or any optional
portion thereof)"
I think this is particularly important because procurement officers want to be able to simply specify that a candidate application must produce conformant format X. They do not want to, in effect, have to write their own file format specifications.
When we make the changes required for
the new OASIS rules, I suggest we think about conformance in general, and
consider making a more substantial statement. For example, we could define
things at a more granular level: a conformant ODF spreadsheet shall
support workbooks of at least a single sheet, with at least 100 rows and
25 columns and at least the Group 1 spreadsheet functions. (Just
an example, not a real proposal). So we have the opportunity to specify
multiple levels of conformance, either in the main text, or as separate
profiles.
+1. I'd add that we should approach such issues with the suspicion that every option is a potential interoperability breakpoint.
To the specific question at hand, I
am concerned with the loose use of the word "preserve." What
exactly does that mean? For example, must the xml:id's of the saved
document be lexically identical to the read document? Or are looser
versions of equivalence allowed? For example, if the id originally
is "foo" and then it is saved with the id "bar" is
that permitted, provided that the structure and referential integrity of
the id and references are maintained? Remember, it will be common
for an application to read an XML document and convert id's and links into
internal runtime representations that are not at all similar to the XML.
Id/references might be converted into C-language pointer references
between objects, etc. Then when writing out the document, new unique
ID's might be generated on-the-fly, perhaps in sequential order. This
might vary from implementation to implementation. Beyond referential
integrity, I don't know if there is any additional value in saying that
a document created in KOffice must have identical ID labels when that document
is later saved in OpenOffice.
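The two readings of "preserve" raised here can be told apart mechanically. The Python sketch below is only an illustration under stated assumptions: it treats `xml:id` as the identifier attribute and uses a hypothetical unqualified `ref` attribute as the referencing attribute, since the thread does not say which attributes would carry references. It compares only the sequence of ids and references in document order, not the rest of the document.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
REF = "ref"  # hypothetical reference attribute, for illustration only


def id_tokens(root):
    """List ('id' | 'ref', value) pairs in document order."""
    tokens = []
    for elem in root.iter():
        if XML_ID in elem.attrib:
            tokens.append(("id", elem.attrib[XML_ID]))
        if REF in elem.attrib:
            tokens.append(("ref", elem.attrib[REF]))
    return tokens


def lexically_identical(before, after):
    """Strict reading: every id and reference keeps its exact label."""
    return id_tokens(before) == id_tokens(after)


def same_up_to_renaming(before, after):
    """Looser reading: labels may change if the renaming is one-to-one,
    so every reference still points at the same element as before."""
    a, b = id_tokens(before), id_tokens(after)
    if len(a) != len(b):
        return False
    forward, backward = {}, {}
    for (kind_a, val_a), (kind_b, val_b) in zip(a, b):
        if kind_a != kind_b:
            return False
        # setdefault records the first pairing and exposes any later conflict.
        if forward.setdefault(val_a, val_b) != val_b:
            return False
        if backward.setdefault(val_b, val_a) != val_a:
            return False
    return True
```

Under these assumptions, saving "foo" as "bar" everywhere passes `same_up_to_renaming` but fails `lexically_identical`, which is exactly the distinction a definition of "preserve" would need to settle.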
I do not have the technical knowledge to answer that question. However, I request that we approach the issue from recognition that a document may pass through many applications before wending its way back to the originating application. From a layman's view, it would seem that a shifting vocabulary would interfere with interoperability mightily in situations where it is unknown what application will be the next to process a document.
We should also note that it is a feature
of some programs, such as Office 2007, to have a menu item specifically
for removing metadata from a document, for privacy and security reasons.
I don't think we want to prevent such an application from claiming
conformance.
Wouldn't an exception for user initiated actions cover this situation?
So we need to be very careful
how we word this. Perhaps something like "Conforming applications
that read and write documents shall be capable of "preserving"
xml:id's, etc." With the proviso that "preserving"
needs a better definition, this ensures that conforming applications support
preservation, while also allowing that not every mode of use may actually
do so, such as when a user deletes content or metadata, etc.
I'm not sure that "capable" helps a lot. E.g., if an application is capable of preserving metadata but ships with that option turned off and an arcane set of keystrokes to enable the option known only to the developers, the app is still "capable" of preserving metadata. Maybe call that an Easter Egg optional setting.
While on the subject of the conformance section and requirements keywords, we have another problem to deal with. The Notation section currently reads: "Within this specification, the key words "shall", "shall not", "should", "should not" and "may" are to be interpreted as described in Annex H of [ISO/IEC Directives] ***if they appear in bold letters.***" Between ODF 1.0 and ODF 1.1, many of the keywords lost their boldfacing. I suspect that is because we tend to bat language back and forth in plain text email, which strips text attributes.
1. We could avoid much of that kind of problem in the future if we switched to keywords in all caps rather than boldface, since they will remain all caps in emails.
2. Does anyone know if there are any instances of the keywords that should ***not*** be boldfaced (or all caps)? If not, we have a simple global search and replace task. If so, we have a tedious review ahead of us.
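For a sense of what the two outcomes in item 2 would involve, here is a small Python sketch that works on a plain-text export of the specification (an assumption; the real text lives in a formatted document). `uppercase_keywords` is the blind global replace, and `keyword_hits` lists every occurrence so that a reviewer could check the ones that might not be normative. Both function names are made up for this example.

```python
import re

# Two-word forms come first so "shall not" is not matched as just "shall".
KEYWORD_PATTERN = re.compile(
    r"\b(shall\s+not|should\s+not|shall|should|may)\b", re.IGNORECASE
)


def uppercase_keywords(text):
    """Blind search and replace: put every keyword occurrence in all caps."""
    return KEYWORD_PATTERN.sub(lambda m: m.group(1).upper(), text)


def keyword_hits(text):
    """Return (offset, keyword) for each occurrence, for manual review."""
    return [(m.start(), m.group(1).lower()) for m in KEYWORD_PATTERN.finditer(text)]
```

The blind replace would also capitalize a purely descriptive "may", which is why the answer to item 2 decides whether the first function is enough or the second is needed.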