Dennis,

On 02/04/2013 12:00 PM, Dennis E. Hamilton wrote:

In today's call, there was interesting discussion about producers preserving xml:id attributes on elements that are preserved from a document that is being consumed. This is in reference to the proposal of OFFICE-3788: <
https://tools.oasis-open.org/issues/browse/OFFICE-3788 >.

I believe that is a valuable feature for complex document cases, but that it is not a good idea for a .x release of the ODF specification. The ODF 1.0/1.1/1.2 line does not require any such preservation. There is also nothing to prevent an implementation from doing it. So there is room for implementations to determine whether it is important for their use cases. There might be guidance about that, but I don't believe there should be any requirement about it. Absent implementation differentiation becoming a factor in interoperability, it is perhaps not a good idea to suddenly impose this requirement on implementations.

I don't think there is any question of "sudden imposition" of this as a requirement on implementations. If it bothers you that it might appear in ODF 1.3, we can always change the number to 2.0. When new versions of standards appear, software always takes some time to catch up. That has always been the case with new and exciting features. Not to mention that the discussion overlooked the fact that ODF applications currently preserve attribute values for elements, lots of them. I don't have numbers on that, so I do need to find an archive of realistic ODF documents to see how often applications currently save/re-write ids.

It is not clear that the benefit is such that all implementations would be required to preserve xml:id attribute ID values so long as the element having the xml:id occurrence persists. As desirable as this might be from a puristic position, it does damage to many implementations that have never found a use case sufficient to implement this already-allowed capability.

How so? If an implementation chooses not to support persistent IDs, it can always be an ODF 1.2 implementation that implements some extra features defined in ODF 1.3 (or some other numbering). If "being allowed" were the test for interoperability, there is very little reason to specify values in ODF.
Applications could mimic each other if they really wanted to. Nothing stops them from doing just that.

For calibration and added perspective, here are three use cases for the preservation of xml:ids. All have problems. These are all for preserving xml:ids for referential integrity of references from outside the document that refer to internal elements of a document (derivative). Accommodating any of them in ODF 1.3 might be a bridge too far.

And the basis for "...might be a bridge too far" is? I understand that:

1) Present implementations don't preserve xml:ids.

2) To require preservation of xml:ids would make existing applications non-conformant to some future version of ODF 1.3. (I assume if we change anything in ODF 1.3, they are not going to be fully conformant with ODF 1.3.)

3) What I am missing is some evidence, other than your saying it, that preservation of xml:ids is any harder than preserving any other attribute value.

I understand implementations don't do the preservation now, but at one time implementations didn't use XML either. Non-use doesn't mean that a proposal is too difficult or unworkable.

Being able to reliably point into documents would be the next step beyond simply getting document-level pointers from a search engine. I would rather get a <text:p> level pointer than a pointer to 800+ pages of text. You?

Hope you are having a great day!

Patrick

PS: Having said all that, I welcome pointers to archives of ODF documents so I can investigate current usage of ids.

CASE 1: [X]HTML Production.

When a document is saved as HTML, the xml:ids are presumably turned into identified anchors. This is necessary simply to allow for internal cross-references by IDREF attribute values that target an xml:id ID value. Changing those ID and IDREF values on editing of a replacement for an existing HTML document will break any deep links into the updated HTML export from anywhere else in the World Wide Web.
That may not be acceptable for some usage of ODF implementations as tools for maintaining and producing an HTML rendition. (The same problem arises for user-created bookmarks and the identifiers that are generated for them and cross-references to them.)

CASE 2: RDF in the same package and elsewhere. (Not just the RDFa in content.xml itself)

ODF 1.2 permits RDF parts to be included in a document that refer into elements of the document structure. These RDF parts need a way to identify the elements being referenced, and fragment IDs in URIs of the RDF terms are the common means. Likewise, when the RDF is extracted from the document (e.g., via a GRDDL procedure) or is otherwise external to a document, that RDF can make use of the ODF Package and OWL Document OWL classes to continue to refer to specific elements internal to the ODF package. To the extent that a revision of the document is logically the same work with respect to the nature of the RDF about it, not preserving fragment IDs becomes a problem.

(It is also a challenge to deal with the fact that ODF currently lacks a means for creating a location-independent entity identification of a document. Something is needed for where different occurrences of instances are to be taken as logically the same document. This requires something that can work as a persistent URI or URN for a document that is relatively instance-independent and where the document is not necessarily found only at a unique URL location on the Web.)

Finally, it is not to be expected that all implementations will be in a position to adjust RDF within packages to align with changed xml:id ID values in order to preserve the referential integrity from such metadata. Some implementations will simply not deal with such RDF, and they may, but need not, preserve that RDF within the package. (There are pros and cons about this.
Having mystery material can be a problem for document safety/security and also for documents that are digitally signed when there is implementation-unknown material.) ODF 1.2 doesn't constrain this and it is difficult to see what ODF 1.3 can do beyond adding some guidance. It is perhaps better for guidance to be worked out and demonstrated at OIC first. That's certainly the case for RDF that is not in the package at all.

CASE 3: ODF 1.2 CHANGE TRACKING

Depending on how references to portions of documents involving tracked changes happen, there can be a problem with the preservation of xml:id attributes. In ODF 1.0/1.1/1.2 the connection of change information with the places in the document where the change applies is accomplished by the xml:id ID value on a <text:changed-region> element. It is also the case that element start tags with xml:id attributes can be swept up into <text:deletion> elements that carry removed material. Those xml:ids would need to be preserved, since the deletion can be rejected in a later edit. (This situation has remarkable consequences for RDF now referencing an element that is (partially) deleted.) I don't know whether this is comprehended as an edge case for the MCT-based change-tracking for ODF 1.3.

AND EDGE CASES

There are many edge cases to all of this. There is the interaction with change-tracking (and whether that can synchronize with arbitrary RDF in the package), accessibility (also impacted by change tracking), and probably other provisions, including concerns about covert content and digital signatures.

It is also important to note that the xml:id attribute ID values in ODF 1.2 documents are generally not thought to be user-specifiable. Where there are user-specified names, these are in other attributes that are usually not used as attribute values of type ID and IDREF. (Note that this xml:id case should actually be about all ODF 1.x attributes having values of type ID, since uniqueness must be preserved across all of them.
The xml:id ones are the only ones automatically accessible via fragment values in URI references.)

- Dennis

PS: Another cat picture: <
http://www.flickr.com/photos/orcmid/1502722674/in/set-72157600230263578 >.

-- 
Patrick Durusau
patrick@durusau.net
Technical Advisory Board, OASIS (TAB)
Former Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)

Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau
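
A possible starting point for the investigation mentioned in the PS above (how often applications preserve or rewrite ids when a document is consumed and saved again). This is only a minimal Python sketch under stated assumptions, not a finished tool: it assumes Python 3.9 or later, that the two versions of a document are available as the bytes of their content.xml parts, and that comparing xml:id values together with the element names that carry them is a good enough proxy for "preserved". The function names and the sample data are hypothetical and chosen just for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the fixed XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def xml_ids(content_xml: bytes) -> dict[str, str]:
    """Map each xml:id value to the tag of the element that carries it."""
    root = ET.fromstring(content_xml)
    return {el.attrib[XML_ID]: el.tag for el in root.iter() if XML_ID in el.attrib}


def compare_ids(before: bytes, after: bytes) -> dict[str, set[str]]:
    """Classify xml:id values across two versions of the same content.xml."""
    old, new = xml_ids(before), xml_ids(after)
    return {
        # same id on the same kind of element in both versions
        "preserved": {i for i in old if new.get(i) == old[i]},
        # id still present, but now on a different kind of element
        "reassigned": {i for i in old if i in new and new[i] != old[i]},
        "dropped": set(old) - set(new),
        "added": set(new) - set(old),
    }


def read_content_xml(package_path: str) -> bytes:
    """Read content.xml from an ODF package (a ZIP file) on disk."""
    with zipfile.ZipFile(package_path) as pkg:
        return pkg.read("content.xml")


# Hypothetical sample data: the second paragraph's id was rewritten on save.
NS = (
    'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    'xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
)
BEFORE = (
    f'<office:text {NS}><text:p xml:id="p1">One</text:p>'
    f'<text:p xml:id="p2">Two</text:p></office:text>'
).encode()
AFTER = (
    f'<office:text {NS}><text:p xml:id="p1">One</text:p>'
    f'<text:p xml:id="id7">Two</text:p></office:text>'
).encode()

if __name__ == "__main__":
    for kind, ids in compare_ids(BEFORE, AFTER).items():
        print(kind, sorted(ids))
```

For real documents, something like compare_ids(read_content_xml("before.odt"), read_content_xml("after.odt")) would be the obvious use (the file names are placeholders). Note the sketch only sees xml:id, not the other ID-typed attributes mentioned above, and it cannot tell whether an element with a new id is "the same" element as before.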