<office-collab@lists.oasis-open.org> wrote on 10/14/2013 01:11:08 PM:

> From: Svante Schubert <svante.schubert@gmail.com>
> To: "office-collab@lists.oasis-open.org" <office-collab@lists.oasis-open.org>
> Date: 10/14/2013 01:09 PM
> Subject: [office-collab] Normalization of ODF XML (Earlier - Re: IRC
> log from today's meeting - 2013-10-09)
> Sent by: <office-collab@lists.oasis-open.org>
>
> For the description of ODF XML test output and for the comparison of
> ODF XML of different applications an ODF normalization is useful.
>
> In our last meeting we already listed several points where the ODF XML
> may differ although the ODF document does not:
>
> On 09.10.2013 16:18, Svante Schubert wrote:
> [15:52] Svante Schubert: Other examples of ODF variations of the
> same document that lead to different XML:
> [15:52] Svante Schubert: 1) Order of automatic styles
> [15:52] Svante Schubert: 2) automatic style names
> [15:52] Svante Schubert: 3) text:space elements might be split or a
> single one (similar to span)
> [15:52] Svante Schubert: 4) span can also be nested or split
> [15:52] Svante Schubert: 5) Hyperlink can be split as well
> [15:52] Svante Schubert: 6) Further existence of optional elements/
> attributes..
> [15:55] Svante Schubert: An implementation note: Nesting spans
> should be avoided; it just makes things complicated (traversing
> content recursively, searching for spans, inheriting, etc.)
>
> In addition I would like to add that the text properties of an
> automatic paragraph style are equivalent to the text properties of an
> automatic text style applied through a span that encloses the full
> paragraph content.
>
> This mapping happens when two paragraphs are merged: the text
> properties of the second paragraph's automatic paragraph style are
> mapped to a span around the remaining content of the second paragraph.
>
> In addition, whenever an ODF application performs a "Clear Formatting",
> which removes all the "hard" text attributes, the text properties of
> the automatic paragraph style are mapped to spans on the remaining
> "uncleared" area of the paragraph.
>
> Does anyone know further examples or categories of examples?
>

Of course there are all the lower level XML differences which are covered in the Canonical XML Recommendation:
http://www.w3.org/TR/xml-c14n

Other things to think about:

1) Is a signed document the same document as an unsigned one? (In general we might want to think about how signing works with CT. For example, can you sign a document and then I sign only my changes?)

2) Are non-visible RDFa/RDF XML annotations considered changes?

3) Document level settings, e.g., zoom factor in a spreadsheet.

4) Removing stuff that is not referenced, e.g., removing a style that is not used.

5) Some spreadsheet functions are "volatile" and change values at every recalc, e.g., NOW() and RAND(). If I load a document and save it, the value of that cell changes. But should that be a tracked change?

Regards,

-Rob

> Thanks,
> Svante
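
To make the normalization idea in this thread a bit more concrete, here is a minimal Python sketch of three of the points above: order of automatic styles, automatic style names, and split spans, followed by XML canonicalization for the lower level differences. It is only an illustration under simplifying assumptions, not a proposed algorithm: the function names and the `auto0001` naming scheme are made up for this example, only `text:style-name` references are followed, unreferenced styles keep their original names, and spans are merged only when they are direct neighbours with the same style and no text between them. It relies on `xml.etree.ElementTree.canonicalize`, which needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"

STYLE_NAME = f"{{{STYLE}}}name"
TEXT_STYLE_NAME = f"{{{TEXT}}}style-name"
SPAN = f"{{{TEXT}}}span"


def rename_automatic_styles(root):
    """Rename automatic styles by order of first use, then sort them."""
    auto = root.find(f"{{{OFFICE}}}automatic-styles")
    body = root.find(f"{{{OFFICE}}}body")
    if auto is None or body is None:
        return
    known = {s.get(STYLE_NAME) for s in auto}
    mapping = {}  # original name -> normalized name (hypothetical scheme)
    for el in body.iter():
        old = el.get(TEXT_STYLE_NAME)
        if old is not None and old in known:
            new = mapping.setdefault(old, f"auto{len(mapping) + 1:04d}")
            el.set(TEXT_STYLE_NAME, new)
    for s in auto:
        name = s.get(STYLE_NAME)
        if name in mapping:
            s.set(STYLE_NAME, mapping[name])
    # Assumption: unreferenced styles keep their names and are sorted as-is.
    auto[:] = sorted(auto, key=lambda s: s.get(STYLE_NAME, ""))


def merge_adjacent_spans(parent):
    """Merge neighbouring spans that share a style and have no text between."""
    prev = None
    for child in list(parent):
        if (prev is not None and child.tag == SPAN and prev.tag == SPAN
                and not prev.tail
                and child.get(TEXT_STYLE_NAME) == prev.get(TEXT_STYLE_NAME)):
            if len(prev):  # prev has children: text continues after the last
                prev[-1].tail = (prev[-1].tail or "") + (child.text or "")
            else:
                prev.text = (prev.text or "") + (child.text or "")
            prev.extend(list(child))
            prev.tail = child.tail
            parent.remove(child)
        else:
            prev = child
    for child in parent:
        merge_adjacent_spans(child)


def normalized_xml(xml_text):
    """Apply both normalizations, then canonicalize the result."""
    root = ET.fromstring(xml_text)
    rename_automatic_styles(root)
    merge_adjacent_spans(root)
    return ET.canonicalize(ET.tostring(root, encoding="unicode"),
                           rewrite_prefixes=True)


NS = (f'xmlns:office="{OFFICE}" xmlns:style="{STYLE}" '
      f'xmlns:text="{TEXT}" xmlns:fo="{FO}"')


def make_doc(styles, paragraph):
    """Build a tiny content.xml-like document for the example."""
    return (f'<office:document-content {NS}>'
            f'<office:automatic-styles>{styles}</office:automatic-styles>'
            f'<office:body><office:text><text:p>{paragraph}</text:p>'
            f'</office:text></office:body></office:document-content>')


# Same document, written differently: style order, style names, split span.
DOC_A = make_doc(
    '<style:style style:name="T2" style:family="text">'
    '<style:text-properties fo:font-style="italic"/></style:style>'
    '<style:style style:name="T1" style:family="text">'
    '<style:text-properties fo:font-weight="bold"/></style:style>',
    '<text:span text:style-name="T1">Hello </text:span>'
    '<text:span text:style-name="T1">world</text:span>'
    '<text:span text:style-name="T2">!</text:span>')

DOC_B = make_doc(
    '<style:style style:family="text" style:name="X9">'
    '<style:text-properties fo:font-weight="bold"/></style:style>'
    '<style:style style:family="text" style:name="X3">'
    '<style:text-properties fo:font-style="italic"/></style:style>',
    '<text:span text:style-name="X9">Hello world</text:span>'
    '<text:span text:style-name="X3">!</text:span>')

print(normalized_xml(DOC_A) == normalized_xml(DOC_B))
```

The two sample documents differ in exactly the ways listed in the IRC log, and the comparison is done on the normalized strings rather than on the raw XML. Nested spans, split hyperlinks, optional attributes, and the questions about signatures, RDF metadata, settings, unused styles and volatile functions are deliberately left out of this sketch.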