OASIS Open Document Format for Office Applications (OpenDocument) TC


Single XML file

  • 1.  Single XML file

    Posted 08-29-2012 15:48
    Greetings!

    Since there are almost three weeks until the next TC meeting, I am hopeful that everyone can decide if the single XML file format option is something they really need? See: https://tools.oasis-open.org/issues/browse/OFFICE-3781

    It is probably doable, but I don't see the value-add from keeping it as an option.

    Thoughts/suggestions?

    Thanks! Hope everyone is having a great day!

    Patrick

    --
    Patrick Durusau
    patrick@durusau.net
    Former Chair, V1 - US TAG to JTC 1/SC 34
    Convener, JTC 1/SC 34/WG 3 (Topic Maps)
    Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
    Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
    Another Word For It (blog): http://tm.durusau.net
    Homepage: http://www.durusau.net
    Twitter: patrickDurusau


  • 2.  Re: [office] Single XML file

    Posted 08-29-2012 18:23
    I thought WebODF used the single file XML representation. Jos would be able to confirm. -Rob


  • 3.  Re: [office] Single XML file

    Posted 08-29-2012 18:43
    Rob,

    On 08/29/2012 02:23 PM, robert_weir@us.ibm.com wrote:
    > I thought WebODF used the single file XML representation. Jos would be able to confirm.

    Thanks for the reminder! I did look at WebODF, after your post, and it says that it reads all ODF documents. So I am assuming that means the package format as well. Whether it writes the package back out wasn't clear.

    Hope you are having a great week!

    Patrick


  • 4.  Re: [office] Single XML file

    Posted 08-29-2012 18:44
    AFAIK the single file format was originally introduced to provide easy XSLT support. Unfortunately it was not taken seriously, as the round trip of ODF documents between the single file format and ZIP is broken. A zipped ODF document that is saved as a single file, then loaded and saved back as a zip, is not guaranteed to produce the same infoset as the original zip. For instance, embedded files other than images usually get lost. In addition, the single file becomes invalid when the same XML IDs occur in more than one of the files, and even identically named automatic styles from content.xml and styles.xml will cause a problem.

    IMHO we should either specify the single file format in a way that is 100% compatible, e.g.
    a) with a specified generic mechanism to map any file into the single stream
    b) with a specified mapping of IDs (XML IDs, style names) of the various files (e.g. with the file path+name as prefix) into the single file
    or drop the whole thing.

    I would tend to drop it, as I see only a burden for the implementor; even the ODF Toolkit does not implement it. But I cannot see the overall benefit for the ODF format. There is no problem in using such a file format internally, if it suits the needs (e.g. WebODF), but IMHO it is just a bad idea to specify it as an ODF document exchange option.

    PS: Even in XSLT it gets annoying, as all images will be BASE64 encoded and parsed by the XML parser, causing time inefficiency and often out-of-memory exceptions. Instead, references inside the ZIP could be resolved using a SAX EntityResolver or URIResolver.

    Svante
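A minimal sketch of the naming collision described in this post, assuming two small hand-written fragments standing in for content.xml and styles.xml. The helper names (`automatic_styles`, `colliding_styles`, `prefixed`) and the prefix scheme are hypothetical illustrations of option b), not anything defined by the ODF specification.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"

# Hand-written stand-ins for the two package parts (assumption: real files are larger).
CONTENT = f"""<office:document-content xmlns:office="{OFFICE_NS}" xmlns:style="{STYLE_NS}">
  <office:automatic-styles>
    <style:style style:name="P1" style:family="paragraph"/>
    <style:style style:name="T1" style:family="text"/>
  </office:automatic-styles>
</office:document-content>"""

STYLES = f"""<office:document-styles xmlns:office="{OFFICE_NS}" xmlns:style="{STYLE_NS}">
  <office:automatic-styles>
    <style:style style:name="P1" style:family="paragraph"/>
    <style:style style:name="MP1" style:family="paragraph"/>
  </office:automatic-styles>
</office:document-styles>"""


def automatic_styles(xml_text):
    """Collect (family, name) pairs of all automatic styles in one part."""
    root = ET.fromstring(xml_text)
    found = set()
    for auto in root.iter(f"{{{OFFICE_NS}}}automatic-styles"):
        for style in auto.iter(f"{{{STYLE_NS}}}style"):
            found.add((style.get(f"{{{STYLE_NS}}}family"),
                       style.get(f"{{{STYLE_NS}}}name")))
    return found


def colliding_styles(content_xml, styles_xml):
    """Styles that would clash if both parts were merged into one stream."""
    return sorted(automatic_styles(content_xml) & automatic_styles(styles_xml))


def prefixed(name, part_path):
    """Hypothetical renaming scheme: prefix a name with its source part."""
    return part_path.replace("/", "_").replace(".", "_") + "__" + name


print(colliding_styles(CONTENT, STYLES))
print(prefixed("P1", "styles.xml"))
```

A real mapping would also have to rewrite every reference to a renamed style, which is the part a specification would need to pin down.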


  • 5.  Re: [office] Single XML file

    Posted 08-29-2012 21:18
    On Wednesday 29 August 2012 17:47:31 PM Patrick Durusau wrote:
    > It is probably doable, but I don't see the value-add from keeping it as
    > an option.
    >
    > Thoughts/suggestions?

    It is good to revisit features to see if we still need them. The single xml file format for ODF is tremendously useful in my opinion, for a number of reasons. I will list what I think are reasons pro and contra the single xml format.

    Pro
    ---
    - easier to create than zip
    - easier to read than zip
    - easier to make diffs, which is useful for people that put ODF in version control systems
    - it is being used, notably by LibreOffice and WebODF
    - is useful for web based office suites because of easy file -> DOM mapping
    - can be embedded in other XML files

    Contra
    ------
    - it makes the specification larger
    - it is not widely supported even though it was introduced in ODF 1.0
    - the file size is larger
    - it does not have all the features that the zip file has (e.g. rdf is not supported at the moment)

    The missing RDF support can be fixed, at which point there are not so many downsides to keeping the single file format. If this topic came up for a vote, I would vote firmly in favor of keeping the single file format.

    Cheers,
    Jos
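To illustrate the "easy file -> DOM mapping" point, here is a minimal sketch that parses a hand-written single-file text document with Python's standard library and pulls out its paragraphs. The sample document and the `paragraphs` helper are assumptions made for illustration; a document produced by a real office suite carries many more namespaces, styles, and metadata.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# A deliberately tiny single-file document rooted at <office:document>.
FLAT_ODT = f"""<office:document xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}"
    office:version="1.2"
    office:mimetype="application/vnd.oasis.opendocument.text">
  <office:body>
    <office:text>
      <text:p>Hello, single file.</text:p>
      <text:p>Second paragraph.</text:p>
    </office:text>
  </office:body>
</office:document>"""


def paragraphs(xml_text):
    """Parse the whole document in one step and return paragraph texts."""
    root = ET.fromstring(xml_text)
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


for line in paragraphs(FLAT_ODT):
    print(line)
```

With the packaged form, the same task first needs a zip reader to locate and extract content.xml before any XML parsing can start.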


  • 6.  RE: [office] Single XML file

    Posted 08-29-2012 21:29
    You also cannot sign or encrypt it. It would be possible to construct a signature, but not using the current ODF 1.2 signing standard.


  • 7.  Re: [office] Single XML file

    Posted 08-29-2012 23:50
    Jos,

    Thanks for the quick summary!

    Would you, or someone you would like to volunteer, be willing to see what needs to be added to make the single file format match the package format? RDF being one instance, I suspect there will be others.

    I don't think omitting the single file format will make the standard that much shorter. I was more concerned with the single file format being underspecified and the resources it would take to specify it.

    If we can find people who are willing to devote the resources necessary to align the two formats, I might feel differently about it.

    Hope you are having a great week!

    Patrick

    PS: Reserving judgement on the zip issues. Haven't thought about that aspect in a while.


  • 8.  Re: [office] Single XML file

    Posted 08-31-2012 16:10
    Jos van den Oever wrote:
    > The single xml file format for ODF is tremendously useful in my opinion, for a
    > number of reasons. I will list what I think are reasons pro and contra the
    > single xml format.

    I concur with Jos. Highly useful when you want to stream stuff, both in and out of odf processors. Also nice for being treated with run-of-the-mill revision control systems.

    Cheers,

    --
    Thorsten Behrens
    SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg; GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)


  • 9.  RE: [office] Single XML file

    Posted 08-31-2012 21:49
    I agree that the single XML file is very utilitarian. It is adaptable to a wide variety of special-purpose uses and it is easy to generate, even manually and with or without a schema-aware editing tool. Also,

    1. The existing change-tracking model, however ill-specified, works with the single XML file.

    2. It is a trivial matter to allow the RDF/XML <rdf:RDF> element or even node elements within the metadata section. These can work without any need of the OWL classes or any of the other contortions. Internal relative IRIs are also easy in this case. The internal RDFa is already allowed although it is not clear to me why, since ODF uses XML, embedded RDF/XML wasn't allowed more generally in content.xml in the first place. Of course it could be used anyhow in an extended ODF document.

    3. XML DSig and XML Encryption work with these XML documents, of course, and again, there is very little that needs to be said beyond their allowance.

    - Dennis

    PS: I would also argue that just because there are pathological or at least impractical cases for the single XML file, that does not mean all single XML files are evil. RTF is a text-based file structure and there seems to be no way to avoid accepting and producing RTF in interoperating office-productivity implementations.


  • 10.  Re: [office] Single XML file

    Posted 08-31-2012 22:40
    On 31.08.2012 23:48, Dennis E. Hamilton wrote:
    > I agree that the single XML file is very utilitarian. It is adaptable to a wide variety of special-purpose uses and it is easy to generate, even manually and with or without a schema-aware editing tool.

    * The discussion so far addresses the usefulness of the Single XML incarnation but leaves open the question of whether Single XML should be a derivative deliverable of the TC/OASIS/... or remain a part of the upcoming 1.3 standard, which then essentially continues to comprise two twin file formats. A key criterion here appears to be the pace of standardisation. Does Single XML inclusion slow the process down? Patrick's resource concerns apply.

    * The second question from my side is structural and editorial: do you perceive it as appropriate to list both as equivalent options, or should the Single XML option be moved to a dedicated chapter ("part") which describes the alternative file format option, and its deviations from the mainstream packaged ODF, if any? The latter would allow us to avoid underspecification and unclutter the structure. It would also enable us to better understand and monitor a potential feature gap, and keep the mainstream format documentation clean of unnecessary variants.

    * The third issue I would like to raise is inline elements. For instance, we have the option to either link packaged binary items and scripts or include them inline. Would it be useful to decide, for ODF 1.3 producers, that inline is only applicable to the Single XML format but not to packaged ODF 1.3? The more variants the format provides for producers, the more complexity, implementation-specific differences and potential interoperability challenges arise in a cross-implementation roundtrip environment.

    Best,
    André


  • 11.  RE: [office] Single XML file

    Posted 08-31-2012 23:29
    André, The work to integrate the schemas for the parts (content.xml, styles.xml, etc.) and the single XML document has already been done. That basic structure has not changed as the result of evolution of the specification from ODF 1.0 through ODF 1.2 and it need not change for ODF 1.3. Any radical change-tracking changes will impact the single XML only a little and the package-level bits a great deal. There is some extra wording in the conformance section to deal with conformance of single XML documents versus the kindred packaged documents. That could probably be simplified but, again, the work has already been done. David LeBlanc is correct, specifying a particular case for XML DSig introduction in the single XML document does require something in the specification. Ideally that would be at the <office:document> element, the unique root for the single XML document. There are the usual rules for how the signing is done when the signature is embedded. It's a little trickier for XML just because it is not a binary format and canonicalization enters into the picture. Other than that, it is not much different than inserting Authenticode signatures into compiled binaries, it seems to me. If there were an informal agreement on XML DSig introduction, it could be done immediately. Since it can be ignored as a foreign element under the <office:document>, it should not disrupt any implementation except one that wants to be strict about foreign elements and do something more disruptive than simply ignore them. (Likewise for RDF elements.) - Dennis


  • 12.  RE: [office] Single XML file

    Posted 08-31-2012 23:40
    The pros and cons are this - if you wanted to sign the root element, then you have to make an XPath transform to exclude the document-signatures node. Which then implies that implementers have to support XPath transforms, and they have to check that the transform that excludes the document-signatures node is the only one there, or they will get into trouble, security and otherwise. If you do not want to sign the root node itself - say there's no important data, only child nodes, and the attributes on the root are also not important, then things become easier. You then make the document-signatures node a direct child of the root, and say that there has to be a reference covering every direct child, excepting the document-signatures. Note that we currently have an exclusion for a certain directory within the zip, so you should provide for an unsigned exclusion area for the single-file case, too. If any of my assumptions above are incorrect, then we need to figure something out. There are some minor nuances with the canonicalization. A standard canonicalization will preserve the root node namespace, and that seems like what we want to do here. Canonicalization transforms do exist that allow a node to be put under some other arbitrary node, and not break signatures. I'd tend not to want to go that route. Whichever is decided upon needs to be documented in the standard.
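The second approach above (signatures as a direct child of the root, with a reference covering every other direct child) can be sketched as a coverage check. This is only an illustration under loud assumptions: the use of `xml:id` on the children, same-document `URI="#..."` references, and the `uncovered_children` helper are all hypothetical, since the thread notes that the standard does not yet define any of this. The sketch checks coverage only; it does not canonicalize anything or verify digests and signature values.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
DSIG_NS = "urn:oasis:names:tc:opendocument:xmlns:digitalsignature:1.0"
DS_NS = "http://www.w3.org/2000/09/xmldsig#"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
SIGNATURES_TAG = f"{{{DSIG_NS}}}document-signatures"

# Hypothetical layout: office:settings is deliberately left unreferenced.
DOC = f"""<office:document xmlns:office="{OFFICE_NS}"
    xmlns:dsig="{DSIG_NS}" xmlns:ds="{DS_NS}">
  <office:meta xml:id="meta"/>
  <office:settings xml:id="settings"/>
  <office:body xml:id="body"/>
  <dsig:document-signatures>
    <ds:Signature>
      <ds:SignedInfo>
        <ds:Reference URI="#meta"/>
        <ds:Reference URI="#body"/>
      </ds:SignedInfo>
    </ds:Signature>
  </dsig:document-signatures>
</office:document>"""


def uncovered_children(root):
    """Local names of direct children that no ds:Reference points at."""
    referenced = set()
    signatures = root.find(SIGNATURES_TAG)
    if signatures is not None:
        for ref in signatures.iter(f"{{{DS_NS}}}Reference"):
            uri = ref.get("URI", "")
            if uri.startswith("#"):
                referenced.add(uri[1:])
    missing = []
    for child in root:
        if child.tag == SIGNATURES_TAG:
            continue  # the signature container itself is excluded
        if child.get(XML_ID) not in referenced:
            missing.append(child.tag.split("}")[-1])
    return missing


print(uncovered_children(ET.fromstring(DOC)))
```

A verifier following this design would reject the document whenever the returned list is non-empty, unless the unsigned child falls into an explicitly permitted exclusion area.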


  • 13.  Re: [office] Single XML file

    Posted 09-01-2012 00:21
    On 01.09.2012 01:28, Dennis E. Hamilton wrote:
    > The work to integrate the schemas for the parts (content.xml, styles.xml, etc.) and the single XML document has already been done.

    I meant parts/chapters of the ODF specification. When you read chapter 3 in the current 1.3 draft, it becomes apparent that both the Single XML and the Packaged format are put on an equal footing for ODF document representation. In fact, Single XML is the first listed choice. But in reality the Packaged format is de facto preferred, and Single XML is merely "useful" to keep for certain use cases, for the reasons raised here.

    --- André


  • 14.  RE: [office] Single XML file

    Posted 08-31-2012 22:59
    Yes, you can do signatures, but the current standard does not cover it. You would need to create a new section determining how to do this. The current standard mandates a document-signatures root in a particular file. You can obviously put that node internal to a larger XML, but we don't currently document it. In addition, you need to create a transform that excludes the signature node, or conversely, you would not sign the top-level root container node, and then sign all the child nodes excepting the signature container node itself. Both approaches are possible, but again, the standard itself doesn't tell you how to do that. I'd suggest taking the second approach, but others could have different opinions. So if you want to continue to support the single XML file format, and would like to also support signatures with it, then you need to add a paragraph to the signature area to cover how it should be done. On the issue of whether to support the single XML format, I'm personally agnostic, and you should let someone else at Microsoft chime in on the bigger issue. A suggestion I might make would be to do some web searches and see how many of the single XML files you find in the wild, and then decide whether you think it is worthwhile doing the work to maintain the standard for parallel formats.


  • 15.  RE: [office] Single XML file

    Posted 09-04-2012 03:45
    Before I forget to mention this one more time, here are two more:

    4. The <office:document> element is not only the root element of the single-XML form of ODF document, it can be embedded as an element *within* an ODF document. In particular, the element can be used directly in <db:component> and <draw:object> and those occurrences can be within another <office:document> root or in a content.xml file.

    5. In addition, an <office:document> single-XML form is a perfectly legitimate ODF package file that is used by reference in the main portion (i.e., whatever content.xml is) and other XML-based ODF package files.

    In the case that <office:document> occurs within an ODF package (embedded in content.xml or referenced), that <office:document> is going to be compressed (indirectly or directly) and it can also refer to separate blobs in the package rather than have them base64 encoded and occupying the stream of the <office:document> XML.

    This is probably a missed opportunity. I see ODT documents that embed full sub-document structures for spreadsheets used as chart and table sources when it would be much more compact, in that case, if an <office:document> single XML package part had been used for the spreadsheet subdocument instead.

    - Dennis