OpenDocument - Adv Document Collab SC

  • 1.  Groups - Proposal for generic representation of tracked changes for ODF (generic-ct-proposalV5-updates.odt) uploaded

    Posted 09-13-2011 11:32
    We have updated the GCT to reflect comments made by SC members. Specifically, it has been updated to move deleted text out of the content area and to add meta-data so that a change transaction can be identified as a particular type of edit operation. This version also provides an alternative representation for attribute changes.

    (The document posted highlights the updates.)

    -- Robin LaFontaine

    The document revision named Proposal for generic representation of tracked changes for ODF (generic-ct-proposalV5-updates.odt) has been submitted by Robin LaFontaine to the OpenDocument - Advanced Document Collaboration SC document repository. This document is revision #1 of generic-ct-proposal.odt.

    Document Description: This document describes a generic method for tracking changes in ODF documents. The format described provides a way of representing successive changes or edits to an XML document, typically in one or more editing sessions. The document describes how to apply this to a host format by modifying the schema for that host format. The OpenDocument Format (ODF) is used as an example host format for this integration.

    View Document Details: http://www.oasis-open.org/committees/document.php?document_id=43517
    Download Document: http://www.oasis-open.org/committees/download.php/43517/generic-ct-proposalV5-updates.odt

    The document details page referenced above will show the complete revision history.

    -OASIS Open Administration


  • 2.  Re: [office-collab] Groups - Proposal for generic representation of tracked changes for ODF (generic-ct-proposalV5-updates.odt) uploaded

    Posted 09-13-2011 14:43
    On Tue, 2011-09-13 at 05:31 -0600, robin.lafontaine@deltaxml.com wrote:
    > We have updated the GCT to reflect comments made by SC members.
    > Specifically it has been updated to move deleted text out from the content
    > area and also to add meta-data so a change transaction can be identified as
    > a particular type of edit operation. This version also provides an
    > alternative representation for attribute changes.

    Just some observations regarding these changes:

    -------------
    In 2 Introduction: "If conversion between them is part of the standard then a reader or writer application can choose which is most convenient."

    If the writer can choose one of two representations, then the reader will need to be able to read both, since either could appear in the file. So the reader would have no opportunity to choose the more convenient one. On the other hand, if the reader is to have the opportunity to choose the more convenient one, then both representations must be in every file, so the writer has no such choice. Please clarify this. (The text says that a "standard XSLT style sheet will be provided to convert between these"; it is not clear to me how this would help a general editing application.)

    -------------
    In 3 (15): "An application can elect to cache deleted content in situ, i.e. where it was deleted, or in another place in the document. They cannot be mixed, one or other must be chosen. [Rationale: The alternatives are a choice for a reader or writer for its convenience. If one or other is mandated in a particular situation then it is no longer a choice.]"

    How does this provide a choice of convenience for the reader? The reader would need to be able to read either, since only one can be used.

    -------------
    A similar question applies to 3 (16), but there also appears to be a typo in that section.

    -------------
    If the possible values of delta:edit-operation can be defined "by a particular editing application", how does this assist in interoperability? What is the expected behaviour of an implementation that encounters such a value on reading? With foreign elements/attribute values there is usually the possibility to ignore them. I don't see how one can ignore a single unsupported piece of change tracking info. Please elaborate.

    -------------
    While having delta:removed-content allows the deleted content to be placed elsewhere, it still allows it to be retained in situ. This does not seem to resolve the issue that a 1.2 or earlier implementation encountering this element needs to understand it in order to ignore the element content.

    -------------
    My reading is that these changes move the proposal in the wrong direction: writers have even more choices, so implementing a reader becomes even more difficult. Note that a writer implementation always knows what change it is trying to record. It is the reader implementation that has to determine it from the file content.

    Andreas

    --
    Andreas J. Guelzow, PhD, FTICA
    Concordia University College of Alberta


  • 3.  Re: [office-collab] Groups - Proposal for generic representation of tracked changes for ODF (generic-ct-proposalV5-updates.odt) uploaded

    Posted 09-14-2011 08:25
    On 13/09/2011 15:42, Andreas J. Guelzow wrote:
    > On Tue, 2011-09-13 at 05:31 -0600, robin.lafontaine@deltaxml.com wrote:
    > > We have updated the GCT to reflect comments made by SC members.
    > > Specifically it has been updated to move deleted text out from the content
    > > area and also to add meta-data so a change transaction can be identified as
    > > a particular type of edit operation. This version also provides an
    > > alternative representation for attribute changes.
    >
    > Just some observations regarding these changes:
    > -------------
    > In 2 Introduction: "If conversion between them is part of the standard then a reader or writer application can choose which is most convenient."
    > If the writer can choose one of two representations, then the reader will need to be able to read both since either could appear in the file. So there would not be a choice for the reader to choose the more convenient. On the other hand, if the reader would have the opportunity to choose the more convenient, then both representations must be in every file so the writer has no choice of the more convenient. Please clarify this. (The text says something about a "standard XSLT style sheet will be provided to convert between these"; it is not clear to me how this would help a general editing application.)

    The concept is similar to the XML syntax and the compact syntax of RELAX NG: you can convert from one to the other. If there is standard code to move between them (be it XSLT or other) then a reader can convert before it reads in. Of course there is also no reason why, for a specific application such as ODF, one form could not be mandated. Then the writer can run the standard conversion if it needs to. In general, alternative, semantically equivalent representations are useful because some applications will find one easier to process than another. But ODF can choose one, certainly, if that is better!

    > -------------
    > In 3 (15): "An application can elect to cache deleted content in situ, i.e. where it was deleted, or in another place in the document. They cannot be mixed, one or other must be chosen. [Rationale: The alternatives are a choice for a reader or writer for its convenience. If one or other is mandated in a particular situation then it is no longer a choice.]"
    > How does this provide a choice of convenience for the reader? The reader would need to be able to read either since only one can be used.

    This is not very clear, apologies. Instead of 'application' it would be better to say 'format', so that, say, ODF could always have its content in another place. The point here is that mixing these would seem unnecessary, though there may be a good use case; this could be discussed.

    > -------------
    > A similar question applies to 3 (16) but there also appears to be a typo in that section.

    Same comment applies, typo noted.

    > -------------
    > If the possible values of delta:edit-operation can be defined "by a particular editing application" how does this assist in interoperability? What is the expected behaviour of an implementation that encounters such a value on reading? With foreign elements/attribute values there is usually the possibility to ignore them. I don't see how one can ignore a single unsupported piece of change tracking info. Please elaborate.

    Again, rather a loose use of 'application'. ODF could define these operations, as seems to be the requirement. Some members of the SC seem to have the opinion that there is a finite list of these that can be defined, i.e. the list in the ECT spec. IMHO the list will grow and some 'extension' mechanism will be needed (what about macros that perform edits, for example?). But then this method could be used for 'odf:format-change' or 'msword:text-to-table' or whatever might be needed.

    Of course, as you say, a msword:xx operation could not necessarily be read by other applications. I see no reason why, in general, a specific unsupported change could not be ignored, as is the case now: when a Word document is converted to ODF, the format changes are ignored. This needs to be specified in detail but it is all tractable.

    > -------------
    > While having delta:removed-content allows the deleted content to be placed elsewhere, it still allows it to be retained in situ. This does not seem to resolve the issue that a 1.2 or earlier implementation when encountering this element needs to understand it to be able to ignore the element content.

    I am not sure I understand your comment. I would not envisage that a 1.2 or earlier implementation could read GCT at all, nor could it read most of ECT either. In the same way, a 1.1 reader cannot understand all of 1.2.

    > -------------
    > My reading of these changes is that these changes move the proposal in the wrong direction: writers have even more choices so implementing a reader implementation becomes even more difficult. Note that a writer implementation always knows what change it tries to record. It is the reader implementation that has to determine it from the file content.

    I hope the above comments clarify this: ODF can tie down the choices as much as it wishes. I agree that making life easier for the reader is always the priority. I see GCT (perhaps defined as a module of ODF) as potentially being useful for other XML formats, and they may make different choices. It would be interesting to push it further and have a representation where the changes are separate from the document. Svante's work points in that direction and it is a valid choice in some situations.

    Robin

    > Andreas

    --
    Robin La Fontaine, Director, DeltaXML Ltd
    Change control for XML
    T: +44 1684 592 144 E: robin.lafontaine@deltaxml.com
    http://www.deltaxml.com
    Registered in England 02528681 Reg. Office: Monsell House, WR8 0QN, UK
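
    The exchange above about namespaced delta:edit-operation values (such as 'odf:format-change' or 'msword:text-to-table') can be illustrated with a minimal sketch of a reader that keeps the operations it knows and skips the rest. Only the attribute name delta:edit-operation and the two example values come from the discussion; the namespace URI, the delta:change-transaction element name, and the set of supported operations are assumptions made for illustration, not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element name, chosen for illustration only.
DELTA_NS = "urn:example:delta"
TRANSACTION = f"{{{DELTA_NS}}}change-transaction"
EDIT_OP = f"{{{DELTA_NS}}}edit-operation"

# Operations this imaginary reader claims to understand (an assumption).
SUPPORTED = {"odf:format-change"}

SAMPLE = f"""
<changes xmlns:delta="{DELTA_NS}">
  <delta:change-transaction delta:edit-operation="odf:format-change"/>
  <delta:change-transaction delta:edit-operation="msword:text-to-table"/>
</changes>
"""


def classify_transactions(xml_text):
    """Return (understood, ignored) lists of edit-operation values."""
    understood, ignored = [], []
    for transaction in ET.fromstring(xml_text).iter(TRANSACTION):
        operation = transaction.get(EDIT_OP)
        if operation in SUPPORTED:
            understood.append(operation)
        else:
            # An unknown operation is skipped rather than treated as an error.
            ignored.append(operation)
    return understood, ignored


if __name__ == "__main__":
    print(classify_transactions(SAMPLE))
```

    Whether skipping a single transaction leaves the remaining change history consistent is exactly the question Andreas raises; the sketch only shows the mechanical part of telling known from unknown values.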


  • 4.  Re: [office-collab] Groups - Proposal for generic representation of tracked changes for ODF (generic-ct-proposalV5-updates.odt) uploaded

    Posted 09-15-2011 23:21
    Hi Robin,

    It hadn't occurred to me that by "application" you mean "format", so that ODF would be one application of some more general GCT. As far as I am concerned this subcommittee deals with ODF. So I read the proposal within the context of ODF, and if choices are given to the writer I would assume that these choices are intended to be available inside ODF. Perhaps we need a translation of the GCT proposal to ODF.

    On Wed, 2011-09-14 at 02:25 -0600, Robin LaFontaine wrote:
    > On 13/09/2011 15:42, Andreas J. Guelzow wrote:
    > > -------------
    > > While having delta:removed-content allows the deleted content to place
    > > elsewhere, it still allows it to be retained in situ. This does not seem
    > > to resolve the issue that a 1.2 or earlier implementation when
    > > encountering this element needs to understand it to be able to ignore
    > > the element content.
    > I am not sure I understand your comment. I would not envisage a 1.2 or
    > earlier implementation could read GCT at all, nor could it read most
    > of ECT either. In the same way 1.1 reader cannot understand all of
    > 1.2.

    Of course an ODF 1.2 consumer would not be able to interpret the GCT elements within an ODF 1.3 file. But if that consumer views the GCT elements inside the ODF 1.3 file as foreign elements, I would hope that the result would make some sense. So character content should not remain in situ but should be moved somewhere else.

    Andreas

    --
    Andreas J. Guelzow, PhD, FTICA
    Concordia University College of Alberta
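
    The concern about in-situ content can be sketched as follows, assuming an older consumer that does not know the delta namespace and processes the character content of an unknown element inside a paragraph as if its tags were absent. The delta namespace URI and the `doc` wrapper element are hypothetical, and the two sample fragments are invented for illustration; only the element name delta:removed-content comes from the discussion.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DELTA_NS = "urn:example:delta"  # hypothetical namespace URI

# Deleted text "cruel " kept in situ, inside the paragraph.
IN_SITU = (
    f'<doc xmlns:text="{TEXT_NS}" xmlns:delta="{DELTA_NS}">'
    "<text:p>Hello <delta:removed-content>cruel </delta:removed-content>world</text:p>"
    "</doc>"
)

# The same deleted text moved out of the paragraph.
MOVED = (
    f'<doc xmlns:text="{TEXT_NS}" xmlns:delta="{DELTA_NS}">'
    "<delta:removed-content>cruel </delta:removed-content>"
    "<text:p>Hello world</text:p>"
    "</doc>"
)


def paragraph_text_seen_by_old_consumer(xml_text):
    """Text of the first paragraph when unknown tags are dropped but their content kept."""
    paragraph = ET.fromstring(xml_text).find(f"{{{TEXT_NS}}}p")
    return "".join(paragraph.itertext())


if __name__ == "__main__":
    print(paragraph_text_seen_by_old_consumer(IN_SITU))  # deleted text reappears
    print(paragraph_text_seen_by_old_consumer(MOVED))    # deleted text stays hidden
```

    Under that assumption the in-situ variant shows the deleted word again, while the moved variant degrades to the current text of the paragraph.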


  • 5.  RE: [office-collab] Groups - Proposal for generic representation of tracked changes for ODF (generic-ct-proposalV5-updates.odt) uploaded

    Posted 10-17-2011 20:53
    I apologize for my lateness in commenting. It's taken quite a while to catch up from missing September. A few questions...

    In section 3 Definitions, #15 (caching deleted content) and #16 (caching attribute changes) each say that only one of the two methods (in situ vs. cached elsewhere) must be chosen. Can those choices be made independently or is the choice made once for both?

    Would an implementation need to apply an XSLT on every document load or parse the document before determining whether one is needed? There is no provision for a doc-level property indicating what storage choice was made and thus whether an implementation really needs to apply the XSLT. Of course, load code ought not blindly trust such a property anyway.

    What about SAX parsers? I believe applying an XSLT requires having the entire XML document in memory, which essentially implies DOM, or at least something DOM-like. If so, wouldn't that preclude an implementation from taking advantage of the performance and memory benefits of SAX streamed reading?

    Thanks,
    John
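
    The question of parsing a document just to find out which storage choice was made can be made concrete with a streaming sketch. It is only an illustration under loud assumptions: the delta namespace URI is hypothetical, and the rule "a delta:removed-content inside a text:p counts as in situ, anything else counts as cached elsewhere" is a simplification invented here, not something the proposal states.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DELTA_NS = "urn:example:delta"  # hypothetical namespace URI


class StorageChoiceDetector(ContentHandler):
    """Counts removed-content elements found inside and outside paragraphs."""

    def __init__(self):
        super().__init__()
        self.paragraph_depth = 0
        self.in_situ = 0
        self.elsewhere = 0

    def startElementNS(self, name, qname, attrs):
        if name == (TEXT_NS, "p"):
            self.paragraph_depth += 1
        elif name == (DELTA_NS, "removed-content"):
            # Simplifying assumption: "in situ" means nested in a paragraph.
            if self.paragraph_depth > 0:
                self.in_situ += 1
            else:
                self.elsewhere += 1

    def endElementNS(self, name, qname):
        if name == (TEXT_NS, "p"):
            self.paragraph_depth -= 1


def detect_storage_choice(xml_bytes):
    """Return (in_situ_count, elsewhere_count) after one streaming pass."""
    detector = StorageChoiceDetector()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)  # deliver (uri, localname) names
    parser.setContentHandler(detector)
    parser.parse(io.BytesIO(xml_bytes))
    return detector.in_situ, detector.elsewhere


SAMPLE = (
    f'<doc xmlns:text="{TEXT_NS}" xmlns:delta="{DELTA_NS}">'
    "<text:p>a<delta:removed-content>b</delta:removed-content></text:p>"
    "<delta:removed-content>c</delta:removed-content>"
    "</doc>"
).encode("utf-8")

if __name__ == "__main__":
    print(detect_storage_choice(SAMPLE))
```

    The pass itself needs no in-memory tree, but it still has to read the whole document before the answer is known, which is the cost a doc-level property would avoid.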


  • 6.  Re: [office-collab] Groups - Proposal for generic representation of tracked changes for ODF (generic-ct-proposalV5-updates.odt) uploaded

    Posted 10-18-2011 16:53
    Answers inline below.

    Robin

    On 17/10/2011 21:52, John Haug wrote:
    > I apologize for my lateness in commenting. It's taken quite a while to catch up from missing September. A few questions...
    >
    > In section 3 Definitions, #15 (caching deleted content) and #16 (caching attribute changes) each say that only one of the two methods (in situ vs. cached elsewhere) must be chosen. Can those choices be made independently or is the choice made once for both?

    These are independent, I believe. I cannot think of a reason why a choice for one would dictate a choice for the other, though it might influence it.

    > Would an implementation need to apply an XSLT on every document load or parse the document before determining whether one is needed? There is no provision for a doc-level property indicating what storage choice was made and thus whether an implementation really needs to apply the XSLT. Of course, load code ought not blindly trust such a property anyway.

    A doc-level property might be useful, though if ODF always required one method then a reader could always expect/trust that it would be that way, and only apply a transform if it preferred it the other way. I think a doc would be considered invalid if the doc-level property was wrong.

    > What about SAX parsers? I believe applying an XSLT requires having the entire XML document in memory, which essentially implies DOM, or at least something DOM-like. If so, wouldn't that preclude an implementation from taking advantage of the performance and memory benefits of SAX streamed reading?

    There is quite a lot of work going on (in the XSLT community) to enable streaming XSLT so that the doc does not all need to be in memory. That said, Saxon uses a tree representation which is considerably smaller than DOM. I am not sure at this stage whether the XSLT could be streamed in this particular case; it depends on the transformation. However, moving info from cached to in situ would likely not be streamable.

    FYI we use a lot of pipelined XSLT transformations and they are now pretty efficient, though of course streaming is faster.

    > Thanks,
    > John