OpenDocument - Adv Document Collab SC

  • 1.  Compromise on change-tracking proposals

    Posted 08-20-2011 22:45
    Dear SC,

    After a thorough analysis of the ongoing discussion regarding change-tracking (CT), I would like to suggest a compromise between the two existing proposals. Comparing them, I liked aspects of both: the generic approach of GCT and, from ECT, its emphasis on simplicity and its focus on extending existing functionality. My goal was to combine these advantages and, in addition, to allow the merging of changes coming from different documents. This will improve the usability of ODF and allow the reuse of CT work for the upcoming real-time collaboration.

    The solution is quite simple: both proposals were trying to optimize the serialization of changes (change-tracking) without first specifying what a change in ODF might be. By first specifying the possible ODF changes in the ODF spec, and then only referring to those changes from the ODF document, there is suddenly much room for improvement.

    In this mail I will condense the work of months, starting with a refactoring of the GCT approach. As we know, GCT allows the addition of annotations to GCT changes (e.g. move column). This helps ODF users and applications to understand the semantics of the change. We might say that one type of annotation identifies a recurring change of ODF. We can reduce the complexity of the GCT proposal by splitting a GCT change into two parts: instead of defining one XML change by GCT, we split the definition into an ODF change event and its related ODF XML change. This allows the following design steps:

    1. In our ODF specification we will explain the ODF XML change of an ODF change event (with GCT).
    2. In the ODF document only a reference to the ODF change event of the spec remains (e.g. add-table, add-row), together with the necessary variable parameters (e.g. the start and end position of the column to be moved).
    3. The changes are moved out of the content.xml and styles.xml into an undo.xml file. This is to allow applications to load either the ODF content or the changes.
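    A minimal sketch of what step 3 could mean for a consumer, assuming the ODF package is read as an ordinary ZIP archive and that a file named undo.xml sits next to content.xml and styles.xml as proposed in this mail. The XML payloads and the helper names (build_package, load_parts) are hypothetical placeholders; undo.xml is only a proposal here, not part of a published ODF version.

```python
import io
import zipfile


def build_package() -> bytes:
    """Build a tiny in-memory package; the XML payloads are placeholders."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("content.xml", "<content>final state</content>")
        pkg.writestr("styles.xml", "<styles/>")
        pkg.writestr("undo.xml", "<changes/>")  # proposed file, hypothetical
    return buf.getvalue()


def load_parts(package: bytes, wanted: set) -> dict:
    """Read only the requested parts; parts missing from the package are skipped."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        available = set(pkg.namelist())
        return {
            name: pkg.read(name).decode("utf-8")
            for name in sorted(wanted & available)
        }


package = build_package()
# A viewer only needs the final document state ...
viewer_parts = load_parts(package, {"content.xml", "styles.xml"})
# ... while a history tool can read the changes on their own.
history_parts = load_parts(package, {"undo.xml"})
```

    Under these assumptions a viewing application never has to touch the change data, which is the separation of responsibilities the step describes.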
    By taking steps 1 and 2 we are able to move the quite verbose XML change from the document to the specification, explaining the ODF change event only once, in the specification. By taking step 3 we keep only the final state of the document in the content.xml and styles.xml files and add a new file for changes, splitting responsibilities between files as done before (content.xml, styles.xml, meta.xml, settings.xml).

    But a second important concept, the concept of components, has to be introduced. It is the definition of implicitly known groupings of ODF XML (e.g. table, paragraph, cell, section, etc.). A component is defined to be a disjunctive module of an ODF document. Disjunctive means that changing the status of the component will only change the component itself, although it will indirectly change the status of the ancestor components. For example, deleting an image at root level will change the status of the overall document, but not of any other component, such as a table at the end. A document can be seen as a tree of components.

    (Note: There are not only components in a document, but also views. Views just mirror the state of one or more components. One example is the table of contents, which is a view of headings. Another example is rows and columns, which are views of table cells. Rows and columns are views and not components, because changing a row will change the status of a column as well.)

    Having disjunctive components helps us to reduce the complexity of the collaboration. By dividing the document, we are dividing the problem as well and therefore diminishing it, see http://plugfest.opendocsociety.org/lib/exe/fetch.php?media=plugfests:201007_berlin:odfplugfest2011-sschubert.pdf (164 KB).

    Components are necessary to identify the location of the change in the document. Only via components is it possible to reference a change in a way that does not depend on XML details (e.g. XPath). Otherwise knowledge of the XML would be required at run-time.
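    To make the component tree a little more concrete, here is a minimal sketch, assuming components are addressed by 1-based child indexes (as in the 3/12 notation used later in this mail). The Component class, its revision counter (a stand-in for "status") and the method names are hypothetical and only serve to show that a change affects the component and its ancestors, but no sibling.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    kind: str
    children: list[Component] = field(default_factory=list)
    parent: Component | None = field(default=None, repr=False, compare=False)
    revision: int = 0  # hypothetical stand-in for the component "status"

    def add(self, child: Component) -> Component:
        child.parent = self
        self.children.append(child)
        return child

    def resolve(self, path: str) -> Component:
        """Follow a 1-based index path such as '2/1' down the tree."""
        node = self
        for step in path.split("/"):
            node = node.children[int(step) - 1]
        return node

    def touch(self) -> None:
        """Mark this component and all of its ancestors as changed."""
        node = self
        while node is not None:
            node.revision += 1
            node = node.parent

    def delete(self, index: int) -> None:
        """Remove the child at a 1-based index and mark this subtree root changed."""
        removed = self.children.pop(index - 1)
        removed.parent = None
        self.touch()


doc = Component("document")
doc.add(Component("image"))
doc.add(Component("paragraph"))
table = doc.add(Component("table"))
table.add(Component("cell"))

doc.delete(1)  # delete the image at root level
```

    After the deletion only the document itself has a new revision; the table is untouched, although its index path has shifted from 3 to 2. No XPath or other XML detail is needed to reach it.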
    The relative reference is necessary for Operational Transformation (OT). OT will adapt the relative index of a referenced component when someone else has added or removed a preceding sibling. For example, if I want to delete the third row, but someone has added a new row as the second row, I still want to delete the same row, which is now the fourth. For this reason components are vital for merging changes and for real-time collaboration, which both depend on the usage of OT.

    To summarize the design so far: The ODF XML change is no longer part of the content.xml and styles.xml files. A change has been abstracted to change-events, hiding the details of the ODF XML change. The ODF change-events will be defined in the ODF specification, especially including the mapping to the ODF XML change (and vice versa) via GCT. The idea is to serialize the change-events into a single file at document root level called undo.xml (adjacent to content.xml and styles.xml).

    Why save the undo events and not the changes done? To answer this, let me go back to the main difference between change-tracking and real-time collaboration events. An ODF document can be seen as a frozen entity of a living document. It reflects one of all the possible ODF states that a living document might have. When a document is loaded into an ODF application, it can be represented by any arbitrary internal model (e.g. web office, mobiles, etc.). This works as long as the document can be changed (as the user would usually expect) and as long as it represents a valid ODF state.

    Let us assume that an ODF application is only able to change the state of its ODF model, which represents a valid document, by a change event. Those change events can be seen as transitions from one valid state to another, for example adding or deleting a character, a table, etc. Real-time collaboration and change-tracking are both based on change events, but there is a difference.
    The relation between real-time collaboration and change-tracking events is similar to the relation between a function and its inverse function. For example, when changing the background-color of a paragraph from green to red, the collaboration event would be "change background-color on paragraph XY to red". The change-tracking event would be quite the opposite; similar to an undo event, it would be "change background-color on paragraph XY to green". We can state that for change-tracking the undo event of a change is mandatory (directed into the past), while collaboration requires the do event of a change (directed into the future). By saving the undo event, the original values before a change are kept.

    There are two ways to avoid redundancy:

    1. Saving the start document and the do changes, or
    2. Saving the end document and the undo changes.

    With our existing content.xml and styles.xml the complete final state of the document is saved. Therefore it seems natural to save the undo changes, as all previous document states can be recreated from content.xml, styles.xml and undo.xml. Even all do events can be recreated from this information, by using the state of a component, the undo event and the previous state of the component.

    Saving the undo events most often results in a smaller file than saving the do events. This is because the most frequent user scenario is adding new content (do event: insert), which includes a large amount of text and structure. By saving the undo.xml, only the very small inverse (undo event: delete) will be saved. The inserted text and structure are already in content.xml and styles.xml and would therefore be redundant to save. A further advantage is that change-tracking only uses undo events.
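    The function/inverse relation described above can be sketched as follows. This is only an illustration, assuming a flat property dictionary as the document model; the SetProperty event and the helpers apply_event and undo_for are hypothetical names and are not taken from either proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetProperty:
    target: str  # e.g. "paragraph XY"
    name: str    # e.g. "background-color"
    value: str


def apply_event(state: dict, event: SetProperty) -> dict:
    """Return a new state with the event applied; the input is not modified."""
    new_state = dict(state)
    new_state[(event.target, event.name)] = event.value
    return new_state


def undo_for(state: dict, event: SetProperty) -> SetProperty:
    """Derive the inverse of `event` from the state *before* it is applied."""
    return SetProperty(event.target, event.name, state[(event.target, event.name)])


start = {("paragraph XY", "background-color"): "green"}
do_event = SetProperty("paragraph XY", "background-color", "red")

undo_event = undo_for(start, do_event)  # keeps the original value "green"
end = apply_event(start, do_event)      # the final state that would be saved

# From the end state plus the undo event, both the start state and the
# original do event can be recreated.
recreated_start = apply_event(end, undo_event)
recreated_do = undo_for(end, undo_event)
```

    Storing the end state together with the undo event is therefore enough to recover the previous state and the forward (do) event, which is the second of the two options listed above.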
    Let's give an example of the undo.xml:

    Starting markup:

        <text:p>123456789ABCDEFGH</text:p>
        <text:p>123456789ABCDEFGH</text:p>

    Ending markup:

        <text:p>123456EFGH</text:p>
        <text:p>123*some text*456789ABCDEFGH</text:p>

        <changes>
            <delete s="1/7" e="1/13"/>
            <insert s="2/4">*some text*</insert>
        </changes>

    Description: The RelaxNG schema and an explanation of all XML nodes of the undo.xml file (the resulting XML change) will be in a new part of our ODF specification. The attributes s (start) and e (end) identify components. If only the attribute s is present, it refers to a single component. If both attributes s and e are used, it is a selection of multiple components within the component tree (ordered as in the XML). The numbers represent the location of the component in the document component tree. Even a single character is treated as a component. As in XML, counting starts with 1. For example, the 12th character within the 3rd paragraph of a text document is addressed as 3/12. An insertion at this position would place the new component in front of the selected one (e.g. the 12th character).

    (Note: I am not yet certain whether a preceding label specifying the type of component would be helpful, e.g. p3/c12 for 3rd paragraph/12th character. The prefix would be redundant, but it is more human-readable and might simplify the mapping into internal models that differ from ODF. The label would provide the information immediately, without the need to look it up in content.xml.)

    The sequence of the change-events is chronological. But change-events might be grouped by further <changes> elements, even creating a tree hierarchy. For instance, the changes of a document (e.g. our specification) can be sorted by ODF errata version, subdivided by ISO national body issues, subdivided by issue numbers.

    (Note: Let's say an ODF application supports a history feature, allowing the user to go back to previous versions.
    For example, we have a specification of ODF 1.2, including the undo.xml back to the ODF 1.1 and ODF 1.0 specifications. Now a user is able to save the document as the previous version, ODF 1.1, with fewer changes in the undo.xml (now only from ODF 1.1 to 1.0), but now also with a do.xml file (containing the changes from ODF 1.1 to ODF 1.2) that records how to get to ODF 1.2. This would give the user the following options to choose from:

    1. Changes made in the current working version (in this example ODF 1.1) will also affect following versions (ODF 1.2). For example, this would be handy for fixing typos, etc.
    2. Changes made in the current working version (in this example ODF 1.1) only affect the current version (ODF 1.1). For example, this would be handy to fix only the ODF 1.1 version, while keeping ODF 1.2 unchanged.
    3. Changes made in the current working version (in this example ODF 1.1) would create a new in-between version (ODF 1.1.1). For example, this would be handy to provide a fixed version, while keeping ODF 1.1 and ODF 1.2 unchanged.)

    Finally I would like to mention the perhaps unexpected positive side effects we might have with the above compromise:

    Changes made in real-time collaboration scenarios can use the same change-events as specified for change-tracking.

    The change-events provide an abstraction from the ODF XML details. Therefore ODF applications with different run-time models may use change-events as a lingua franca. In addition, changes made during off-line real-time collaboration can be saved in the same way as change-events in change-tracking, via content.xml, styles.xml and undo.xml.

    By defining components and events, we would be able to create profiles of ODF, as we now know what to add to or leave out of a profile, and we can define the capabilities of an ODF application more precisely.

    Transformations to other formats (e.g. HTML) could rely on the high-level abstraction of change-events. Therefore there is no longer a need to do the mapping to ODF XML.
    Or even more: if someone provided HTML with a similar component tree and change-events, abstracting the HTML/XML implementation details, the transformation would just be a matter of connecting similar events (e.g. odf.insertTable to html.insertTable). More interesting for office applications, this would apply to OOXML as well!

    ODF would be easier to understand. It will be a catalyst for our ODF ecosystem.

    Follow up: I will send two mails with examples of some use cases and of some more advanced scenarios that I would like to share. If you have any questions or comments on the above, please feel free to contact me.

    Regards,
    Svante


  • 2.  Re: [office-collab] Compromise on change-tracking proposals

    Posted 08-30-2011 09:11
    Svante Schubert wrote:
    > After a thorough analysis of the ongoing discussion regarding
    > change-tracking (CT), I would like to suggest a compromise between
    > the two existing proposals.
    >
    Hi Svante, all,

    first off, I like your fresh view on this, and especially the broader scope of collaboration you address. Please allow me a few comments inline.

    > 3. The changes are moved out of the "content.xml" and "styles.xml" into
    > an "undo.xml" file. This is to allow applications to load either the
    > ODF content or the changes.
    >
    This is nice in and of itself. It makes it trivial for e.g. viewing applications to ignore document history, and it conveniently abstracts away change tracking from any specific rendering of a document to ODF. It's also quite terse. :)

    > The ODF change-events will be defined in the ODF specification,
    > especially including the mapping to the ODF XML change (and vice
    > versa) via GCT.
    >
    If that yields a good, descriptive way to document change event semantics, sure. Though it's not an end in itself, and I see the same ambiguity problems for complex change events as with GCT in general. At the very least, I'd refuse mandating detailed infoset changes, unless we're sure ODF is non-ambiguous in the document feature at hand.

    > 1. Changes made in real-time collaboration scenarios can use the same
    > change-events as specified for change-tracking.
    >
    A clear winner for me.

    I'd actually like to see a few more examples of the more complex use cases from http://wiki.oasis-open.org/office/ChangeTrackingUseCases - especially those where one of the current contestants shows issues. I think that would help us all to better appreciate your proposal.

    Cheers,
    -- Thorsten Behrens
    Novell GmbH, Nördlicher Zubringer 9-11, 40470 Düsseldorf; GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 21108 (AG Düsseldorf)


  • 3.  Re: [office-collab] Compromise on change-tracking proposals

    Posted 09-12-2011 15:38
    Svante,

    I have read through your compromise solution, and I find it hard to understand this as a compromise, but it is certainly interesting as a new proposal. As you know, we are out of time to consider completely new approaches, unless we are directed to do so by the TC. This was mentioned briefly in the TC call today, and there is no direction from them to change our course at the moment.

    However, your comments on the other proposals from your perspective of needing to implement a collaboration method, are very relevant and useful. I have read through your proposal in detail, and also the references that you have previously provided. I have also had some discussions with you to better understand the methodology. I know you still have a lot of work to do to develop this. Please keep us informed, as you develop your approach, on how your work might influence our choice of approach for change tracking.

    Robin

    On 20/08/2011 23:43, Svante Schubert wrote:
    > Dear SC, After a thorough analysis of the ongoing discussion regarding
    > change-tracking (CT), I would like to suggest a compromise between the
    > two existing proposals.
    > ..snip
    > If you have any question or comments on the above, please feel free to
    > contact me. Regards, Svante

    --
    Robin La Fontaine, Director, DeltaXML Ltd
    Change control for XML
    T: +44 1684 592 144 E: robin.lafontaine@deltaxml.com
    http://www.deltaxml.com
    Registered in England 02528681 Reg. Office: Monsell House, WR8 0QN, UK


  • 4.  Re: [office-collab] Compromise on change-tracking proposals

    Posted 09-12-2011 18:28
    On Mon, 2011-09-12 at 09:37 -0600, Robin LaFontaine wrote:
    > Svante,
    >
    > I have read through your compromise solution, and I find it hard to
    > understand this as a compromise, but it is certainly interesting as a
    > new proposal. As you know, we are out of time to consider completely
    > new approaches, unless we are directed to do so by the TC. This was
    > mentioned briefly in the TC call today, and there is no direction from
    > them to change our course at the moment.

    While there may be TC members that do not _want_ to consider other proposals or anything but an "either ECT or GCT", until the TC has approved a change tracking proposal we really should not ignore other suggestions. In fact, we know that even after a change tracking proposal has been approved (as for example the change tracking currently contained in ODF 1.2), the TC is of course free to consider changing that (as we are doing at the moment).

    >
    > However, your comments on the other proposals from your perspective of
    > needing to implement a collaboration method, are very relevant and
    > useful. I have read through your proposal in detail, and also the
    > references that you have previously provided. I have also had some
    > discussions with you to better understand the methodology. I know you
    > still have a lot of work to do to develop this. Please keep us
    > informed, as you develop your approach, on how your work might
    > influence our choice of approach for change tracking.
    >

    During the last months there have been many discussions on whether the considered approaches (ECT or GCT) can encode the required use cases. Unfortunately, as far as I am concerned, we appear to be ignoring the initial concern that at least the GCT approach appears to be too unrestricted to be reasonably implementable. I do realize that it was suggested that restrictions can be added, but I have yet to see any proposal to that effect. (And in my mind any addition of such restrictions would turn GCT essentially into ECT.)

    Andreas

    --
    Andreas J. Guelzow, PhD, FTICA
    Concordia University College of Alberta


  • 5.  Re: [office-collab] Compromise on change-tracking proposals

    Posted 09-22-2011 19:54
    Robin,

    You are right, it is best to ask the TC for permission. There is no need to discuss whether it is a compromise or a new proposal; still, I would like to explain why I think it would be a good compromise.

    From my understanding, ECT was submitted because the elegant simplicity of GCT only holds for ODF applications based on an XML model. Applications like OpenOffice or MS Office 'need a good K.I. [AI] to find the mapping from the XML to their model', to quote one of their developers. ECT therefore takes the document-level approach, but neglects the beauty of a generic approach.

    Allow me a little magic trick. We start with chapter 5.2 of your latest updated draft:

    > 5.2 Edit-operation
    > It may be difficult for an editing application to determine the editing
    > operation that was applied in order to achieve a particular set of
    > changes represented by a CT, or CT group or a CT stack. Therefore
    > editing operations can be defined in the standard and identified using
    > the delta:edit-operation attribute. For each definition of an edit
    > operation, there will be rules to say what type of change is allowed.
    > For example, an edit operation defined as a 'text-replacement' would not
    > be allowed to include the deletion of a paragraph, or the change of a
    > format attribute. This mechanism could easily be extended so that a
    > given editing application which has an operation that is not defined in
    > the standard, would be able to create a new definition perhaps using its
    > own namespace as a prefix.

    You say there might be a label for the operation, as it might be difficult for an editing application to identify the editing operation. If the label is an addition that sometimes helps, why not always provide it to enhance usability, and make it mandatory so that applications can rely on it? We should also be aware that those labels have to be standardized; otherwise Apache OOo might use the label deleteColumn, while LibreOffice uses columnDelete. A potential nightmare for interoperability.
    So if we agree to standardize all labels of operations, our two proposals are close to equal. The difference is: I drop the redundant, verbose XML change and keep only the CT labels in the document, move them out of the content into their own file undo.xml, and specify the XML change of each label at TC level in a specification. This saves a lot of space in the document and allows ODF applications with no XML model to deal with CT on a higher abstraction level.

    In the end I only refactored your GCT approach. The generic design came from GCT, the high-level requirement from ECT, which is why I described it as a good compromise.

    Best regards,
    Svante


  • 6.  Re: [office-collab] Compromise on change-tracking proposals

    Posted 09-30-2011 08:16
    On 22/09/2011 20:53, Svante Schubert wrote:
    > Allow me a little magic trick.
    > ...
    > So if we agree to standardize all labels of operations, our two
    > proposals are close to equal. The difference is: I neglect the
    > redundant verbose XML change, only keeping the CT labels in the
    > document, move them out of the content into their own file "undo.xml"
    > and specify the XML change of each label on TC level in a
    > specification. Saving a lot of space in the document and allowing ODF
    > applications with no XML model to deal with CT on a higher abstraction
    > level.

    That is quite a big magic trick, as you call it! However, I do agree this is worth looking at. I think there could be a way of moving all the markup to another file, which has advantages in some (not all!) situations. I am not so sure you can simply ignore the details of the changes (though maybe I have not understood you properly here). Some worked examples will make that clear, and it would be worth investigating. See also the comments in the 'Two items for voting...' thread.

    I hope you will be able to address this in the document you are working on that describes your approach.

    Robin

    --
    Robin La Fontaine, Director, DeltaXML Ltd
    "Change control for XML"
    T: +44 1684 592 144 E: robin.lafontaine@deltaxml.com
    http://www.deltaxml.com
    Registered in England 02528681 Reg. Office: Monsell House, WR8 0QN, UK