OASIS LegalDocumentML (LegalDocML) TC

  • 1.  Redlining Discussion Slides and Background Info

    Posted 10-24-2012 07:35
    Here is a list of issues I would like to discuss during the TC meeting today. It's a follow-on from my previous presentation a number of weeks back on the subject. Hopefully we can start closing some of the issues I have come up with:

    1) Redlining shown within an active modification. For example, in California all amendments to codes are always complete sections. It is not permitted to amend just a word or sentence. The old section must be replaced with a new section, even when the change is just a few words. This makes it clearer to the legislators what is being proposed. To make it even clearer, they show the words within the section that are actually changing using strikeout and insert notation. But this is informative redlining and does not correspond directly to either active or passive redlining. When you see this notation inside a section being amended inside a bill, it does not necessarily mean that this is a change from a previous version of a bill. Instead it is information about how the amended section actually differs.

    2) Merge/Split paragraphs. I believe there is now a mechanism to handle this. I saw mention of it a few weeks ago. I'm not clear of the details. In my case, the redlining notation used can be separated from the actual modification recorded as a modification. The redlining reflects what the change is according to local custom instead of literally containing the change. This is what Fabio said and I agree with him.

    3) Complex edits. This is a difficult subject. In California, it is quite common for edits to be made that violate the XML hierarchy. For example, two bill sections might merge into one by striking the second half of the first bill section and the first half of the second bill section. The drafter then assumes that the two remaining halves will make a single joined-together section. This makes no sense from an XML point of view, but makes total sense if you're a drafter and only care about the paper.

    When we built the California system, we tried to argue that this practice had to change. We were told that we had to support it one way or another. This was not negotiable. Our solution was to rely on XMetaL's change control mechanism, which records deletions inside processing instructions and insertions bounded by two processing instructions, and to write some very complex software to magically glue bill sections together when this sort of situation arose. And the scenario I outlined is the simple case - there are much more complex scenarios involving heading levels in more complex documents.

    But the solution I used relied on the builtin change control mechanism that was built independent of our schema. With this approach we were able to handle all the possible scenarios without having to pollute the schema with lots and lots of "choices" to allow deletions and insertions to be placed anywhere in the document. I think that this solution was the only way I could have solved the problem. If I had tried to invent some other mechanism that avoided XMetaL's builtin processing instruction solution, the complexity of the problem would have made solving it impossible in an editing environment. So I don't think that trying to find a solution for this within Akoma Ntoso is a good idea. It is better to rely on the intrinsic (builtin) change control that the editor provides. The biggest drawback though is that the insert and delete markings become proprietary solutions for a particular editor, using processing instructions that are only understood by that editor and the solution built around it.

    4) I believe that I can map all other redlining scenarios into Akoma Ntoso with little problem. Related to this subject is our amendment practice, which is page and line number oriented. That is how it is done in virtually all US jurisdictions and it's not likely to change soon - page and line numbers are here to stay. But, like redlining, we have solved this problem without affecting the schema and I would propose that we not encumber the schema with page and line number issues.

    -Grant

    On Tue, Oct 23, 2012 at 11:40 AM, monica.palmirani <monica.palmirani@unibo.it> wrote:

    > Hi Grant,
    > OK, can you provide a list of these issues so that Fabio and I can think about them?
    > I would also like to put them on the agenda to inform the TC members.
    >
    > Thanks
    > mp
    >
    > On 23/10/2012 20:37, Grant Vergottini wrote:
    >
    > > Hi Monica,
    > >
    > > I would like to do a recap of the issues with redlining. Some issues have been addressed (split/merge), some issues I have a better idea how they might be addressed, and some issues remain open. I would like to discuss which issues are in which category.
    > >
    > > -Grant
    > >
    > > On Tue, Oct 23, 2012 at 9:56 AM, monica.palmirani <monica.palmirani@unibo.it> wrote:
    > >
    > > > Hi Grant,
    > > > tomorrow there is the LegalDocML TC. You mentioned in the last telecon that you have some pending issues to discuss.
    > > > Can you list them so that I can include them in the agenda and in the communication?
    > > >
    > > > Thanks a lot,
    > > > Monica
    > > >
    > > > --
    > > > ===================================
    > > > Associate professor of Legal Informatics
    > > > School of Law
    > > > Alma Mater Studiorum Università di Bologna
    > > > C.I.R.S.F.I.D. http://www.cirsfid.unibo.it/
    > > > Palazzo Dal Monte Gaudenzi - Via Galliera, 3
    > > > I - 40121 BOLOGNA (ITALY)
    > > > Tel +39 051 277217
    > > > Fax +39 051 260782
    > > > E-mail monica.palmirani@unibo.it
    > > > ====================================

    --
    ____________________________________________________________________
    Grant Vergottini
    Xcential Group, LLC.
    email: grant.vergottini@xcential.com
    phone: 858.361.6738

    Attachment: California Redlining Examples.pptx
    Description: application/vnd.openxmlformats-officedocument.presentationml.presentation
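
The processing-instruction approach described in item 3 can be illustrated with a minimal Python sketch. It assumes three hypothetical PI names (`trk-del`, `trk-ins-start`, `trk-ins-end`); these are placeholders, not XMetaL's actual vocabulary, and the sketch only handles insertions whose start and end markers share a parent. It does not attempt the section-gluing logic mentioned above.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical PI targets, chosen for illustration only.
DEL, INS_START, INS_END = "trk-del", "trk-ins-start", "trk-ins-end"


def _pis(node):
    """Collect every processing instruction below node, in document order."""
    found = []
    for child in node.childNodes:
        if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            found.append(child)
        else:
            found.extend(_pis(child))
    return found


def resolve(xml_text, accept):
    """Return the XML with all tracked changes accepted or rejected."""
    doc = minidom.parseString(xml_text)
    for pi in _pis(doc):
        parent = pi.parentNode
        if parent is None:
            continue  # already removed together with a rejected insertion
        if pi.target == DEL and not accept:
            # Rejecting a deletion restores the struck text held in the PI.
            parent.insertBefore(doc.createTextNode(pi.data), pi)
        elif pi.target == INS_START and not accept:
            # Rejecting an insertion drops the siblings up to the end marker.
            node = pi.nextSibling
            while node is not None and not (
                node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == INS_END
            ):
                following = node.nextSibling
                parent.removeChild(node)
                node = following
        # In every case the marker itself disappears from the result.
        parent.removeChild(pi)
    doc.normalize()  # merge text nodes left adjacent by the removals
    return doc.documentElement.toxml()


sample = "<p>Dogs <?trk-del must?><?trk-ins-start?>shall<?trk-ins-end?> be licensed.</p>"
print(resolve(sample, accept=True))
print(resolve(sample, accept=False))
```

Because the markers are processing instructions rather than elements, the schema never has to allow for them, which is the property the message relies on.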


  • 2.  Re: [legaldocml] Redlining Discussion Slides and Background Info

    Posted 10-24-2012 09:03
    On 10/24/12 3:34 AM, Grant Vergottini wrote:

    > But the solution I used relied on the builtin change control mechanism that was built independent of our schema. With this approach we were able to handle all the possible scenarios without having to pollute the schema with lots and lots of "choices" to allow deletions and insertions to be placed anywhere in the document. I think that this solution was the only way I could have solved the problem. If I had tried to invent some other mechanism that avoided XMetaL's builtin processing instruction solution, the complexity of the problem would have made solving it impossible in an editing environment. So I don't think that trying to find a solution for this within Akoma Ntoso is a good idea. It is better to rely on the intrinsic (builtin) change control that the editor provides. The biggest drawback though is that the insert and delete markings become proprietary solutions for a particular editor, using processing instructions that are only understood by that editor and the solution built around it.

    Grant:

    This makes perfect sense to me. I wonder if we might create a related schema that could be used to describe the syntax of the proprietary solutions in a standard way -- that is, an interchange standard for processing instructions, such that someone porting could use it to transform PIs from one system to PIs that would work in another. Wouldn't have to be part of AKN, necessarily. But very handy for systems/practice communities that make different assumptions about the granularity of edits and their implicit merging behavior.

    I suspect that numbering practices in the shadow of insertion would raise similar issues ...

    t.

    --
    +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
    Thomas R. Bruce
    @trbruce
    Director, Legal Information Institute
    Cornell Law School
    http://www.law.cornell.edu/
    @liicornell
    +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
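
One way to picture the interchange idea is a table that maps each editor's PI names to a shared set of neutral roles. The Python sketch below is only an illustration under that assumption: the two vocabularies (`EDITOR_A`, `EDITOR_B`) and the role names are invented here and do not describe any real editor, and it assumes the PI payloads are already compatible between the two systems.

```python
import re

# Hypothetical vocabularies: editor-specific PI target -> neutral role.
EDITOR_A = {
    "a-del": "deletion",
    "a-ins-start": "insertion-start",
    "a-ins-end": "insertion-end",
}
EDITOR_B = {
    "b-removed": "deletion",
    "b-added-begin": "insertion-start",
    "b-added-end": "insertion-end",
}

# A processing instruction: target name, then an optional payload.
PI = re.compile(r"<\?([A-Za-z_][\w.-]*)(.*?)\?>", re.S)


def translate(xml_text, source, target):
    """Rewrite PI targets from the source vocabulary to the target one."""
    from_neutral = {role: name for name, role in target.items()}

    def swap(match):
        name, payload = match.groups()
        role = source.get(name)
        if role is None or role not in from_neutral:
            # Unrelated PIs and the XML declaration pass through untouched.
            return match.group(0)
        return f"<?{from_neutral[role]}{payload}?>"

    return PI.sub(swap, xml_text)


text = "<p>Dogs <?a-del must?><?a-ins-start?>shall<?a-ins-end?> be licensed.</p>"
print(translate(text, EDITOR_A, EDITOR_B))
```

Going through neutral roles means each editor needs only one mapping table, rather than one per pair of editors.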


  • 3.  Re: [legaldocml] Redlining Discussion Slides and Background Info

    Posted 10-24-2012 14:26
    I found this article, which seems like a good summary of the various approaches to solving this problem:

    http://www.deltaxml.com/attachment/341-dxml/xml-change-tracking-review.pdf

    It seems that XMetaL, Xopus, and Oxygen all take the processing instruction approach - but none in a standard way. I agree that there should be some standard representation that is editor neutral, allows for data interchange, and can last for much longer than the lifetime of any particular editor. It's a general subject that is much broader than Akoma Ntoso. I'm still looking around to see if anyone has tackled this.

    A related issue that should be solved in a similar manner is page and line number handling. We also address this with processing instructions, by back-annotating the source document with page and line number PIs during the publishing process. As the PDF is generated using XSL-FO, we use a hook to edit the input stream. The result allows us to reference page and line numbers, doesn't pollute the schema, and is very flexible. But again, it's a proprietary solution.

    -Grant
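
To make the back-annotation idea concrete, here is a small Python sketch. It assumes hypothetical `<?page N?>` and `<?line N?>` PIs written into the source at the point where each page and line begins; the real PI names and the XSL-FO hook are not described in the message, so everything below is illustrative only.

```python
import re

# Hypothetical markers written back into the source after pagination.
MARK = re.compile(r"<\?(page|line) (\d+)\?>")
TAG = re.compile(r"<[^>]+>")


def index_lines(annotated):
    """Map (page, line) to the text found on that printed line."""
    parts = MARK.split(annotated)
    index, page, line = {}, None, None
    # parts = [text, kind, number, text, kind, number, text, ...]
    for i in range(1, len(parts), 3):
        kind, number, text = parts[i], int(parts[i + 1]), parts[i + 2]
        if kind == "page":
            page, line = number, None  # a new page restarts line numbering
        else:
            line = number
        if page is not None and line is not None:
            key = (page, line)
            # Strip element tags so only the printed text is kept.
            index[key] = index.get(key, "") + TAG.sub("", text)
    return {key: " ".join(value.split()) for key, value in index.items()}


annotated = (
    "<section><?page 1?><?line 1?><num>1.</num> <?line 2?>Dogs shall be "
    "<?page 2?><?line 1?>licensed annually.</section>"
)
print(index_lines(annotated)[(1, 2)])
```

An amendment that says "on page 1, line 2, strike ..." can then be resolved against this index without the schema knowing anything about pages or lines.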


  • 4.  RE: [legaldocml] Redlining Discussion Slides and Background Info

    Posted 10-24-2012 14:14




    Grant wrote:
    “it does not necessarily mean that this is a change from a previous version of a bill. Instead it is information about how the amended section actually differs.”

    I’m losing the thread of your argument here. If this shows “how the amended section … differs” from a previous version, why is it not “a change from a previous version”?

    Can you give an example of a redlining strike or insert that illustrates this point?
     
    Thanks,
    Roger
     
    Roger Sperberg
     
    973-200-4224 primary
    212-309-8125 office
    roger.sperberg@lexisnexis.com
     
     
     

  • 5.  Re: [legaldocml] Redlining Discussion Slides and Background Info

    Posted 10-24-2012 16:18
    Dear Grant,

    > 1) Redlining shown within an active modification. For example, in California all amendments to codes are always complete sections. It is not permitted to amend just a word or sentence. The old section must be replaced with a new section, even when the change is just a few words. This makes it clearer to the legislators what is being proposed. To make it even clearer, they show the words within the section that are actually changing using strikeout and insert notation. But this is informative redlining and does not correspond directly to either active or passive redlining. When you see this notation inside a section being amended inside a bill, it does not necessarily mean that this is a change from a previous version of a bill. Instead it is information about how the amended section actually differs.

    I believe that the solution I proposed in my first post on the subject can stand:

    * the "period" attribute defined for all body elements should be used to specify the modifications on the text (actual modification). This is exactly what the legislator expects to change from one version to the other of the modified document. Please note that the modifications from one version of a bill to the next are often overlapping and independent of the modifications that the bill is trying to enforce on another enacted document, and should be considered separately.

    * the <ins> and <del> elements defined as inline structures (proper redlining). These are the elements used to specify how the differences should be rendered in print so that the reader understands what is going on, and they should in any case satisfy the local tradition in terms of presentation.

    As such, since in California you always replace whole sections, you will need two sections in the text, one with the old period and the other with the new one. You will then insert <ins> and <del> elements in the text of the new one as you see fit, so that the representation of the text is satisfactory for your users.

    > 2) Merge/Split paragraphs. I believe there is now a mechanism to handle this. I saw mention of it a few weeks ago. I'm not clear of the details. In my case, the redlining notation used can be separated from the actual modification recorded as a modification. The redlining reflects what the change is according to local custom instead of literally containing the change. This is what Fabio said and I agree with him.

    I will not consider the case of the specification of redline (i.e., presentation only): in this case, you are pretty much allowed to do as you prefer.

    As for the specification of the actual modification, I do not understand: you just said that you modify whole sections. In case you actually need to specify smaller scale modifications, and they are the expression of a split and/or joined paragraph, I have a solution for you that is actually a trick: placing the old inline content inside a properly "period-ed" <span>, and placing the span within a paragraph with a newer period, as follows:

    <p period="p1">Some text always belonging to the first paragraph</p>
    <p period="p2"><span period="p1">This text used to belong to the first para but is now in the second one.</span></p>

    This is a dirty trick, because it places old content within a new container, and requires understanding how to rebuild the old version (since it is not obtained by simply removing the elements associated to the new period, but requires some circuitous embedding of content in separate containers). Still, it is better than throwing up your hands and rejecting any change that affects the hierarchical structure of the XML.

    > 3) Complex edits. This is a difficult subject. In California, it is quite common for edits to be made that violate the XML hierarchy. For example, two bill sections might merge into one by striking the second half of the first bill section and the first half of the second bill section. The drafter then assumes that the two remaining halves will make a single joined together section. This makes no sense from an XML point of view, but makes total sense if you're a drafter and only care about the paper.

    There is ample literature on overlapping markup, i.e. markup that needs to go beyond the hierarchical restrictions of XML. A good review of the literature can be found in

    * DeRose S. (2004), Markup Overlap: A Review and a Horse, Extreme Markup Languages 2004, Montréal, Québec, August 2-6, 2004, http://pdf.aminer.org/000/201/061/markup_overlap_a_review_and_a_horse.pdf

    and more modestly, I've been working a little bit on this, too:

    * Marinelli P., Vitali F., Zacchiroli S. (2008), Towards the unification of formats for overlapping markup, New Review of Hypermedia and Multimedia, Volume 14, Issue 1, 2008, pages 57-94, DOI:10.1080/13614560802316145, http://www.tandfonline.com/doi/abs/10.1080/13614560802316145

    * Di Iorio A., Peroni S., Vitali F. (2010), Handling markup overlaps using OWL, in Proceedings of the 17th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2010), Lisbon, Portugal, http://palindrom.es/phd/wp-content/uploads/2010/09/earmark_ekaw2010_cr.pdf

    * Ciancarini P., Di Iorio A., Schirinzi M., Vitali F., Marchetti C. (2012), Detecting Changes on Semi-structured Data: A Natural and Extensible Approach, submitted to ACM Transactions on the Web, attached.

    The problem is that the editing operations are independent of and in contrast with the structure of the text. This is true in many fields, first and foremost word processor applications.

    > When we built the California system, we tried to argue that this practice had to change. We were told that we had to support it one way or another. This was not negotiable. Our solution was to rely on XMetaL's change control mechanism which records deletions inside processing instructions and insertions bounded by two processing instructions and to write some very complex software to magically glue bill sections together when this sort of situation arose. And the scenario I outlined is the simple case - there are much more complex scenarios involving heading levels in more complex documents.
    >
    > But the solution I used relied on the builtin change control mechanism that was built independent of our schema. With this approach we were able to handle all the possible scenarios without having to pollute the schema with lots and lots of "choices" to allow deletions and insertions to be placed anywhere in the document. I think that this solution was the only way I could have solved the problem. If I had tried to invent some other mechanism that avoided XMetaL's builtin processing instruction solution, the complexity of the problem would have made solving it impossible in an editing environment. So I don't think that trying to find a solution for this within Akoma Ntoso is a good idea. It is better to rely on the intrinsic (builtin) change control that the editor provides. The biggest drawback though is that the insert and delete markings become proprietary solutions for a particular editor using processing instructions that are only understood by that editor and the solution built around it.

    I personally do not like the Processing Instruction approach, which makes this kind of annotation very much dependent on the interpretation of a specific application. I prefer the purely XML approaches such as segmentation, milestones, or standoff. In fact, both Open Document (Open Office) and Open XML (Microsoft Word) use this, and in many ways so does Akoma Ntoso. The "period" attribute is, mutatis mutandis, the continuity attribute of segmentation, and the analysis section is the interpretation layer organized with a standoff approach very similar to OpenDocument.

    > 4) I believe that I can map all other redlining scenarios into Akoma Ntoso with little problem. Related to this subject is our amendment practices which is page and line number oriented. That is how it is done in virtually all US jurisdictions and it's not likely to change soon - page and line numbers are here to stay. But, like redlining, we have solved this problem without affecting the schema and I would propose that we not encumber the schema with page and line number issues.

    Akoma Ntoso does in fact support page and line numbers, even if the presentation does not reflect them. That is to say, I make a link to line 23 of page 12, and this points to the right place even if, as currently displayed, it is actually line 15 of page 10. The only problem is that you need to remember to mark up the ends of lines and ends of pages correctly, which you should do in any case if you care at all about line and page numbers.

    Ciao

    Fabio

    --
    Fabio Vitali
    Dept. of Computer Science
    Univ. of Bologna ITALY
    phone: +39 051 2094872
    e-mail: fabio@cs.unibo.it
    http://vitali.web.cs.unibo.it/

    Tiger got to hunt, bird got to fly,
    Man got to sit and wonder "Why, why, why?"
    Tiger got to sleep, bird got to land,
    Man got to tell himself he understand.
    Kurt Vonnegut (1922-2007), "Cat's cradle"

    Attachment: JNDiff_TWEB.pdf
    Description: Adobe PDF document
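
The "circuitous" rebuilding of the old version that the span trick requires can be sketched in Python. This is a simplified reading, not Akoma Ntoso's actual semantics: it assumes that "period" names the version in which an element first appears, that periods are plain labels with a known order, and that every <p> and <span> carries one. The function name `paragraphs_at` is invented for the example.

```python
import xml.etree.ElementTree as ET


def paragraphs_at(xml_text, period, order):
    """Rebuild the paragraph texts as they stood at the given period."""
    rank = {name: i for i, name in enumerate(order)}
    cutoff = rank[period]
    result = []
    for p in ET.fromstring(xml_text).iter("p"):
        if rank[p.get("period")] <= cutoff:
            # The paragraph already existed: keep all of its text.
            result.append("".join(p.itertext()).strip())
            continue
        # The container is newer than the requested version, so only the
        # older spans inside it survive, re-attached to the previous paragraph.
        for span in p.iter("span"):
            if rank[span.get("period")] <= cutoff:
                text = "".join(span.itertext()).strip()
                if result:
                    result[-1] = result[-1] + " " + text
                else:
                    result.append(text)
    return result


doc = (
    '<body><p period="p1">Some text always belonging to the first paragraph.</p>'
    '<p period="p2"><span period="p1">This text used to belong to the first '
    "para but is now in the second one.</span></p></body>"
)
print(paragraphs_at(doc, "p1", ["p1", "p2"]))
print(paragraphs_at(doc, "p2", ["p1", "p2"]))
```

The older view cannot be produced by filtering alone: the span's text has to be moved back into the preceding paragraph, which is the extra step the message warns about.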

    Attachment(s)

    pdf
    JNDiff_TWEB.pdf   1.78 MB 1 version


  • 6.  RE: [legaldocml] Redlining Discussion Slides and Background Info

    Posted 11-14-2012 14:05




    Hi guys,
     
    So from what Fabio says it seems that the envisaged solution to the complex edits issue is to use segmentation through the use of @period. One of the most complicated edits that I know of in UK legislation involves overlapping passive modifications (these are shown cumulatively) that are indicated within documents using square brackets.

    <p>
      Some text [start of pm1.....[start of pm2..... some more text .....end of pm1].....end of pm2] some text.
    </p>

    Using segmentation in this scenario is perfectly possible using nested <ins> elements, but the structure starts to get fairly messy. If the same scenario occurred across sections, it would get even messier.

    <section>
      <num>1</num>
      <p>
        Some text [start of pm1.....[start of pm2..... some more text.
      </p>
    </section>
    <section>
      <num>2</num>
      <p>
        Some text .....end of pm1].....end of pm2] some more text.
      </p>
    </section>

    And it sounds as though Grant may well have even more complicated edits in Californian law. Would milestones not be a better option here? Or is this an option in the current schema, allowing someone to decide to implement this approach rather than segmentation?
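
The difference between the two representations can be shown with a short Python sketch that reads milestone markers and derives the segments they imply. The empty `<start ref="..."/>` and `<end ref="..."/>` markers are hypothetical names used only for this illustration; they are not taken from the Akoma Ntoso schema.

```python
import re

# Hypothetical empty milestone elements marking where a modification
# begins and ends; they never need to nest properly.
MILESTONE = re.compile(r'<(start|end) ref="(\w+)"/>')


def segments(text):
    """Split text at milestones and pair each piece with the active modifications."""
    parts = MILESTONE.split(text)
    active, out = [], []
    if parts[0]:
        out.append((parts[0], ()))
    # parts = [text, kind, ref, text, kind, ref, text, ...]
    for i in range(1, len(parts), 3):
        kind, ref, piece = parts[i], parts[i + 1], parts[i + 2]
        if kind == "start":
            active.append(ref)
        else:
            active.remove(ref)
        if piece:
            out.append((piece, tuple(active)))
    return out


marked = (
    'Some text <start ref="pm1"/>A <start ref="pm2"/>B '
    '<end ref="pm1"/>C <end ref="pm2"/>some text.'
)
for piece, mods in segments(marked):
    print(repr(piece), mods)
```

With milestones the source carries four flat markers, while the segmented form needs a separate wrapper for each of the three overlapping pieces, and more again once the range crosses a section boundary.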
     
     
    cheers
     
    Mark
     
     
     


    From: legaldocml@lists.oasis-open.org [mailto:legaldocml@lists.oasis-open.org]
    On Behalf Of Fabio Vitali
    Sent: 24 October 2012 17:18
    To: Grant Vergottini
    Cc: legaldocml@lists.oasis-open.org
    Subject: Re: [legaldocml] Redlining Discussion Slides and Background Info


     


    Dear Grant,


    > 1) Redlining shown within an active modification. For example, in California all amendments to codes are always complete sections. It is not permitted to amend just a word or sentence. The old section must be replaced with a new section, even when the change
    is just a few words. This makes it clearer to the legislators what is being proposed. To make it even clearer, they show the words within the section that are actually changing using strikeout and insert notation. But this is informative redlining and does
    not correspond directly to either active or passive redlining. When you see this notation inside a section being amended inside a bill, it does not necessarily mean that this is a change from a previous version of a bill. Instead it is information about how
    the amended section actually differs.

    I believe that the solution I proposed in my first post on the subject can stand:


    * the "period" attribute defined for all body elements should be used to specify the modifications to the text (actual modification). This is exactly what the legislator expects to change from one version of the modified document to the next. Please note that
    the modifications from one version of a bill to the next are often overlapping and independent of the modifications that the bill is trying to enforce on another enacted document, and should be considered separately.

    * the <ins> and <del> elements, defined as inline structures, should be used for the redlining proper. These are the elements that specify how the differences should be rendered in print so that the reader understands what is going on, and they should in any case satisfy the local
    tradition in terms of presentation.

    As such, since in California you always replace whole sections, you will need two sections in the text, one with the old period and the other with the new one. You will then insert <ins> and <del> elements in the text of the new one as you see fit, so that
    the representation of the text is satisfactory for your users. 
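
    As a rough illustration of this arrangement (a sketch only, not taken from the schema): the snippet below assumes that sections carry a plain "period" attribute with the made-up values "p1" and "p2", and that <ins> and <del> in the newer section are purely presentational. The helper names section_text and version are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the period values "p1"/"p2" and the sample wording
# are assumptions made for illustration only.
DOC = """
<body>
  <section period="p1"><p>The fee is ten dollars.</p></section>
  <section period="p2"><p>The fee is <del>ten</del><ins>twenty</ins> dollars.</p></section>
</body>
"""

def section_text(section, redline=False):
    """Return the text of one section.

    redline=False: drop <del> content and keep <ins> content (clean text).
    redline=True: keep both, wrapped in visible markers.
    """
    parts = []

    def walk(el):
        if el.tag == "del":
            if redline:
                parts.append("[-" + "".join(el.itertext()) + "-]")
        elif el.tag == "ins":
            text = "".join(el.itertext())
            parts.append("{+" + text + "+}" if redline else text)
        else:
            parts.append(el.text or "")
            for child in el:
                walk(child)
                parts.append(child.tail or "")

    walk(section)
    return "".join(parts).strip()

def version(root, period):
    """Clean text of every section that carries the given period."""
    return [section_text(s) for s in root.iter("section") if s.get("period") == period]

root = ET.fromstring(DOC)
print(version(root, "p1"))
print(version(root, "p2"))
print(section_text(root.findall("section")[1], redline=True))
```

    The point of the sketch is that the old and new sections are selected by period alone, while the <ins>/<del> markers only affect how the new section is displayed.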

    > 2) Merge/Split paragraphs. I believe there is now a mechanism to handle this. I saw mention of it a few weeks ago. I'm not clear of the details. In my case, the redlining notation used can be separated from the actual modification recorded as a modification.
    The redlining reflects what the change is according to local custom instead of literally containing the change. This is what Fabio said and I agree with him.

    I will not consider the case of the specification of the redline (i.e., presentation only): in this case, you are pretty much free to do as you prefer.


    As for the specification of the actual modification, I do not understand: you just said that you modify whole sections. In case you actually need to specify smaller-scale modifications, and they are the expression of a split and/or joined paragraph, I have
    a solution for you that is actually a trick: placing the old inline content inside a properly "period-ed" <span>, and placing the span within a paragraph with a newer period, as follows:


    <p period="p1">Some text always belonging to the first paragraph</p>
    <p period="p2"><span period="p1">This text used to belong to the first para but is now in the second one.</span></p>

    This is a dirty trick, because it places old content within a new container, and it requires understanding how to rebuild the old version (which is not obtained by simply removing the elements associated with the new period, but requires some circuitous embedding
    of content in separate containers). Still, it is better than throwing up your hands and rejecting any change that affects the hierarchical structure of the XML.
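
    To make the "circuitous" rebuild concrete, here is a minimal sketch of one possible reading of the trick. It assumes that periods are ordered p1 < p2, and that older inline content found inside a newer paragraph belonged to the preceding paragraph in the old version. Both assumptions, and the function names, are illustrative and not part of the schema.

```python
import xml.etree.ElementTree as ET

DOC = """
<body>
  <p period="p1">Some text always belonging to the first paragraph</p>
  <p period="p2"><span period="p1">This text used to belong to the first para but is now in the second one.</span></p>
</body>
"""

ORDER = ["p1", "p2"]  # assumed chronological order of the period identifiers

def exists_at(el, version):
    """An element exists at `version` if its period is not later than it."""
    return ORDER.index(el.get("period", ORDER[0])) <= ORDER.index(version)

def paragraphs_at(body, version):
    """Rebuild the list of paragraph texts as they stood at `version`."""
    out = []
    for p in body.findall("p"):
        if exists_at(p, version):
            out.append("".join(p.itertext()))
            continue
        # The container is newer than the requested version: keep only the
        # older inline content and glue it to the previous paragraph.
        old = "".join(
            "".join(s.itertext()) for s in p.iter("span") if exists_at(s, version)
        )
        if old:
            if out:
                out[-1] = out[-1] + " " + old
            else:
                out.append(old)
    return out

body = ET.fromstring(DOC)
print(paragraphs_at(body, "p2"))  # two paragraphs
print(paragraphs_at(body, "p1"))  # one paragraph holding both texts
```

    Note that the old version cannot be produced by filtering elements alone: the span content has to be moved into a different container, which is exactly why the trick is dirty.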


    > 3) Complex edits. This is a difficult subject. In California, it is quite common for edits to be made that violate the XML hierarchy. For example, two bill sections might merge into one by striking the second half of the first bill section and the first half
    of the second bill section. The drafter then assumes that the two remaining halves will make a single joined together section. This makes no sense from an XML point of view, but makes total sense if you're a drafter and only care about the paper.

    There is ample literature on overlapping markup, i.e. markup that needs to go beyond the hierarchical restrictions of XML. A good review of the literature can be found in


    * DeRose S. (2004), Markup Overlap: A Review and a Horse, Extreme Markup Languages 2004®, Montréal, Québec, August 2-6, 2004,
    http://pdf.aminer.org/000/201/061/markup_overlap_a_review_and_a_horse.pdf

    and more modestly, I've been working a little bit on this, too:

    * Marinelli P., Vitali F., Zacchiroli S. (2008), Towards the unification of formats for overlapping markup, New Review of Hypermedia and Multimedia, Volume 14, Issue 1, 2008, pages 57-94, DOI:10.1080/13614560802316145,
    http://www.tandfonline.com/doi/abs/10.1080/13614560802316145

    * Di Iorio A., Peroni S., Vitali F. (2010), Handling markup overlaps using OWL, in Proceedings of the 17th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2010), Lisbon, Portugal,
    http://palindrom.es/phd/wp-content/uploads/2010/09/earmark_ekaw2010_cr.pdf

    * Ciancarini P., Di Iorio A., Schirinzi M., Vitali F., Marchetti C., Detecting Changes on Semi-structured Data: A Natural and Extensible Approach, submitted to ACM Transactions on the Web, 2012, attached.


    The problem is that the editing operations are independent of and in contrast with the structure of the text. This is true in many fields, first and foremost word processor applications.


    > When we build the California system, we tried to argue that this practice had to change. We were told that we had to support one way or another. This was not negotiable. Our solution was to rely on XMetaL's change control mechanism which records deletions
    inside processing instructions and insertions bounded by two processing instructions and to write some very complex software to magically glue bill sections together when this sort of situation arose. And the scenario I outlined is the simple case - there
    are much more complex scenarios involving heading levels in more complex documents.

    > But the solution I used relied on the builtin change control mechanism that was built independent of our schema. With this approach we were able to handle all the possible scenarios without having to pollute the schema with lots and lots of "choices" to allow
    deletions and insertions to be placed anywhere in the document. I think that this solution was the only way I could have solved the problem. If I had tried to invent some other mechanism that avoided XMetaL's builtin processing instruction solution, the complexity
    of the problem would have made solving it impossible in an editing environment. So I don't think that trying to find a solution for this within Akoma Ntoso is a good idea. It is better to rely on the intrinsic (builtin) change control that the editor provides.
    The biggest drawback though is that the insert and delete markings become proprietary solutions for a particular editor using processing instructions that are only understood by that editor and the solution built around it.

    I personally do not like the Processing Instruction approach, which makes this kind of annotation very much dependent on the interpretation of a specific application. I prefer the purely XML approaches such as segmentation, milestones, or standoff. In fact,
    both Open Document (Open Office) and Open XML (Microsoft Word) use such approaches, and in many ways so does Akoma Ntoso.


    The "period" attribute is, mutatis mutandis, the continuity attribute of segmentation, and the analysis section is the interpretation layer organized with a standoff approach very similar to OpenDocument.  


    > 4) I believe that I can map all other redlining scenarios into Akoma Ntoso with little problem. Related to this subject is our amendment practices which is page and line number oriented. That is how it is done in virtually all US jurisdictions and it's not
    likely to change soon - page and line numbers are here to stay. But, like redlining, we have solved this problem without affecting the schema and I would propose that we not encumber the schema with page and line number issues.

    Akoma Ntoso does in fact support page and line numbers, even if the presentation does not reflect them. That is to say, I make a link to line 23 of page 12, and this points to the right place even if in the current rendering it actually is line 15 of page 10.


    The only problem is that you need to remember to mark up the ends of lines and the ends of pages correctly, which you should do in any case if you care at all about line and page numbers.
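
    As a sketch of how such a reference could be resolved, the snippet below assumes empty marker elements named "eol" and "eop" for the end of a line and the end of a page, and assumes that an end-of-page marker also closes the current line. These names and conventions are assumptions made for illustration; check them against the schema before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two printed pages of two lines each.
DOC = "<p>first line<eol/>second line<eop/>third line<eol/>fourth line</p>"

def tokens(el):
    """Yield text chunks and marker names in document order."""
    if el.tag in ("eol", "eop"):
        yield ("marker", el.tag)
        return
    if el.text:
        yield ("text", el.text)
    for child in el:
        yield from tokens(child)
        if child.tail:
            yield ("text", child.tail)

def index_lines(root):
    """Map (page, line) of the original print layout to the text of that line."""
    lines, page, line, buf = {}, 1, 1, []
    for kind, value in tokens(root):
        if kind == "text":
            buf.append(value)
            continue
        lines[(page, line)] = "".join(buf)
        buf = []
        if value == "eop":
            page, line = page + 1, 1  # assumed: a page break also ends the line
        else:
            line += 1
    lines[(page, line)] = "".join(buf)
    return lines

lines = index_lines(ET.fromstring(DOC))
print(lines[(1, 2)])
print(lines[(2, 1)])
```

    Because the lookup depends only on the markers, it gives the same answer however the text is reflowed on screen.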


    Ciao

    Fabio

    --

    Fabio Vitali                            Tiger got to hunt, bird got to fly,
    Dept. of Computer Science        Man got to sit and wonder "Why, why, why?'
    Univ. of Bologna  ITALY               Tiger got to sleep, bird got to land,
    phone:  +39 051 2094872              Man got to tell himself he understand.
    e-mail: fabio@cs.unibo.it         Kurt Vonnegut (1922-2007), "Cat's cradle"
    http://vitali.web.cs.unibo.it/












  • 7.  Re: [legaldocml] Redlining Discussion Slides and Background Info

    Posted 11-14-2012 15:00
    Dear Mark,

    > And it sounds as though Grant may well have even more complicated edits in Californian law. Would milestones not be a better option here? Or is this an option in the current Schema allowing someone to decide to implement this approach rather than segmentation?

    On the one hand, I am perfectly fine with milestones to describe separately the beginning and the end of ranges that overlap the basic structures of an XML document. On the other hand, milestones, while apparently easier on the eye of the reader of the XML source, have the problem that they do not capture containment (i.e., the content of the element must be found by determining everything that is after the starting milestone and before the ending milestone, which is considerably more complicated than juxtaposing the content of the individual segments in the segmentation scenario) and, most importantly, they do not allow discontinuous overlaps.

    Now in your example you have a situation of non-contiguous overlap:

    > <section>
    >   <num>1</num>
    >   <p>
    >     Some text [start of pm1.....[start of pm2..... some more text.
    >   </p>
    > </section>
    > <section>
    >   <num>2</num>
    >   <p>
    >     Some text .....end of pm1].....end of pm2] some more text.
    >   </p>
    > </section>

    In this case, element <num>2</num> would end up being part of the content of the milestoned element, which most probably is neither intended nor appropriate. Using segmentation, on the other hand, would allow you to specify that the <num> element belongs to neither fragment, and belongs to the basic structure of the document, which I believe is the correct approach.

    Ciao

    Fabio

    --

    Fabio Vitali                            Tiger got to hunt, bird got to fly,
    Dept. of Computer Science        Man got to sit and wonder "Why, why, why?'
    Univ. of Bologna  ITALY               Tiger got to sleep, bird got to land,
    phone:  +39 051 2094872              Man got to tell himself he understand.
    e-mail: fabio@cs.unibo.it         Kurt Vonnegut (1922-2007), "Cat's cradle"
    http://vitali.web.cs.unibo.it/
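
    The difference between the two approaches can be sketched as follows. The element and attribute names used here (start, end, seg, id, ref, part) are invented for the illustration and are not claimed to be schema names; the sample text is likewise made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical milestone markup: empty start/end markers around a range
# that crosses a section boundary.
MILESTONES = (
    "<body>"
    "<section><num>1</num><p>Some text <start id='pm1'/>struck text A. </p></section>"
    "<section><num>2</num><p>Struck text B<end ref='pm1'/> some more text.</p></section>"
    "</body>"
)

# Hypothetical segmentation markup: the same range as two linked fragments.
SEGMENTS = (
    "<body>"
    "<section><num>1</num><p>Some text <seg part='pm1'>struck text A. </seg></p></section>"
    "<section><num>2</num><p><seg part='pm1'>Struck text B</seg> some more text.</p></section>"
    "</body>"
)

def milestone_range(root, name):
    """Collect all text between the start and end milestones, in document order."""
    out, inside = [], False

    def walk(el):
        nonlocal inside
        if el.tag == "start" and el.get("id") == name:
            inside = True
        elif el.tag == "end" and el.get("ref") == name:
            inside = False
        if inside and el.text:
            out.append(el.text)
        for child in el:
            walk(child)
            if inside and child.tail:
                out.append(child.tail)

    walk(root)
    return "".join(out)

def segment_range(root, name):
    """Juxtapose the content of every segment that belongs to `name`."""
    return "".join(
        "".join(s.itertext()) for s in root.iter("seg") if s.get("part") == name
    )

print(milestone_range(ET.fromstring(MILESTONES), "pm1"))  # the "2" of <num> leaks in
print(segment_range(ET.fromstring(SEGMENTS), "pm1"))      # <num> stays outside
```

    With milestones, the content of <num>2</num> is swept into the range simply because it sits between the two markers; with segments, it is left to the basic structure of the document.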