Hi guys,
So from what Fabio says, it seems that the envisaged solution to the complex edits issue is segmentation by means of @period. One of the most complicated kinds of edit
that I know of in UK legislation involves overlapping passive modifications (these are shown cumulatively), which are indicated within documents using square brackets.
<p>
Some text [start of pm1.....[start of pm2..... some more text .....end of pm1].....end of pm2] some text.
</p>
Segmentation is perfectly possible in this scenario using nested <ins> elements, but the structure starts to get fairly messy. If the same scenario occurred across
sections, it would get even messier.
<section>
<num>1</num>
<p>
Some text [start of pm1.....[start of pm2..... some more text.
</p>
</section>
<section>
<num>2</num>
<p>
Some text .....end of pm1].....end of pm2] some more text.
</p>
</section>
And it sounds as though Grant may well have even more complicated edits in Californian law. Would milestones not be a better option here? Or does the current
schema already allow this, so that someone could choose to implement milestones rather than segmentation?
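To make the milestone idea concrete, here is a rough Python sketch of how a processor might recover the text covered by each overlapping modification from empty start/end markers. The element names modStart and modEnd and the ref attribute are purely hypothetical (they are not taken from the Akoma Ntoso schema), and the placeholder words aaa to ddd stand in for the dotted text in the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical milestone markup: <modStart ref="..."/> and <modEnd ref="..."/>
# are illustrative names only, not elements of the Akoma Ntoso schema.
DOC = """
<body>
  <section><num>1</num>
    <p>Some text <modStart ref="pm1"/>aaa <modStart ref="pm2"/>bbb some more text.</p>
  </section>
  <section><num>2</num>
    <p>Some text ccc<modEnd ref="pm1"/> ddd<modEnd ref="pm2"/> some more text.</p>
  </section>
</body>
"""

# Assumption: section numbers are left out of the covered text for readability.
SKIP = {"num"}


def modification_texts(root):
    """Return {ref: text lying between that modification's start and end markers}."""
    open_refs = []  # modifications currently open, in the order they started
    covered = {}    # ref -> list of text fragments seen while it was open

    def add(text):
        if text:
            for ref in open_refs:
                covered[ref].append(text)

    def walk(elem):
        if elem.tag in SKIP:
            return
        if elem.tag == "modStart":
            open_refs.append(elem.get("ref"))
            covered.setdefault(elem.get("ref"), [])
        elif elem.tag == "modEnd":
            open_refs.remove(elem.get("ref"))
        add(elem.text)
        for child in elem:
            walk(child)
            add(child.tail)  # text following the child belongs to the parent

    walk(root)
    # Normalise whitespace so that fragments from different sections join cleanly.
    return {ref: " ".join("".join(parts).split()) for ref, parts in covered.items()}


spans = modification_texts(ET.fromstring(DOC))
for ref, text in spans.items():
    print(ref, "->", text)
```

The point of the sketch is that the markers do not have to nest or respect section boundaries: the overlap between pm1 and pm2 is handled by a simple list of open modifications rather than by the element hierarchy.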
cheers
Mark
From: legaldocml@lists.oasis-open.org [mailto:legaldocml@lists.oasis-open.org] On Behalf Of Fabio Vitali
Sent: 24 October 2012 17:18
To: Grant Vergottini
Cc: legaldocml@lists.oasis-open.org
Subject: Re: [legaldocml] Redlining Discussion Slides and Background Info
Dear Grant,
> 1) Redlining shown within an active modification. For example, in California all amendments to codes are always complete sections. It is not permitted to amend just a word or sentence. The old section must be replaced with a new section, even when the change is just a few words. This makes it clearer to the legislators what is being proposed. To make it even clearer, they show the words within the section that are actually changing using strikeout and insert notation. But this is informative redlining and does not correspond directly to either active or passive redlining. When you see this notation inside a section being amended inside a bill, it does not necessarily mean that this is a change from a previous version of a bill. Instead it is information about how the amended section actually differs.
I believe that the solution I proposed in my first post on the subject can stand:
* The "period" attribute, defined for all body elements, should be used to specify the actual modifications to the text. This is exactly what the legislator expects to change from one version of the modified document to the next. Please note that the modifications from one version of a bill to the next are often overlapping and independent of the modifications that the bill is trying to enforce on another enacted document, and should be considered separately.
* The <ins> and <del> elements, defined as inline structures, should be used for the redlining proper. These are the elements that specify how the differences should be rendered in print so that the reader understands what is going on, and they should in any case satisfy the local tradition in terms of presentation.
As such, since in California you always replace whole sections, you will need two sections in the text, one with the old period and the other with the new one. You will then insert <ins> and <del> elements in the text of the new one as you see fit, so that
the representation of the text is satisfactory for your users.
> 2) Merge/Split paragraphs. I believe there is now a mechanism to handle this. I saw mention of it a few weeks ago. I'm not clear on the details. In my case, the redlining notation used can be separated from the actual modification recorded as a modification. The redlining reflects what the change is according to local custom instead of literally containing the change. This is what Fabio said and I agree with him.
I will not consider the specification of the redline (i.e., presentation only): there you are pretty much free to do as you prefer.
As for the specification of the actual modification, I do not understand: you just said that you modify whole sections. In case you actually need to specify smaller-scale modifications, and they are the expression of a split and/or joined paragraph, I have a solution for you that is actually a trick: place the old inline content inside a properly "period-ed" <span>, and place the span within a paragraph with a newer period, as follows:
<p period="p1">Some text always belonging to the first paragraph</p>
<p period="p2"><span period="p1">This text used to belong to the first para but is now in the second one.</span></p>
This is a dirty trick, because it places old content within a new container, and it requires understanding how to rebuild the old version (since that is not obtained by simply removing the elements associated with the new period, but requires some circuitous embedding of content in separate containers). Still, it is better than throwing up your hands and rejecting any change that affects the hierarchical structure of the XML.
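To show what "rebuilding the old version" could involve, here is a minimal Python sketch over the two-paragraph example. It rests on assumptions that are mine rather than anything stated in the schema: that p1 is the earlier period and p2 the later one, that a span carrying the old period inside a new paragraph originally belonged to the preceding old paragraph, and that the fragment is wrapped in a hypothetical <body> element so that it parses as a single document.

```python
import xml.etree.ElementTree as ET

# The example from the message, wrapped in a hypothetical <body> element.
DOC = """<body>
<p period="p1">Some text always belonging to the first paragraph</p>
<p period="p2"><span period="p1">This text used to belong to the first para but is now in the second one.</span></p>
</body>"""

OLD, NEW = "p1", "p2"  # assumption: p1 is the earlier period, p2 the later one


def new_version(root):
    """Newer version: every <p> is a paragraph; inner <span>s are just text."""
    return ["".join(p.itertext()).strip() for p in root.findall("p")]


def old_version(root):
    """Older version: drop containers introduced in NEW, but re-attach any
    OLD-period content found inside them to the preceding OLD paragraph."""
    paragraphs = []
    for p in root.findall("p"):
        if p.get("period") == OLD:
            paragraphs.append("".join(p.itertext()).strip())
            continue
        # A NEW container: only its OLD-period spans existed before the change.
        for span in p.findall("span"):
            if span.get("period") == OLD:
                text = "".join(span.itertext()).strip()
                if paragraphs:
                    paragraphs[-1] += " " + text
                else:
                    paragraphs.append(text)
    return paragraphs


root = ET.fromstring(DOC)
print(new_version(root))  # two paragraphs
print(old_version(root))  # one joined paragraph
```

The extra branch in old_version is exactly the "circuitous" part: simply filtering out elements with the newer period would lose the moved text instead of returning it to its original paragraph.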
> 3) Complex edits. This is a difficult subject. In California, it is quite common for edits to be made that violate the XML hierarchy. For example, two bill sections might merge into one by striking the second half of the first bill section and the first half of the second bill section. The drafter then assumes that the two remaining halves will make a single joined together section. This makes no sense from an XML point of view, but makes total sense if you're a drafter and only care about the paper.
There is ample literature on overlapping markup, i.e. markup that needs to go beyond the hierarchical restrictions of XML. A good review of the literature can be found in
* DeRose S. (2004), Markup Overlap: A Review and a Horse, Extreme Markup Languages 2004®, Montréal, Québec, August 2-6, 2004, http://pdf.aminer.org/000/201/061/markup_overlap_a_review_and_a_horse.pdf
and, more modestly, I've been working a little bit on this, too:
* Marinelli P., Vitali F., Zacchiroli S. (2008), Towards the unification of formats for overlapping markup, New Review of Hypermedia and Multimedia, Volume 14, Issue 1, 2008, pages 57-94, DOI:10.1080/13614560802316145, http://www.tandfonline.com/doi/abs/10.1080/13614560802316145
* Di Iorio A., Peroni S., Vitali F. (2010), Handling markup overlaps using OWL, in Proceedings of the 17th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2010), Lisbon, Portugal, http://palindrom.es/phd/wp-content/uploads/2010/09/earmark_ekaw2010_cr.pdf
* Ciancarini P., Di Iorio A., Schirinzi M., Vitali F., Marchetti C. (2012), Detecting Changes on Semi-structured Data: A Natural and Extensible Approach, submitted to ACM Transactions on the Web, 2012, attached.
The problem is that the editing operations are independent of and in contrast with the structure of the text. This is true in many fields, first and foremost word processor applications.
> When we built the California system, we tried to argue that this practice had to change. We were told that we had to support it one way or another. This was not negotiable. Our solution was to rely on XMetaL's change control mechanism, which records deletions inside processing instructions and insertions bounded by two processing instructions, and to write some very complex software to magically glue bill sections together when this sort of situation arose. And the scenario I outlined is the simple case - there are much more complex scenarios involving heading levels in more complex documents.
>
> But the solution I used relied on the builtin change control mechanism that was built independent of our schema. With this approach we were able to handle all the possible scenarios without having to pollute the schema with lots and lots of "choices" to allow deletions and insertions to be placed anywhere in the document. I think that this solution was the only way I could have solved the problem. If I had tried to invent some other mechanism that avoided XMetaL's builtin processing instruction solution, the complexity of the problem would have made solving it impossible in an editing environment. So I don't think that trying to find a solution for this within Akoma Ntoso is a good idea. It is better to rely on the intrinsic (builtin) change control that the editor provides. The biggest drawback though is that the insert and delete markings become proprietary solutions for a particular editor using processing instructions that are only understood by that editor and the solution built around it.
I personally do not like the processing instruction approach, which makes this kind of annotation very much dependent on the interpretation of a specific application. I prefer the purely XML approaches such as segmentation, milestones, or standoff. In fact,
both Open Document (Open Office) and Open XML (Microsoft Word) use approaches of this kind, and in many ways so does Akoma Ntoso.
The "period" attribute is, mutatis mutandis, the continuity attribute of segmentation, and the analysis section is the interpretation layer organized with a standoff approach very similar to OpenDocument.
> 4) I believe that I can map all other redlining scenarios into Akoma Ntoso with little problem. Related to this subject is our amendment practice, which is page and line number oriented. That is how it is done in virtually all US jurisdictions and it's not likely to change soon - page and line numbers are here to stay. But, like redlining, we have solved this problem without affecting the schema and I would propose that we not encumber the schema with page and line number issues.
Akoma Ntoso does in fact support page and line numbers, even if the presentation does not reflect them. That is to say, I can make a link to line 23 of page 12, and it points to the right place even if, in the current rendering, that place is actually line 15 of page 10.
The only problem is that you need to remember to mark up the ends of lines and ends of pages correctly, which you should do in any case if you care at all about line and page numbers.
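As an illustration of why the markers are enough, here is a small Python sketch that indexes text by (page, line) purely from end-of-line and end-of-page markers, independently of how the text is later rendered. It assumes empty marker elements named eol and eop placed inline in the text; the sample paragraph and the function name index_lines are made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample: <eol/> closes a line, <eop/> closes a page of the original print.
DOC = (
    "<p>first line of page one<eol/>second line of page one<eol/><eop/>"
    "first line of page two<eol/>second line<eol/></p>"
)


def index_lines(root):
    """Map (page, line) -> text, both counted from 1, using eol/eop markers."""
    lines = {}
    page, line = 1, 1
    buf = []  # text collected since the last marker

    def take():
        text = "".join(buf).strip()
        buf.clear()
        return text

    def walk(elem):
        nonlocal page, line
        if elem.tag == "eol":
            lines[(page, line)] = take()  # blank lines still count as lines
            line += 1
        elif elem.tag == "eop":
            leftover = take()             # page ended without a final eol
            if leftover:
                lines[(page, line)] = leftover
            page, line = page + 1, 1
        if elem.text:
            buf.append(elem.text)
        for child in elem:
            walk(child)
            if child.tail:
                buf.append(child.tail)

    walk(root)
    leftover = take()
    if leftover:
        lines[(page, line)] = leftover
    return lines


lines = index_lines(ET.fromstring(DOC))
print(lines[(2, 1)])
```

A reference to a page and line can then be resolved against this index, wherever that text happens to fall in the current layout.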
Ciao
Fabio
--
Fabio Vitali
Dept. of Computer Science
Univ. of Bologna ITALY
phone: +39 051 2094872
e-mail: fabio@cs.unibo.it
http://vitali.web.cs.unibo.it/

Tiger got to hunt, bird got to fly,
Man got to sit and wonder "Why, why, why?"
Tiger got to sleep, bird got to land,
Man got to tell himself he understand.
Kurt Vonnegut (1922-2007), "Cat's cradle"