OASIS Open Document Format for Office Applications (OpenDocument) TC

  • 1.  Formal Request: ODF 1.2 Document Processing Model Proposal

    Posted 12-11-2008 07:56
    I formally request consideration of the proposal "ODF 1.2 Document Processing Model" for ODF 1.2
    
    The new proposal document with an incomplete sketch is on the wiki at 
    http://wiki.oasis-open.org/office/ODF_1.2_Document_Processing_Model
    
    My promise is to have this proposal complete enough on the wiki for serious discussion to commence there by mid-January 2009.  It is my intention that work on this proposal could continue alongside editorial work for review of the working drafts, resolution of editorial items, and review concerning applicability of public comments to earlier versions of ODF.  Whether the proposal can be treated in this manner can be determined as part of laying out the end-game beyond the completion of feature-related proposals.  Also, whether or not the proposal is enacted explicitly in the documents, I see value in its use as a working instrument for determining that features have been specified completely enough.
    
    The brief description of the proposal is:
    
    "Reflect in the ODF 1.2 specification enough of a document model, its annotation and decoration, and the manipulation and processing of it  that the semantics for an ODF 1.2 document structure can be expressed.
    
    "Define the processing models and related decorations and annotations of the document model that are directed to particular processing  cases, such as document creation, visual presentation, final-form media presentation, interactive use, document manipulation, and  interactive manipulation."
    
    I think the purpose of this proposal is directly stated in the last item of the approach notes:
    
    I propose this sort of approach, even if articulated in a working agreement within the ODF-TC and not formally documented, as a valuable tool for identifying questions that need to be answered in the ODF specification and then ensuring that the specification does indeed provide answers to those questions.
    
     - Dennis
    
    Dennis E. Hamilton
    ------------------
    NuovoDoc: Design for Document System Interoperability 
    mailto:Dennis.Hamilton@acm.org | gsm:+1-206.779.9430 
    http://NuovoDoc.com http://ODMA.info/dev/ http://nfoWorks.org 
    
    


  • 2.  Re: [office] Formal Request: ODF 1.2 Document Processing Model Proposal

    Posted 12-11-2008 08:53
    Dennis,
    
    My puppy woke me up (it is raining in Covington) so I decided to catch 
    the early email. ;-)
    
    This is an interesting proposal but I could not decide if you were 
    proposing:
    
    1) A model and additional text about that model to be added to ODF 1.2 or
    
    2) A model that would be used as a heuristic in evaluating the 
    completeness/incompleteness of ODF more generally?
    
    Moreover, I am not entirely sure that we need to go towards processing 
    models, although they are a critical step in the chain of events that 
    lead to a "document" in the sense of something that we view and share 
    with others.
    
    The reason why I make that last statement is that I prefer to think of 
    ODF as a format that stores information that obviously has an implied 
    model of a document, 


  • 3.  RE: [office] Formal Request: ODF 1.2 Document Processing Model Proposal

    Posted 12-11-2008 19:20
    Patrick,
    
    There are already tacit indications of processing assumptions wherever the specification mentions user interaction or suggests behaviors (e.g., default properties recorded for use when a new table row is introduced).  The problem is that these are not grounded in anything.  (Maybe we should remove them.  That is valuable to discuss.)
    
    Processing Model might be the wrong term.  I was looking for a single noun phrase.  Offhand, I don't think I would be averse to Document Model and Semantics, but I am wary of confusion with other uses of Document Model in our field.  That would be something to hash out.
    
    However, it is clear that certain decorations that are provided in a document structure are specifically intended to guide particular kinds of processing behavior -- that is in their semantics and sometimes it is their only semantics.  This raises conformance issues and also implications of what kinds of processing are being presumed.  I think some sort of explicit treatment of that is called for.  I don't mean to presume that ODF should impose a processing model (or a DOM), but that classes of processing scenario might need to be recognized to identify what the particular markup is generally directed toward.  Some nomenclature normalization around this may also be essential for conformance definitions that follow the current OASIS model.
    
    I am not sure what I am doing with regard to your (1-2) which is why it is vague.  That is intentional in the current sketch.  It may be too much to incorporate such a model in full-fledged form in ODF 1.2, but I do believe that (good term, thanks) a heuristic should be adopted for our being consistent in the specification.  I was thinking of that as a workable minimum.
    
    I also think that a nomenclature section of the ISO variety is called for either way.  And we need to work with careful definitions and use the terms consistently (e.g., using "XML document" when we are referring to any root-element subdocument of an ODF document structure, as distinct from the ODF document [structure] as a whole, with its variety of other parts and their packagings).
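    
    As an illustration of that distinction, a minimal Python sketch follows. It assumes the usual ZIP packaging of an ODF document; the sample package is a hypothetical stand-in built in memory, not a complete or valid ODF file, and the function names are invented for this example. It reports which package parts parse as XML documents in their own right and what their root elements are.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF "office" vocabulary.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def build_sample_package() -> bytes:
    """Build a tiny, hypothetical ODF-like package in memory.

    This is only a stand-in for illustration; it is not a complete ODF file.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr(
            "content.xml",
            f'<office:document-content xmlns:office="{OFFICE_NS}"/>',
        )
        zf.writestr(
            "styles.xml",
            f'<office:document-styles xmlns:office="{OFFICE_NS}"/>',
        )
        zf.writestr("Pictures/logo.png", b"\x89PNG placeholder bytes")
    return buf.getvalue()


def xml_documents_in_package(package: bytes) -> dict[str, str]:
    """Map each part that parses as an XML document to its root element tag."""
    roots: dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        for name in zf.namelist():
            try:
                root = ET.fromstring(zf.read(name))
            except ET.ParseError:
                continue  # not an XML document (e.g. mimetype, images)
            roots[name] = root.tag
    return roots


if __name__ == "__main__":
    for part, tag in xml_documents_in_package(build_sample_package()).items():
        print(part, "->", tag)
```

    The package as a whole is one artifact; each XML part inside it is a separate "XML document" with its own root element, while parts such as the mimetype entry and images are not XML documents at all.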
    
    Finally, I agree there are lots of ways to "process" ODF document structures (I am becoming fond of that term), and I don't propose ruling any of them out.  However, there are clearly presumed scenarios around what it all means when the ODF document representation is turned into a perceivable document for human use in office document applications and that the interpretation is in conformance with the semantics for ODF documents.  The challenge is figuring out how any normative language about that occurs in the specification, if at all, and then what it means to say that some (class of processing) is implemented in a conformant way.  
    
    I don't see how we can get by with no semantics at all for the markup (and I don't think you are suggesting that).  The ODF appeals to other standards by reference would seem to bring with them semantics from those specifications in any case.  Or maybe not. With the substitution of OASIS namespaces, I am not entirely clear what is incorporated and what is not. 
    
    I suppose that the OIC TC can go farther in this direction than the ODF specification might, but if there is no meaningful conformance in the ODF spec, it doesn't give anyone much to go on when it comes to assessing conformance of products, proposing ways to improve assurance of interchange and interoperable use, etc. 
    
    Thanks for your response.  I value this conversation with you.
    
     - Dennis
    
    


  • 4.  Re: [office] Formal Request: ODF 1.2 Document Processing Model Proposal

    Posted 12-11-2008 20:56
    Dennis,
    
    Dennis E. Hamilton wrote:
    > Patrick,
    >
    > There are already tacit indications of processing assumptions wherever the specification mentions user interaction or suggests behaviors (e.g., default properties recorded for use when a new table row is introduced).  The problem is that these are not grounded in anything.  (Maybe we should remove them.  That is valuable to discuss.)
    >
    >   
    Ah, well, yes, they are grounded, just not in the explicit text of ODF 
    1.2. ;-)
    
    Sorry!
    
    One term we could use would be text model since I suspect most of 
    the semantics we define (at least in the presentation areas) are based 
    on an implicit model of texts.
    
    For example, take footnotes and endnotes. While it is true that we 
    define styles for the placement of such notes, we are 
    really relying upon an implicit notion of what we mean by "footnote" and 
    "endnote."
    
    Even though I can point to various attributes and styles that total up 
    to a pretty fair definition of either one, such as a separator line, 
    numbering that matches a location in the main text, location (bottom of 
    page, end of section, end of document), etc., that is only because I 
    know where to look and can piece together such a definition if pressed 
    by someone who wants one.
    
    Having said all that, I have been at the "text" (as distinguished from 
    "document" modeling in the more limited markup sense) modeling game for 
    a long time and while I can see real value in reaching common 
    understandings of some terms and perhaps even defining a handful of 
    them, I really don't think the pursuit of a solid model for texts is a 
    useful enterprise. The more we press for something definite in one part, 
    the more likely something will poke out on the other side, whether we 
    see it or not.
    > Processing Model might be the wrong term.  I was looking for a single noun phrase.  Offhand, I don't think I would be averse to Document Model and Semantics, but I am wary of confusion with other uses of Document Model in our field.  That would be something to hash out.
    >
    >   
    My suggestion is "text model."
    > However, it is clear that certain decorations that are provided in a document structure are specifically intended to guide particular kinds of processing behavior -- that is in their semantics and sometimes it is their only semantics.  This raises conformance issues and also implications of what kinds of processing are being presumed.  I think some sort of explicit treatment of that is called for.  I don't mean to presume that ODF should impose a processing model (or a DOM), but that classes of processing scenario might need to be recognized to identify what the particular markup is generally directed toward.  Some nomenclature normalization around this may also be essential for conformance definitions that follow the current OASIS model.
    >
    >   
    Well, but I see conformance, particularly with semantics, as being 
    incremental at best. There are some bright line areas, such as font 
    size, which has a commonly accepted definition. There are a lot of gray 
    areas as well.
    
    Take our use of fo:widows for example. Sure, we can all count lines; that 
    looks like a bright line test, doesn't it? But recall that we don't 
    define a page geometry or word spacing, line breaking, etc. algorithms. 
    My bright line just got a lot fuzzier.
    
    Realize that I don't disagree with making semantics more explicit. The 
    only thing that has occupied more time than simply restructuring the 
    text (no small task) has been to keep asking when the semantics were 
    unclear to me. And to keep asking until I could understand what was 
    being said. The Sun team and others have been very patient with my 
    questions.
    
    I suppose my caution is to start such a quest with the understanding 
    that we really don't have a firm grasp on the semantics of texts and 
    that the more we talk about what understandings we do have, the more 
    precise they will become. But it is always an iterative process and not 
    one that ever finishes. I would like to think that ODF 1.2 is going to 
    be another step towards greater clarity in some semantics but realize 
    that it will probably have made others worse. Perhaps greater clarity is 
    too bold a claim, I would be happy with a different clarity! ;-)
    
    > I am not sure what I am doing with regard to your (1-2) which is why it is vague.  That is intentional in the current sketch.  It may be too much to incorporate such a model in full-fledged form in ODF 1.2, but I do believe that (good term, thanks) a heuristic should be adopted for our being consistent in the specification.  I was thinking of that as a workable minimum.
    >
    > I also think that a nomenclature section of the ISO variety is called for either way.  And we need to work with careful definitions and use the terms consistently (e.g., using "XML document" when we are referring to any root-element subdocument of an ODF document structure, as distinct from the ODF document [structure] as a whole, with its variety of other parts and their packagings).
    >   
    Oooooh! Bite your tongue! ;-)
    
    I really dislike nomenclature clauses, which I note are optional in the 
    ISO Directives.
    
    The reason is we have to *repeat* the definitions that already occur 
    elsewhere in the standard and those definitions in the nomenclature 
    clause *appear without context,* which can make defining some of them 
    quite difficult.
    
    Violates the first rule: Never repeat a definition (because it appears 
    once in the nomenclature clause and where we use it).
    
    Violates the second rule: Never define a definition differently. That is 
    just fraught with peril for mistakes.
    
    Perhaps an unfair example:
    
    I want to define "separator."
    
    Hmmm, how about: "A separator is a character that is displayed in lieu 
    of a line number"
    
    Relying on: "A separator is text that is displayed instead of a line 
    number for lines where no number is displayed." 
    


  • 5.  RE: [office] Formal Request: ODF 1.2 Document Processing Model Proposal

    Posted 12-12-2008 19:15
    Patrick,
    
    Some follow-up comments:
    
    1. I'm not thrilled by text model.  It seems too narrow and is only part of the deal (consider tables, drawings, presentations, etc., where a text model is just part of the deal).  I would rather go to document model than text model.  
    
    Question: In your passage '"text" (as distinguished from "document" modeling in the more limited markup sense) modeling', are you thinking of document as more limited?  I intended ODF Document Structure for the (logical) artifact (markup and such) level, as opposed to the abstraction that is somehow conveyed in the document structure.  I mean semantics up-level from XML and markup "semantics."  Maybe that is where we are looking cross-eyed at the same things.  (Another demonstration of why a nomenclature section is important [;<).
    
    2. I agree this is an incremental activity and the idea is not to attempt to define the world.  However, I think we stumble over tacitness and non-groundedness in the specification in ways that are detrimental (as in the tendency to use the spec as a way to fix what is arguably an implementation bug in the presentation:pages situation).  I think as we bear down and review the specification drafts and attempt to address all of the notes you have in them we will demonstrate the need to have some level of this in order to get ourselves out of the woods.
    
    3. As much as my eyes glaze over when I see a nomenclature section in an ISO specification, I think they are indispensable and even critical in the case of ODF.  First, we borrow wholesale from other specifications and they have their technical language (e.g., what an XML document is versus what an ODF document [representation] might be said to be) that is confusing in those parts of ODF where the notions appear to be commingled without discrimination.  Also, the way we talk about notions and semantics from other specifications is troublesome and we need to get clear about that.  In this respect, we do need to be clear what terms of art we are using (and also introducing ourselves).  In that regard, it seems to me:
     - we should make major conceptual definitions in a nomenclature section (and it should be allowed to reference sections where there is more depth and context if needed).  There is nothing wrong with defining many things local to their (confined) use.  ISO documents do not put every specialized term in the nomenclature, but we need to have something that calls forth consistency in major cases.
     - we should make it clear when we are using the notion in the nomenclature and have some way to not confuse that with informal use of phrases having some of the same words (or teach ourselves to use different words).
     - some problems are simply that a common noun is used without sufficient modifiers to maintain context and signal its technical use.  It is a bit wordy to always provide adjectives, but that is the price (along with looking for alternative wordings and terms that minimize the confusion).  The use of "separator" is an example.  
     - for OASIS, our documents (in both ODF and PDF) are really hypertexts and we can take advantage of that to make the relationship between the nomenclature/glossary section and a usage very clear.  (I am not proposing that we do that for 1.2, but it seems like a good thing to keep our eye on.)  (Apparently ISO or ITTF is not consistent in preserving that sort of thing in their PDFs, which makes using their versions of IS 29500 parts a real bitch, although IS 26300 does include the TOC linking from the OASIS 1.0ed2 cs1 version.)
    
    Maybe we just have to see how it goes?
    
     - Dennis
    
    


  • 6.  Re: [office] Formal Request: ODF 1.2 Document Processing Model Proposal

    Posted 12-12-2008 20:27
    Dennis,
    
    Dennis E. Hamilton wrote:
    > Patrick,
    >
    > Some follow-up comments:
    >
    > 1. I'm not thrilled by text model.  It seems too narrow and is only part of the deal (consider tables, drawings, presentations, etc., where a text model is just part of the deal).  I would rather go to document model than text model.  
    >
    > Question: In your passage '"text" (as distinguished from "document" modeling in the more limited markup sense) modeling', are you thinking of document as more limited?  I intended ODF Document Structure for the (logical) artifact (markup and such) level, as opposed to the abstraction that is somehow conveyed in the document structure.  I mean semantics up-level from XML and markup "semantics."  Maybe that is where we are looking cross-eyed at the same things.  (Another demonstration of why a nomenclature section is important [;<).
    >
    >   
    Yes, nomenclature is very important. In my world view, "document" is far 
    more limited than "text." Text includes everything from cuneiform 
    tablets to the illuminated chapters of the Book of Going Forth By Day 
    (mis-named the Egyptian Book of the Dead), the carpet pages of the 
    Leningrad Codex as well as the more mundane things like presentations 
    and other modern ephemera. ;-) OK, ok, I will concede that the 
    multi-dimensional aspects of tables, drawings, etc. make them 
    interesting as well. But I would subsume all of them under "Text."
    > 2. I agree this is an incremental activity and the idea is not to attempt to define the world.  However, I think we stumble over tacitness and non-groundedness in the specification in ways that are detrimental (as in the tendency to use the spec as a way to fix what is arguably an implementation bug in the presentation:pages situation).  I think as we bear down and review the specification drafts and attempt to address all of the notes you have in them we will demonstrate the need to have some level of this in order to get ourselves out of the woods.
    >
    >   
    Probably the best way to pin down where it is needed. I guess I was 
    reacting to the notion that we would create yet another vocabulary, 
    which means we have to imagine the model. What you suggest is that we 
    name it as we come across it, if I am reading you correctly. I think 
    that works.
    > 3. As much as my eyes glaze over when I see a nomenclature section in an ISO specification, I think they are indispensable and even critical in the case of ODF.  First, we borrow wholesale from other specifications and they have their technical language (e.g., what an XML document is versus what an ODF document [representation] might be said to be) that is confusing in those parts of ODF where the notions appear to be commingled without discrimination.  Also, the way we talk about notions and semantics from other specifications is troublesome and we need to get clear about that.  In this respect, we do need to be clear what terms of art we are using (and also introducing ourselves).  In that regard, it seems to me:
    >  - we should make major conceptual definitions in a nomenclature section (and it should be allowed to reference sections where there is more depth and context if needed).  There is nothing wrong with defining many things local to their (confined) use.  ISO documents do not put every specialized term in the nomenclature, but we need to have something that calls forth consistency in major cases.
    >  - we should make it clear when we are using the notion in the nomenclature and have some way to not confuse that with informal use of phrases having some of the same words (or teach ourselves to use different words).
    >  - some problems are simply that a common noun is used without sufficient modifiers to maintain context and signal its technical use.  It is a bit wordy to always provide adjectives, but that is the price (along with looking for alternative wordings and terms that minimize the confusion).  The use of "separator" is an example.  
    >  - for OASIS, our documents (in both ODF and PDF) are really hypertexts and we can take advantage of that to make the relationship between the nomenclature/glossary section and a usage very clear.  (I am not proposing that we do that for 1.2, but it seems like a good thing to keep our eye on.)  (Apparently ISO or ITTF is not consistent in preserving that sort of thing in their PDFs, which makes using their versions of IS 29500 parts a real bitch, although IS 26300 does include the TOC linking from the OASIS 1.0ed2 cs1 version.)
    >
    >   
    Well, but I think maintaining consistency is far more difficult than is 
    commonly realized. Particularly when we are preparing a document that 
    will be used by many people for whom English is not their first language.
    
    And even for people whose first language is reportedly English, remember 
    that the 811 folks sat in the same room, thought they were using terms 
    the same way and went off and implemented different communication 
    protocols.
    > Maybe we just have to see how it goes?
    >   
    Sure. Grinding through the text (with a small "t") as in the ODF draft 
    is a good starting point.
    
    Hope you are looking forward to a great weekend!
    
    Patrick
    
    >  - Dennis
    >
    >