OASIS Darwin Information Typing Architecture (DITA) TC


RE: [dita] Proposal for Consideration: Default Behavior for List Items

  • 1.  RE: [dita] Proposal for Consideration: Default Behavior for List Items

    Posted 06-10-2008 16:25
    
    
    
    
    
    [Not sure if this is the right way to contribute to this thread, but I don't see any contributor hooks on the page or in the Help. Responding to http://lists.oasis-open.org/archives/dita/200804/msg00060.html.]
     
    I agree that rendering is an OT issue.
     
    The real issue IMO is that <li> permits #PCDATA and phrase-level elements. These should only be permitted in paragraph-level elements, and any element that permits paragraph-level or "larger" elements as children should not permit #PCDATA and phrase-level elements. This behavior seems to be a relic of the HTML standard.
     
    It is easy for OT and vendors to insert <p> by default, and if <li> begins with some other child element it is only a minor nuisance to delete <p> or insert that child ahead of <p>.
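
    As a rough illustration of the "insert <p> by default" step, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The BLOCK set and the function names are assumptions made for this example, not part of the OT or of any vendor tool; a real implementation would take the block-level element list from the DITA DTDs. It wraps only the text and phrase-level children that open an <li>, and leaves any loose text that follows a block untouched.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of block-level element names (an assumption for this
# sketch; a real tool would derive it from the DITA DTDs).
BLOCK = {"p", "ul", "ol", "sl", "dl", "table", "simpletable",
         "fig", "note", "pre", "lines", "lq"}


def wrap_leading_inline(li):
    """Move text and phrase-level children that open <li> into a new <p>."""
    # Collect the run of non-block children at the start of the item.
    inline = []
    for child in list(li):
        if child.tag in BLOCK:
            break
        inline.append(child)

    has_text = bool(li.text and li.text.strip())
    if not has_text and not inline:
        return  # the item already starts with a block element

    p = ET.Element("p")
    p.text = (li.text or "").lstrip() or None
    li.text = None
    for child in inline:
        li.remove(child)
        p.append(child)  # the child's trailing text (.tail) moves with it

    # Trim whitespace that used to separate the inline run from the block.
    if inline:
        inline[-1].tail = (inline[-1].tail or "").rstrip() or None
    elif p.text:
        p.text = p.text.rstrip()
    li.insert(0, p)


def wrap_all(xml_source):
    """Apply wrap_leading_inline to every <li> in the fragment."""
    root = ET.fromstring(xml_source)
    for li in list(root.iter("li")):
        wrap_leading_inline(li)
    return ET.tostring(root, encoding="unicode")
```

    Items that already begin with a block element pass through unchanged, so running the sketch twice gives the same result as running it once.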
     
    This would simplify the work of rendering and remove the ambivalence that is the topic of this thread.
     
    Perhaps this is already being considered for 1.3 or 2.0.
     
        /Bruce Nevin


  • 2.  RE: [dita] Proposal for Consideration: Default Behavior for List Items

    Posted 06-10-2008 16:47

    Hi, Bruce:

    The intent of the DITA 1.2 constraints enhancement is to make it possible for adopters to implement simplifications like block-only content within sections and list items without forcing specialization or breaking backward compatibility.


    Hoping that's interesting,


    Erik Hennum
    ehennum@us.ibm.com


    "Bruce Nevin (bnevin)" <bnevin@cisco.com> wrote on 06/10/2008 09:22:21 AM:
    >
    > RE: [dita] Proposal for Consideration: Default Behavior for List Items

    >  
    > The real issue IMO is that <li> permits #PCDATA and phrase-level
    > elements. These should only be permitted in paragraph-level
    > elements, and any element that permits paragraph-level or "larger"
    > elements as children should not permit #PCDATA and phrase-level
    > elements. This behavior seems to be a relic of the HTML standard.

    >  
    > It is easy for OT and vendors to insert <p> by default, and if <li>
    > begins with some other child element it is only a minor nuisance to
    > delete <p> or insert that child ahead of <p>.

    >  
    > This would simplify the work of rendering and remove the ambivalence
    > that is the topic of this thread.



  • 3.  RE: [dita] Proposal for Consideration: Default Behavior for List Items

    Posted 06-10-2008 18:28

    A few points:

    - This would be a backwards-incompatible change. That is, it would render invalid a large proportion of the existing DITA content out there. I think we could consider this for 2.0 if the cost of converting all back-level content was justified by the benefits (I'm not currently convinced myself, but that would be the timeline to make the arguments).
    - This would also render the current task specialization invalid, since it specializes a <ph> element as the first child of <step>. As an exercise, see what any of the list specializations would look like, if only block-level elements were allowed (I suspect it would break most of them).

    Finally, and leaving aside the pragmatic reasons not to make a backwards-incompatible change to the schemas and DTDs at this point, I'm still not sure why this:

    <li><p>Do something</p>
           <p>One of three things happens:</p>
           <ul><li><p>A</p></li>
                   <li><p>B</p></li>        
                   <li><p>B</p></li>        
           </ul>
    </li>

    Is better than this:

    <li>Do something.
          <p>One of three things happens:
              <ul><li>A</li>
                      <li>B</li>        
                      <li>B</li>        
              </ul>
          </p>
    </li>

    If it's a relic of HTML, I'm not sure why it's a bad relic. The adoption of HTML hasn't exactly been crippled by this approach.        

    Michael Priestley
    Lead IBM DITA Architect
    mpriestl@ca.ibm.com
    http://dita.xml.org/blog/25



  • 4.  Re: [dita] Proposal for Consideration: Default Behavior for List Items

    Posted 06-10-2008 20:06


  • 5.  Re: [dita] Proposal for Consideration: Default Behavior for List Items

    Posted 06-10-2008 20:26

    Hi Andrzej,

    I don't mean to suggest that HTML is without faults. Just that the example I posed does not strike me as one of them.

    The counterexample you offer is not one I'm prepared to defend. I agree it's bad form. I regard it as a failure of XML that it cannot assert sequences on mixed content models. That is, I want to disallow the case you describe, while allowing the case I describe.

    Given that failure in XML, it seems we have two opposing strategies:
    - mine is to support the defensible markup and try to address the indefensible markup in other ways (authoring guidelines, post-processing cleanup, etc.)
    - yours is to disallow both the defensible and indefensible markup, since XML validation cannot distinguish between them.
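
    To make the first strategy concrete, a post-processing check along these lines could flag the indefensible case (content after a block inside <p>) while leaving the defensible case (text before a block) alone. This is only a sketch in Python using the standard xml.etree.ElementTree module; the BLOCK set and the function names are assumptions for illustration, not an existing DITA tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of block-level elements that may appear inside <p>
# (an assumption for this sketch, not taken from the DITA DTDs).
BLOCK = {"ul", "ol", "sl", "dl", "table", "simpletable",
         "fig", "note", "pre", "lines", "lq"}


def has_content_after_block(p):
    """True if text or a phrase-level element follows a block inside <p>."""
    seen_block = False
    for child in p:
        if child.tag in BLOCK:
            seen_block = True
        elif seen_block:
            return True  # phrase-level element after a block
        # ElementTree keeps the text that follows an element in .tail.
        if seen_block and child.tail and child.tail.strip():
            return True  # loose text after a block
    return False


def count_offending_paragraphs(xml_source):
    """Count the <p> elements that carry content after a block child."""
    root = ET.fromstring(xml_source)
    return sum(1 for p in root.iter("p") if has_content_after_block(p))
```

    Text before the first block is never reported, which is exactly the distinction a DTD content model cannot express.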

    This doesn't mean we are in horrible disagreement about markup rules, but about where best to draw the line in a continuum of practice.

    With regards to the last two things you mention (highlighting domain and conreffing nouns):

    - the highlighting domain is explicitly *not* part of DITA base precisely so it can be excluded when more valid markup is available. That said, when the more valid markup is *not* available, I do believe it is better semantically to use <b> than to misuse <uicontrol> or some other semantic element. In other words, the highlighting markup tells us nothing, but that's better than telling a lie.
    - re conreffing nouns: I thought we had already gone over this. There are times when it is perfectly legitimate to conref nouns. Here are some of them:

            - reuse of UI strings to ensure consistency between documentation and interface (these can be resolved prior to sending to translation)
            - reuse of product names that have explicitly been vetted for such a purpose (this applies to most IBM product names I believe)
            - reuse of indexterm or prolog metadata content
            - simple lists of nouns (not part of a sentence)

    I am frustrated that we seem to always be walking the same ground.

    Michael Priestley
    Lead IBM DITA Architect
    mpriestl@ca.ibm.com
    http://dita.xml.org/blog/25



    Andrzej Zydron <azydron@xml-intl.com>

    06/10/2008 04:05 PM

    To
    Michael Priestley/Toronto/IBM@IBMCA
    cc
    dita@lists.oasis-open.org, "Bruce Nevin (bnevin)" <bnevin@cisco.com>
    Subject
    Re: [dita] Proposal for Consideration: Default Behavior for List Items





    Hi Michael,

    Your example failed to highlight the real problem, which is:

    <li>Do something.
         <p>One of three things happens:
             <ul><li>A</li>
                     <li>B</li>        
                     <li>B</li>        
             </ul>
           that really screw up segmentation, translation and any sane
           form of linguistic processing.
         </p>
    </li>

    The problem is that HTML was a VERY BAD IMPLEMENTATION of SGML. It
    concentrated on form rather than structure (mixing up both which is, if
    not a sin against humanity, then definitely one against common sense ;)
    ), which is why we needed XML. Basing an XML vocabulary on HTML (which
    would not even parse in SGML terms after about version 2.0) was, at best
    IMHO a dubious choice.

    Rather like <b>, <u>, <i> and translatable attributes this should all be
    consigned to the DITA 'deprecated' bin of history (BTW the same should
    be true of CONREF for individual nouns or noun phrases), and good
    riddance to it all. Anybody who has had to cope with translating such
    documents will testify to the difficulties involved therein.

    Best Regards,

    AZ




    --
    email - azydron@xml-intl.com
    smail - c/o Mr. A.Zydron
                    PO Box 2167
           Gerrards Cross
           Bucks SL9 8XF
                    United Kingdom
    Mobile +(44) 7966 477 181
    FAX    +(44) 1753 480 465
    www - http://www.xml-intl.com






  • 6.  RE: [dita] Proposal for Consideration: Default Behavior for List Items

    Posted 06-10-2008 21:07
    Thanks for the cc to me, Andrzej. (Someone please offline point me to the
    right help file to learn how to engage this thread properly. Is it a
    subscription thing?) 
    
    > If it's a relic of HTML, I'm not sure why it's a bad relic. The 
    > adoption of HTML hasn't exactly been crippled by this approach.
    
    
    I suspect that on careful consideration, Michael, you might want to
    rephrase that. I know I don't have to tell you that HTML browsers have a
    much simpler rendering task because they're just about format, so the
    HTML spec can get away with ignoring semantic criteria. We can't. That's
    why a relic of HTML that is not semantically motivated is a bad thing
    for XML, and a bad thing for DITA.
    
    Case in point: To say that the text before a list (which may contain
    paragraphs, tables, lists, figures, etc.) is in the _same_paragraph_ <p>
    as the text after that list is perverse and contrary to any ordinary
    notion of "paragraph". Conversely, if DITA has its own definition of
    "paragraph" allowing that, then why not allow <p> as a child of <p>? The
    same logic that proscribes that should proscribe anything like it,
    including lists.

    There is undoubtedly a cost to correcting bad decisions after their
    effects have become established. Bear in mind that if not corrected such
    considerations may become a barrier to adoption in the future after the
    utility of an on-ramp to XML wears off and users want closer semantic
    control of their content. We spoke of usability issues in the TC today.
    Here is one staring us in the face. Users found it confusing. Very
    possibly OT developers found it confusing, whence the disparate
    rendering. Making a clean categorization of elements in terms of their
    complexity could reduce confusion and simplify OT work. By complexity I
    mean something like phrase can only contain #PCDATA, para can only
    contain phrase and #PCDATA, "block" = {list, table, ...} can only
    contain para and "block", etc.

    This is representative of a larger issue. Another example is the
    decision to make lists and tables semantically distinct. That is
    properly a rendering distinction. Any table can be rendered as a list
    whose list items (the row elements) are parallel in structure. Any list
    whose items are parallel in structure (such as a list of steps) can be
    rendered as a table. Development of adaptable facilities for semantic
    tables is one of the unresolved challenges and potential benefits of
    XML, and that decision to sunder lists from tables obscures the means.

    That's a digression from the current thread, so we ought not to pursue
    it here. I just mention it to indicate that this is part of a larger
    issue of relics of HTML format markup that may be lurking, which should
    have been put in question relative to the SGML standard during the
    inception of DITA, but which for whatever reason were not.

                    /BN
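
    The three-level categorization described in this message (phrase contains only #PCDATA; para contains phrase and #PCDATA; "block" contains only para and "block") can be sketched as a small checker. This is a Python illustration using the standard xml.etree.ElementTree module; the assignment of element names to levels, including treating <li> as a block-level container, is an assumption made for the example and covers only a handful of elements. Elements the table does not know are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical assignment of a few element names to the three levels
# (an assumption for this sketch; unknown elements are skipped).
LEVEL = {"ph": "phrase", "b": "phrase",
         "p": "para",
         "ul": "block", "ol": "block", "li": "block", "table": "block"}

# Which levels each level may contain, and whether it may hold text.
ALLOWED_CHILDREN = {"phrase": set(), "para": {"phrase"}, "block": {"para", "block"}}
ALLOWS_TEXT = {"phrase": True, "para": True, "block": False}


def violations(xml_source):
    """List the places where a fragment breaks the three-level scheme."""
    root = ET.fromstring(xml_source)
    problems = []
    for elem in root.iter():
        level = LEVEL.get(elem.tag)
        if level is None:
            continue  # element not covered by this sketch
        if not ALLOWS_TEXT[level]:
            # Direct text is elem.text plus the .tail of each child.
            texts = [elem.text] + [child.tail for child in elem]
            if any(t and t.strip() for t in texts):
                problems.append(f"<{elem.tag}> contains text")
        for child in elem:
            child_level = LEVEL.get(child.tag)
            if child_level is not None and child_level not in ALLOWED_CHILDREN[level]:
                problems.append(f"<{elem.tag}> contains <{child.tag}>")
    return problems
```

    Under these assumptions the block-only form of the list example earlier in the thread passes cleanly, while the looser form is reported for text directly inside <li> and for <ul> inside <p>, and <p> nested in <p> is rejected by the same rule.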



  • 7.  RE: [dita] Proposal for Consideration: Default Behavior for List Items

    Posted 06-11-2008 00:20

    Hi Bruce,

    Again I'm not defending HTML in general - but it's still not clear to me why in the two examples I gave the one with the extra tags is actually semantically better. And I'm not talking about text both before and after a paragraph - or before and after any block. Just before, and just some of the time. Does it make sense to require everyone who authors a list to add a paragraph to every list, just because in some cases it's useful?

    Please don't think this aspect of DITA happened because of ignorance or lack of thought. A lot of thought and argument went into it. I'm happy to have those discussions again - it's a reflection of DITA's success that people are still interested in these issues. But as I said in my note to Andrzej, there is a tradeoff between being permissive and allowing bad markup and being restrictive and enforcing unnecessary markup. DITA's current positioning is somewhere in the midpoint of two possible extremes - it allows some bad markup (like text after a block) and requires some unnecessary markup (like body for a simple one-paragraph topic). But please be aware of the tradeoffs that each position entails.

    In terms of issues with usability etc. - I think there is always room for improvement. But my own assessment is that the confusion caused by a backwards-incompatible revamp of the standard that breaks every existing tool and most existing content would cause a lot more usability problems than our current loose content model does. And you haven't addressed what would happen to all the specializations which would become broken, including ones that have been around since DITA 1.0, such as task.

    In terms of lists and tables: believe it or not, we had this discussion too back in the early years. The current design of simpletable reflects an effort to converge the content models at least to some degree (simplifying tables to the point where they could be thought of as multipart lists, unlike CALS tables with spans etc.). And the door is still open to introduce some base classes in the future from which both lists and simpletables could derive (probably in the 2.0 timeframe). But we didn't include it in DITA 1.0 because we couldn't see any usability benefit in doing so. It's a more semantically pure model but doesn't hit the author.

    Michael Priestley
    Lead IBM DITA Architect
    mpriestl@ca.ibm.com
    http://dita.xml.org/blog/25



    "Bruce Nevin (bnevin)" <bnevin@cisco.com>

    06/10/2008 05:06 PM

    To
    "Andrzej Zydron" <azydron@xml-intl.com>, Michael Priestley/Toronto/IBM@IBMCA
    cc
    <dita@lists.oasis-open.org>
    Subject
    RE: [dita] Proposal for Consideration: Default Behavior for List Items





    Thanks for the cc to me, Andrzej. (Someone please offline point me to the
    right help file to learn how to engage this thread properly. Is it a
    subscription thing?)

    > If it's a relic of HTML, I'm not sure why it's a bad relic. The
    > adoption of HTML hasn't exactly been crippled by this approach.


    I suspect that on careful consideration, Michael, you might want to
    rephrase that. I know I don't have to tell you that HTML browsers have a
    much simpler rendering task because they're just about format, so the
    HTML spec can get away with ignoring semantic criteria. We can't. That's
    why a relic of HTML that is not semantically motivated is a bad thing
    for XML, and a bad thing for DITA.

    Case in point: To say that the text before a list (which may contain
    paragraphs, tables, lists, figures, etc.) is in the _same_paragraph_ <p>
    as the text after that list is perverse and contrary to any ordinary
    notion of "paragraph". Conversely, if DITA has its own definition of
    "paragraph" allowing that, then why not allow <p> as a child of <p>? The
    same logic that proscribes that should proscribe anything like it,
    including lists.

    There is undoubtedly a cost to correcting bad decisions after their
    effects have become established. Bear in mind that if not corrected such
    considerations may become a barrier to adoption in the future after the
    utility of an on-ramp to XML wears off and users want closer semantic
    control of their content. We spoke of usability issues in the TC today.
    Here is one staring us in the face. Users found it confusing. Very
    possibly OT developers found it confusing, whence the disparate
    rendering. Making a clean categorization of elements in terms of their
    complexity could reduce confusion and simplify OT work. By complexity I
    mean something like phrase can only contain #PCDATA, para can only
    contain phrase and #PCDATA, "block" = {list, table, ...} can only
    contain para and "block", etc.

    This is representative of a larger issue. Another example is the
    decision to make lists and tables semantically distinct. That is
    properly a rendering distinction. Any table can be rendered as a list
    whose list items (the row elements) are parallel in structure. Any list
    whose items are parallel in structure (such as a list of steps) can be
    rendered as a table. Development of adaptable facilities for semantic
    tables is one of the unresolved challenges and potential benefits of
    XML, and that decision to sunder lists from tables obscures the means.

    That's a digression from the current thread, so we ought not to pursue
    it here. I just mention it to indicate that this is part of a larger
    issue of relics of HTML format markup that may be lurking, which should
    have been put in question relative to the SGML standard during the
    inception of DITA, but which for whatever reason were not.

                    /BN

    >



  • 8.  RE: [dita] Proposal for Consideration: Default Behavior for List Items

    Posted 06-11-2008 00:50