Yes, it's an old dichotomy. PDP-11 had the Fascist approach; with UNIX you are
free to shoot yourself in the foot (or head).

A categorization by complexity is in fact a middle path with a lot of
flexibility, but it sounds like the repercussions of changing something so
deeply rooted in the architecture would be like telling all keyboard users
they have to switch from QWERTY to Dvorak.

That leaves it to users to do it locally with guidelines, best practices,
authoring tool customizations, and DITA specializations. It seems to me that
such issues should be exposed and made explicit to users, so that they can
make decisions in an informed and intelligent way.

Andrzej, my apology for misspelling your name. Here's a cheer for OAXAL!

/BN
Hi Bruce,

Again I'm not defending HTML in general - but it's still not clear to me why
in the two examples I gave the one with the extra tags is actually
semantically better. And I'm not talking about text both before and after a
paragraph - or before and after any block. Just before, and just some of the
time. Does it make sense to require everyone who authors a list to add a
paragraph to every list, just because in some cases it's useful?

Please don't think this aspect of DITA happened because of ignorance or lack
of thought. A lot of thought and argument went into it. I'm happy to have
those discussions again - it's a reflection of DITA's success that people are
still interested in these issues. But as I said in my note to Andrzej, there
is a tradeoff between being permissive and allowing bad markup and being
restrictive and enforcing unnecessary markup. DITA's current positioning is
somewhere in the midpoint of two possible extremes - it allows some bad markup
(like text after a block) and requires some unnecessary markup (like body for
a simple one-paragraph topic). But please be aware of the tradeoffs that each
position entails.

In terms of issues with usability etc. - I think there is always room for
improvement. But my own assessment is that the confusion caused by a
backwards-incompatible revamp of the standard that breaks every existing tool
and most existing content would cause a lot more usability problems than our
current loose content model does. And you haven't addressed what would happen
to all the specializations which would become broken, including ones that have
been around since DITA 1.0, such as task.

In terms of lists and tables: believe it or not, we had this discussion too
back in the early years. The current design of simpletable reflects an effort
to converge the content models at least to some degree (simplifying tables to
the point where they could be thought of as multipart lists, unlike CALS
tables with spans etc.). And the door is still open to introduce some base
classes in the future from which both lists and simpletables could derive
(probably in the 2.0 timeframe). But we didn't include it in DITA 1.0 because
we couldn't see any usability benefit in doing so. It's a more semantically
pure model but doesn't hit the author.

Michael Priestley
Lead IBM DITA Architect
mpriestl@ca.ibm.com
http://dita.xml.org/blog/25
Thanks for the cc to me, Andrej. (Someone please offline point me to the right
help file to learn how to engage this thread properly. Is it a subscription
thing?)

> If it's a relic of HTML, I'm not sure why it's a bad relic. The
> adoption of HTML hasn't exactly been crippled by this approach.

I suspect that on careful consideration, Michael, you might want to rephrase
that. I know I don't have to tell you that HTML browsers have a much simpler
rendering task because they're just about format, so the HTML spec can get
away with ignoring semantic criteria. We can't. That's why a relic of HTML
that is not semantically motivated is a bad thing for XML, and a bad thing for
DITA.

Case in point: To say that the text before a list (which may contain
paragraphs, tables, lists, figures, etc.) is in the _same_paragraph_ <p> as
the text after that list is perverse and contrary to any ordinary notion of
"paragraph". Conversely, if DITA has its own definition of "paragraph"
allowing that, then why not allow <p> as a child of <p>? The same logic that
proscribes that should proscribe anything like it, including lists.
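
To make the "same paragraph" point concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The fragment is hypothetical, modeled on the list example quoted further down this thread, and the helper name split_paragraph_text is an assumption made for illustration. It shows only how a generic XML parser sees such markup, not how the DITA Open Toolkit or any translation tool actually processes it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a <p> whose text continues after an embedded <ul>.
FRAGMENT = """
<li>Do something.
  <p>One of three things happens:
    <ul><li>A</li><li>B</li><li>C</li></ul>
    and then processing continues.
  </p>
</li>
""".strip()


def split_paragraph_text(p):
    """Return (text before the first child, text after the last child) of p."""
    before = (p.text or "").strip()
    children = list(p)
    # ElementTree stores text that follows a child element in that child's tail.
    after = (children[-1].tail or "").strip() if children else ""
    return before, after


li = ET.fromstring(FRAGMENT)
p = li.find("p")
before, after = split_paragraph_text(p)
child_tags = [child.tag for child in p]

print(repr(before))   # 'One of three things happens:'
print(child_tags)     # ['ul']
print(repr(after))    # 'and then processing continues.'
```

Both text runs belong to the single <p> element, with the whole list sitting between them. Any tool that treats a paragraph as one unit of text has to decide what to do with that interruption.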
There is undoubtedly a cost to correcting bad decisions after their effects
have become established. Bear in mind that if not corrected such
considerations may become a barrier to adoption in the future after the
utility of an on-ramp to XML wears off and users want closer semantic control
of their content. We spoke of usability issues in the TC today. Here is one
staring us in the face. Users found it confusing. Very possibly OT developers
found it confusing, whence the disparate rendering. Making a clean
categorization of elements in terms of their complexity could reduce confusion
and simplify OT work. By complexity I mean something like: phrase can only
contain #PCDATA, para can only contain phrase and #PCDATA, "block" = {list,
table, ...} can only contain para and "block", etc.
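
The categorization described above can be sketched as a small checker. This is an illustration in Python under stated assumptions, not a proposal for the actual DITA DTDs or schemas: the LEVEL mapping of element names to levels is invented for the example (in particular, treating <li> as a "block"), and the function name violations is hypothetical. Element names missing from LEVEL would raise a KeyError.

```python
import xml.etree.ElementTree as ET

# Assumed assignment of element names to complexity levels (illustrative only).
LEVEL = {
    "ph": "phrase", "b": "phrase", "i": "phrase",
    "p": "para",
    "ul": "block", "ol": "block", "li": "block", "table": "block",
}

# The rules as stated: phrase holds only #PCDATA; para holds phrase and
# #PCDATA; block holds para and block, and no #PCDATA of its own.
ALLOWED_CHILDREN = {"phrase": set(), "para": {"phrase"}, "block": {"para", "block"}}
ALLOWS_TEXT = {"phrase": True, "para": True, "block": False}


def violations(elem, path=""):
    """Return a list of messages for every rule broken at or below elem."""
    here = f"{path}/{elem.tag}"
    level = LEVEL[elem.tag]
    found = []
    # Character data directly inside elem: its own text plus its children's tails.
    chunks = [elem.text] + [child.tail for child in elem]
    if not ALLOWS_TEXT[level] and any(c and c.strip() for c in chunks):
        found.append(f"{here}: #PCDATA not allowed in {level}")
    for child in elem:
        child_level = LEVEL[child.tag]
        if child_level not in ALLOWED_CHILDREN[level]:
            found.append(f"{here}: {child_level} <{child.tag}> not allowed in {level}")
        found.extend(violations(child, here))
    return found


loose = ET.fromstring(
    "<li>Do something.<p>One of three things happens:"
    "<ul><li>A</li></ul></p></li>"
)
strict = ET.fromstring(
    "<li><p>Do something.</p><p>One of three things happens:</p>"
    "<ul><li><p>A</p></li></ul></li>"
)

for message in violations(loose):
    print(message)
print(violations(strict))  # []
```

Under these assumed rules the loose form is flagged three times (text directly in the outer <li>, a <ul> inside <p>, and text directly in the inner <li>), while the form with explicit <p> wrappers passes.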
This is representative of a larger issue. Another example is the decision to
make lists and tables semantically distinct. That is properly a rendering
distinction. Any table can be rendered as a list whose list items (the row
elements) are parallel in structure. Any list whose items are parallel in
structure (such as a list of steps) can be rendered as a table. Development of
adaptable facilities for semantic tables is one of the unresolved challenges
and potential benefits of XML, and that decision to sunder lists from tables
obscures the means. That's a digression from the current thread, so we ought
not to pursue it here. I just mention it to indicate that this is part of a
larger issue of relics of HTML format markup that may be lurking, which should
have been put in question relative to the SGML standard during the inception
of DITA, but which for whatever reason were not.
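
As a brief illustration of the list/table equivalence mentioned above, the following Python sketch renders one set of parallel-structured items either as table rows or as list entries. The data, the field names (action, result), and the function names as_table and as_list are all hypothetical. It models the idea only and says nothing about how simpletable or any DITA list is actually defined.

```python
# Items that are parallel in structure: every item has the same fields.
steps = [
    {"action": "Open the panel", "result": "The panel opens"},
    {"action": "Press the switch", "result": "The light turns on"},
]


def as_table(items):
    """Render parallel items as a header row followed by one row per item."""
    if not items:
        return []
    columns = list(items[0])
    if any(list(item) != columns for item in items):
        raise ValueError("items are not parallel in structure")
    return [columns] + [[item[c] for c in columns] for item in items]


def as_list(items):
    """Render the same items as one list entry per item."""
    return ["; ".join(f"{key}: {value}" for key, value in item.items())
            for item in items]


for row in as_table(steps):
    print(" | ".join(row))
for entry in as_list(steps):
    print("- " + entry)
```

The same underlying data drives both renderings; only items that fail the parallel-structure check cannot be shown as a table.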
/BN
>
> -----Original Message-----
> From: Andrzej Zydron [mailto:azydron@xml-intl.com]
> Sent: Tuesday, June 10, 2008 4:05 PM
> To: Michael Priestley
> Cc: dita@lists.oasis-open.org; Bruce Nevin (bnevin)
> Subject: Re: [dita] Proposal for Consideration: Default
> Behavior for List Items
>
> Hi Michael,
>
> Your example failed to highlight the real problem, which is:
>
> <li>Do something.
>     <p>One of three things happens:
>     <ul><li>A</li>
>         <li>B</li>
>         <li>B</li>
>     </ul>
>     that really screw up segmentation, translation and any sane
>     form of linguistic processing.
>     </p>
> </li>
>
> The problem is that HTML was a VERY BAD IMPLEMENTATION of
> SGML. It concentrated on form rather than structure (mixing
> up both which is, if not a sin against humanity, then
> definitely one against common sense ;) ), which is why we
> needed XML. Basing an XML vocabulary on HTML (which would not
> even parse in SGML terms after about version 2.0) was, at
> best IMHO a dubious choice.
>
> Rather like <b>, <u>, <i> and translatable attributes this
> should all be consigned to the DITA 'deprecated' bin of
> history (BTW the same should be true of CONREF for individual
> nouns or noun phrases), and good riddance to it all. Anybody
> who has had to cope with translating such documents will
> testify to the difficulties involved therein.
>
> Best Regards,
>
> AZ
>
> Michael Priestley wrote:
> >
> > A few points:
> >
> > - This would be a backwards-incompatible change. That is, it would
> > render invalid a large proportion of the existing DITA content out
> > there. I think we could consider this for 2.0 if the cost of
> > converting all back-level content was justified by the benefits (I'm
> > not currently convinced myself, but that would be the timeline to make
> > the arguments)
> > - This would also render the current task specialization invalid,
> > since it specializes a <ph> element as the first child of <step>. As
> > an exercise, see what any of the list specializations would look like,
> > if only block-level elements were allowed (I suspect it would break
> > most of them).
> >
> > Finally, and leaving aside the pragmatic reasons not to make a
> > backwards-incompatible change to the schemas and DTDs at this point,
> > I'm still not sure why this:
> >
> > <li><p>Do something</p>
> >     <p>One of three things happens:</p>
> >     <ul><li><p>A</p></li>
> >         <li><p>B</p></li>
> >         <li><p>B</p></li>
> >     </ul>
> > </li>
> >
> > Is better than this:
> >
> > <li>Do something.
> >     <p>One of three things happens:
> >     <ul><li>A</li>
> >         <li>B</li>
> >         <li>B</li>
> >     </ul>
> >     </p>
> > </li>
> >
> > If it's a relic of HTML, I'm not sure why it's a bad relic. The
> > adoption of HTML hasn't exactly been crippled by this approach.
> >
> > Michael Priestley
> > Lead IBM DITA Architect
> > mpriestl@ca.ibm.com
> > http://dita.xml.org/blog/25
> >
> > *"Bruce Nevin (bnevin)" <bnevin@cisco.com>*
> > 06/10/2008 12:22 PM
> >
> > To: <dita@lists.oasis-open.org>
> > cc: "Bruce Nevin (bnevin)" <bnevin@cisco.com>
> > Subject: RE: [dita] Proposal for Consideration: Default Behavior for
> > List Items
> >
> > [Not sure if this is the right way to contribute to this thread, but I
> > don't see any contributor hooks on the page or in the Help. Responding
> > to _http://lists.oasis-open.org/archives/dita/200804/msg00060.html_.]
> >
> > I agree that rendering is an OT issue.
> >
> > The real issue IMO is that <li> permits #PCDATA and phrase-level
> > elements. These should only be permitted in paragraph-level elements,
> > and any element that permits paragraph-level or "larger" elements as
> > children should not permit #PCDATA and phrase-level elements. This
> > behavior seems to be a relic of the HTML standard.
> >
> > It is easy for OT and vendors to insert <p> by default, and if <li>
> > begins with some other child element it is only a minor nuisance to
> > delete <p> or insert that child ahead of <p>.
> >
> > This would simplify the work of rendering and remove the ambivalence
> > that is the topic of this thread.
> >
> > Perhaps this is already being considered for 1.3 or 2.0.
> >
> > /Bruce Nevin
>
> --
> email - azydron@xml-intl.com
> smail - c/o Mr. A.Zydron
>         PO Box 2167
>         Gerrards Cross
>         Bucks SL9 8XF
>         United Kingdom
> Mobile +(44) 7966 477 181
> FAX +(44) 1753 480 465
> www - http://www.xml-intl.com