docbook-apps

Question on CDATA in XML

  • 1.  Question on CDATA in XML

    Posted 06-23-2010 15:36
    Hi List.
    I have some hairy troubles with html encoded parts I want to convey in
    DocBook XML then convert back to HTML (and later on to PDF, but that's
    another sad story).
    My problem is that it works in some cases but not in others (I am using
    DB5.0, Saxon 9.2. Tested with Kernow)

    For example, if I have
    ------------------------------------------------------------------------
    <test>
    <row>
    <entry role="Tab3">Assay Name</entry>
    <entry role="Tab3"><rtf><![CDATA[
    Source
    ]]></rtf>
    </entry>
    <entry role="Tab3"><rtf><![CDATA[
    Ligand
    ]]></rtf>
    </entry>
    </row>

    <biblioentry id="179">
    <indexterm>
    <primary>Bibl.179</primary>
    </indexterm>
    <subtitle><rtf><![CDATA[
    PABREZA, L.A., DHAWAN, S. and KELLAR, K.J. (1991)
    ]]></rtf>
    </subtitle>

    <releaseinfo><rtf><![CDATA[
    Mol. Pharmacol.,
    39
    : 9-12.
    ]]></rtf>
    </releaseinfo>
    </biblioentry>
    </test>
    ------------------------------------------------------------------------
    as the XML input, together with the following DocBook XSL customization layer
    ------------------------------------------------------------------------
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:d="http://docbook.org/ns/docbook"
        exclude-result-prefixes="d">
      <xsl:import href="docBook/xsl/html/docbook.xsl"/>
      <xsl:function name="exslt:node-set"
          xmlns:exslt="http://exslt.org/common" as="node()">
        <xsl:param name="node" as="node()"/>
        <xsl:sequence select="$node"/>
      </xsl:function>
      <xsl:output method="html"/>
      <xsl:param name="html.stylesheet" select="'corpstyle.css'"/>
      <xsl:param name="admon.graphics" select="1"/>

      <xsl:template match="d:rtf">
        <xsl:value-of select="text()"
            disable-output-escaping="yes"/>
      </xsl:template>
      <xsl:template match="d:*" mode="html.title.attribute"/>
    </xsl:stylesheet>
    ------------------------------------------------------------------------
    The first part gives

    Assay Name
    <span style="color: #000000; font-size: 10pt; font-family: Times New Roman">
    Source
    </span>
    <span style="color: #000000; font-size: 10pt; font-family: Times New Roman">
    Ligand
    </span>

    And the second (which is correct)

    [179] . PABREZA, L.A., DHAWAN, S. and KELLAR, K.J. (1991) . [3H]cytisine binding to nicotinic cholinergic receptors in brain. . Mol. Pharmacol., 39 : 9-12. .

    So, the big question is why ?

    Thanks a lot in advance
    Best regards,
    Fabien




  • 2.  Re: [docbook-apps] Question on CDATA in XML

    Posted 06-23-2010 18:39
    Fabien,

    I'm a bit confused here.

    Is the output you show below what you see if you open the output file in
    a text editor, or what you see if you open the output in a browser?

    I'm guessing it's the first, and that what you are trying to do is pass
    html markup through unchanged to the output so that the browser will
    interpret it as html markup and render it. If not, you can ignore what
    follows:-).

    I'm not 100% sure why your XSL customization works in some places and
    not in others; maybe one of the gurus can tell. My best guess is that it
    is related to the unique way that biblioentry and bibliomixed are
    handled by the standard stylesheets. I suspect that these two elements
    are possibly the only places where your XSL code will work the way you
    want it to.

    I think there are better ways to do what you seem to be trying to do.
    I'll suggest three:

    1) For bibliographic entries, I would use the full biblioentry or
    bibliomixed markup. It should allow you to mark up pretty much any
    bibliographic entry you might need, and you can easily adjust the style
    of the output in your XSL customization layer.

    2) If you create a phrase in docbook with a role, e.g.:

    <phrase role="somerole">some text</phrase>

    The stylesheets will generate a span with a class, e.g.:

    <span class="somerole">some text</span>

    You can then associate that class with the styling you want in your
    css file. That has the added benefit of separating the styling from your
    source, in case you ever want to change it.

    3) If you are looking for a general means for passing through html to
    the output, the closest I can find is a description of how to insert
    html from an external file using a processing instruction. Here is a
    link to that description, which is from Bob Stayton's excellent book,
    DocBook XSL: The Complete Guide, which BTW, is essential reading if
    you'll be doing a lot with the DocBook stylesheets:

    http://www.sagehill.net/docbookxsl/InsertExtHtml.html

    I don't know of a way to include html in your DocBook file and have it
    passed through as output markup; the file inclusion described in the
    link above is the closest to that I could find. Maybe someone else on
    the list has a suggestion.
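
    As a rough illustration of suggestion 3, the processing instruction
    described on the linked page is dbhtml-include. The fragment below is
    only a sketch: the file name assay-table.html is hypothetical, and
    placing the instruction between block elements is an assumption, so
    check the linked page for the constraints on the external file.

    ```xml
    <section xmlns="http://docbook.org/ns/docbook" version="5.0">
      <title>Assay results</title>
      <para>The table below comes from the database export.</para>
      <!-- In HTML output the stylesheets replace this instruction with the
           contents of the named file. "assay-table.html" is a hypothetical
           file name used only for this example. -->
      <?dbhtml-include href="assay-table.html"?>
    </section>
    ```

    Because the HTML lives in a separate file, it never has to be escaped
    or wrapped in CDATA inside the DocBook source.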

    Hope that helps.
    Dick Hamilton
    -----------------------------------------
    XML Press
    XML for Technical Communicators
    http://xmlpress.net







  • 3.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-24-2010 07:03
    Hi Richard.
    First, thanks for answering.
    The output shown is the one from Kernow, so it is the HTML "text" that the browser would (or would not) interpret for display.
    I am trying to use the HTML because the original data is RTF-encoded text that I get from a database. I convert the RTF to HTML to keep at least some of the formatting, so I need to preserve that HTML until I can pass it to an HTML-capable output (like a browser). Thus, though I would really love to do the formatting with CSS, I am bound to use the HTML as it is.

    Regarding Bob's book, I have it on the left of my desk. I will look at the part you are kindly pointing me to and see if it can help me.

    Best regards,
    Fabien




    ---------------------------------------------------------------------
    To unsubscribe, e-mail: docbook-apps-unsubscribe@lists.oasis-open.org
    For additional commands, e-mail: docbook-apps-help@lists.oasis-open.org




  • 4.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-24-2010 08:13
    Hmm.
    If I try
    <bibliomixed>
    <rtf><![CDATA[
    Mol. Pharmacol.,
    39
    : 9-12.
    ]]></rtf>
    </bibliomixed>

    which is not the way it should be used, I get the expected output (the "<" characters are kept instead of being escaped as "&lt;"), so you are right when stating that these elements are special.

    However, it would be inefficient to wrap content in a bibliomixed tag everywhere I have to decode the HTML (in some tables, for example).
    I assume that something special happens in the processing of bibliomixed and biblioentry, but I am a bit lost in the stylesheets and not knowledgeable enough to find it. If I knew what it was, I could create a special template that handles this the same way (as with the biblio* tags) but without the other treatment applied to a bibliographic entry.

    Any help/directions?
    Regards,
    Fabien
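
    One possible shape for such a special template is sketched below. It
    is an assumption, not a fix confirmed in this thread. Instead of
    relying on disable-output-escaping, it parses the HTML string held in
    the rtf element into real nodes and copies those to the output. The
    sketch assumes an XSLT 3.0 processor (parse-xml-fragment() is an
    XPath 3.0 function and is not available in Saxon 9.2), assumes the
    converted HTML is well-formed XML without HTML-only named entities,
    and assumes rtf is in the DocBook namespace, as the d:rtf match in the
    customization layer shown earlier in the thread does.

    ```xml
    <xsl:stylesheet version="3.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:d="http://docbook.org/ns/docbook"
        exclude-result-prefixes="d">

      <xsl:output method="html"/>

      <!-- Parse the HTML text held in d:rtf into element and text nodes and
           copy them to the result, so no disable-output-escaping is needed. -->
      <xsl:template match="d:rtf">
        <xsl:copy-of select="parse-xml-fragment(string(.))"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    In a customization layer, the template would take the place of the
    existing d:rtf template. Element nodes are written as markup by the
    serializer wherever they end up, so this approach does not depend on
    how a particular DocBook template builds its output.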







  • 5.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-24-2010 15:17
    No one?
    I have tried many different ways, but I am stuck...
    Thanks!
    Fabien

    -----Message d'origine-----
    De : Fabien Tillier [mailto:f.tillier@cerep.fr]
    Envoyé : jeudi 24 juin 2010 10:13
    À : docbook-apps@lists.oasis-open.org
    Objet : RE: [docbook-apps] Question on CDATA in XML

    Humm.
    If I try
    <bibliomixed>
    <rtf>
    Mol. Pharmacol.,


    39


    : 9-12.




    ]]></rtf>
    </bibliomixed>

    Which is not the way it should be used, I get the expected output (keeps <, not ">") [thus, you are right when stating these are special]

    However, it may be inefficient to use a bibliomixed tag everywhere I have to decode the HTML output (on some tables for example).
    I would assume that there is something special occurring within the treatment of bibliomixed or biblioentry, but I am a bit lost in the stylesheets and not knowledgeable enough to catch it. Thus, I could create a special template for handling this specifically (like with the bilbio* tags) but without the special treatments applied to a bibliographic entry.

    Any help/directions ?
    Regards,
    Fabien



    -----Message d'origine-----
    De : Fabien Tillier [mailto:f.tillier@cerep.fr]
    Envoyé : jeudi 24 juin 2010 09:03
    À : Richard Hamilton; docbook-apps@lists.oasis-open.org
    Objet : RE: [docbook-apps] Question on CDATA in XML

    Hi Richard.
    First, thanks for answering to me.
    The output is the one from Kernow. Thus, you get the HTML "text" that would be (or not) interpret by the browser to display.
    I am trying to use the HTML because the original data is from an RTF encoded text I get from a database. I made an RTF to HTML conversion to be able to keep at least some of the formatting, so I need to keep it until I can pass it to an HTML able output (like a browser). Thus, though I would really love doing the formatting with CSS, I am bound to use the HTML as it is.

    Regarding Bob's book, I have it on the left of my desk, I will look at the part you are kindly pointing me to and see if it can help me.

    Best regards,
    Fabien


    -----Message d'origine-----
    De : Richard Hamilton [mailto:rlhamilton@frii.com]
    Envoyé : mercredi 23 juin 2010 20:39
    À : docbook-apps@lists.oasis-open.org
    Objet : Re: [docbook-apps] Question on CDATA in XML

    Fabien,

    I'm a bit confused here.

    Is the output you show below what you see if you open the output file in
    a text editor, or what you see if you open the output in a browser?

    I'm guessing it's the first, and that what you are trying to do is pass
    html markup through unchanged to the output so that the browser will
    interpret it as html markup and render it. If not, you can ignore what
    follows:-).

    I'm not 100% sure why your XSL customization works in some places and
    not in others; maybe one of the gurus can tell. My best guess is that it
    is related to the unique way that biblioentry and bibliomixed are
    handled by the standard stylesheets. I suspect that these two elements
    are possibly the only places where your XSL code will work the way you
    want it to.

    I think there are better ways to do what you seem to be trying to do.
    I'll suggest three:

    1) For bibliographic entries, I would use the full biblioref or
    bibliomixed markup. It should allow you to mark up pretty much any
    bibliographic entry you might need, and you can easily adjust the style
    of the output in your XSL customization layer.

    2) If you create a phrase in docbook with a role, e.g.:

    <phrase role="somerole">some text</phrase>

    The stylesheets will generate a span with a class, e.g.:

    some text

    You can then associate that class with the styling you want in your
    css file. That has the added benefit of separating the styling from your
    source, in case you ever want to change it.

    3) If you are looking for a general means for passing through html to
    the output, the closest I can find is a description of how to insert
    html from an external file using a processing instruction. Here is a
    link to that description, which is from Bob Stayton's excellent book,
    DocBook XSL: The Complete Guide, which BTW, is essential reading if
    you'll be doing a lot with the DocBook stylesheets:

    http://www.sagehill.net/docbookxsl/InsertExtHtml.html

    I don't know of a way to include html in your DocBook file and have it
    passed through as output markup; the file inclusion described in the
    link above is the closest to that I could find. Maybe someone else on
    the list has a suggestion.

    Hope that helps.
    Dick Hamilton
    -----------------------------------------
    XML Press
    XML for Technical Communicators
    http://xmlpress.net



    On Wed, 2010-06-23 at 17:36 +0200, Fabien Tillier wrote:
    > Hi List.


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: docbook-apps-unsubscribe@lists.oasis-open.org
    For additional commands, e-mail: docbook-apps-help@lists.oasis-open.org






  • 6.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 08:27
    Still working on this...
    When I embed my custom element in a table, the template I have modified is not called, which explains why it doesn't work as expected.
    So in part 1, my template (see the end) is never called (for none of the three rtf sections), though it is called for part 2.









    <row>
    <entry role="Tab3">Assay Name</entry>
    <entry role="Tab3"><bibliomixed><rtf><![CDATA[
    Source
    ]]></rtf></bibliomixed>
    </entry>
    <entry role="Tab3"><rtf><![CDATA[
    Ligand
    ]]></rtf>
    </entry>
    </row>

    <biblioentry id="179">
    <indexterm>
    <primary>Bibl.179</primary>
    <rtf><![CDATA[
    Source
    ]]></rtf>
    </indexterm>


    Here is the template
    <xsl:template match="d:rtf ">
    -1-
    </xsl:template>
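
One way to narrow this down is to make the template report each call. The following is only a debugging sketch, not a diagnosis: it reuses the import path and the d: prefix from the stylesheet posted earlier in the thread, and the extra no-namespace match is a guess in case the element is not in the namespace the prefix expects.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Debugging sketch; the import path and the d: prefix follow the
     stylesheet posted earlier in this thread. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="http://docbook.org/ns/docbook"
    exclude-result-prefixes="d">

  <xsl:import href="docBook/xsl/html/docbook.xsl"/>

  <!-- Match rtf both in the DocBook namespace and in no namespace
       (the second alternative is an assumption, added for testing). -->
  <xsl:template match="d:rtf | rtf">
    <!-- Report each call on the processor's message output. -->
    <xsl:message>
      <xsl:text>rtf template called; parent is </xsl:text>
      <xsl:value-of select="name(..)"/>
    </xsl:message>
    <!-- "." is the full string value of the element; in XSLT 1.0,
         select="text()" outputs only the first text node child. -->
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

If no message appears for the table cells, the rtf elements there are never reached by apply-templates in the default mode. If the message does appear but the markup is still escaped, the problem is more likely in how the output is serialized; note that XSLT processors are not required to support disable-output-escaping.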

    Can someone please help me find out why my template is not called within the table?
    Thank you so much.

    Regards,
    Fabien







  • 7.  Re: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 08:49
    On Fri, 25 Jun 2010 10:26:31 +0200
    "Fabien Tillier" <f.tillier@cerep.fr> wrote:

    > Still on with my attempts....
    > When I embed my specific tag in a table, the template I have modified
    > is not called. Which explains why it doesn't work as expected....
    > Thus, in part 1, my template (see the end) is never called (in none
    > of the three different rtf sections), though it is for part 2

    That isn't valid DocBook, Fabien?
    Just what are you trying to do, please?
    Perhaps we could help if we knew what you were trying to achieve?

    regards

    --
    Dave Pawson
    XSLT XSL-FO FAQ.
    http://www.dpawson.co.uk



  • 8.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 09:18
    Hi Dave,
    I think everything is explained in the previous mails in this thread.
    In short, I have some RTF strings that I get from a database and convert to HTML so that I can put them in a CDATA section in the DocBook XML.
    When I transform the complete DocBook document to HTML using the stylesheets, it works for CDATA embedded in a bibliographic section, but not in a table. So I am trying to figure out why, because I haven't found a better way to handle that stupid RTF string.
    As mentioned, my recent experiments seem to show (I can't be sure, as I am a beginner with these techniques) that the dummy template from my previous mail is not used when converting to HTML (I have a custom XSL sheet that passes the work to docbook.xsl after some customizations).
    I am really trying hard to figure out what I am doing wrong, and I have already spent a couple of days on it, which is making me a bit stressed... (as it is something I have to do)

    Best regards,
    Fabien

    PS: I don't know how to send a link to a threaded view of the topic; otherwise I would have done so.
    PPS: the real template, which works within bibliographic references, is
    <xsl:template match="d:rtf">
    <xsl:value-of select="text()" disable-output-escaping="yes"/>
    </xsl:template>




  • 9.  Re: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 11:28
    On Fri, 25 Jun 2010 11:18:18 +0200
    "Fabien Tillier" <f.tillier@cerep.fr> wrote:

    > Hi Dave
    > I think everything is explained in the previous mails on that same
    > thread. To make things short, I have some RTF strings that I get from
    > a database, then transform as HTML so that I can set them as a CDATA
    > field in the DocBook XML.

    which I *think* is not what you want? You want markup 'added' somewhere
    in the process of generating HTML?


    > When I try to get the complete docbook
    > document to HTML using the stylesheets, it works for the embedded
    > CDATA in a bibliographic section, but not when in a table.

    In my experience CDATA sections rarely help with processing xml
    when you want markup in the output.

    Is it feasible to 'drop in' a processing instruction,
    then when processing through to html, add the PI processing
    to insert the required 'markup'?

    that is only if the context is needed. If you know what
    you want added, you could simply add it at the HTML processing
    stage as 'boilerplate' html?
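
The processing-instruction idea above could look roughly like the sketch below. It is an illustration under assumptions: "rawhtml" is a made-up PI target that the DocBook stylesheets do not define, and the import path is the one from the stylesheet posted earlier in the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: "rawhtml" is a hypothetical PI target. In the source
     document it would be used like this:
       <entry role="Tab3"><?rawhtml <span class="rtf">Source</span>?></entry>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="docBook/xsl/html/docbook.xsl"/>

  <!-- Copy the PI's data to the output without escaping it. -->
  <xsl:template match="processing-instruction('rawhtml')">
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

The data of a processing instruction may contain "<" and "&" but not the sequence "?>", so the HTML string would have to be checked for that. The template also depends on the processor honoring disable-output-escaping, which is an optional feature.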

    HTH

    regards

    --
    Dave Pawson
    XSLT XSL-FO FAQ.
    http://www.dpawson.co.uk



  • 10.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 11:51
    Thanks for answering, Dave.
    I do want the HTML provided within a CDATA section to be output as-is by the transformation (at least when transforming to HTML). I may have been unclear: I convert RTF to HTML, put the HTML content into a CDATA section in the XML, then transform the complete XML to HTML using the DocBook XSL.
    I don't follow what you mean by adding a processing instruction (lack of knowledge on my side). I really don't need to transform the content of the CDATA section; I just want it as it is (as happens inside a biblioentry).
    To add boilerplate HTML, the only means I have seen use a file, which doesn't fit my case, as I have the HTML string directly. I could create a file and then include it, but I have lots of these and it would probably be inefficient. On the plus side, it would probably also work for PDF output (I haven't thought about this yet).

    Any (more) thoughts?
    Regards,
    Fabien









  • 11.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 14:32
    OK, so I have changed my RTF to HTML conversion so that it strips all HTML tags (so I am losing some formatting), and it works better (I don't have those CDATA things anymore), except that encoded characters (like &alpha;) show up as text (which is perfectly normal) instead of being displayed by the browser as the character they stand for (here an alpha symbol).
    I assume I am missing something like a transformation from &alpha; to α.
    Is it something that has to be implemented specifically (and how) in a customization template, or is there a magical option that can be used?
    Thanks a lot in advance.
    Best regards,
    Fabien






  • 12.  Re: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 14:54
    On Fri, 25 Jun 2010 16:31:51 +0200
    "Fabien Tillier" <f.tillier@cerep.fr> wrote:

    > Ok, so I have changed my RTF to HTML conversion stuff so that it
    > strips all HTML tags (so I am losing some formatting), and it works
    > better (I don't have those CDATA things anymore), except that I have
    > encoded characters (like &alpha;)

    You need to check your character level encoding throughout,
    end to end.
    Once it's in XML, & => &amp; etc.
    If you want alpha, then use a Unicode numeric character reference,
    e.g. & # x 0 3 b 1 ; (spaced in case it is messed up)

    > that are showing up (which is
    > perfectly normal) but are not displayed by the browser as their
    > entity counterpart (here an alpha symbol). I assume I am missing
    > something like a transformation from &alpha; to α Is it
    > something that has to be implemented specifically (and how) in a
    > customization template or is there a magical option that can be
    > used ? Thanks a lot in advance Best regards, Fabien

    Once in XML, that alpha as a character reference is just as valid as any
    other character.
    Prior to that, it's up to you.
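
To make the options concrete, here is a small sketch of three ways an alpha can be written in an XML document. The para element is just an example container; the point is that XML itself predefines only amp, lt, gt, apos and quot, so a named entity such as alpha has to be declared before it can be used.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: three ways to get a Greek alpha into an XML document. -->
<!DOCTYPE para [
  <!-- Only needed for the named form below. -->
  <!ENTITY alpha "&#x03B1;">
]>
<para xmlns="http://docbook.org/ns/docbook">
  Numeric character reference: &#x03B1;
  Entity declared in the internal subset above: &alpha;
  Literal character (file saved as UTF-8): α
</para>
```

After parsing, all three forms are the same character (U+03B1) as far as the stylesheet is concerned.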

    HTH

    regards

    --
    Dave Pawson
    XSLT XSL-FO FAQ.
    http://www.dpawson.co.uk



  • 13.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 15:23
    Well...
    I am using the Apache Commons Lang package to encode the special characters (the ones remaining after the RTF conversion, like µ). It encodes 'µ' as '&micro;'. If I want a Unicode hex encoding, I have to stop using it and write my own...
    Unless there is another package I can use?
    As I understand it, &micro; is valid HTML but not XML, which is why it is further encoded by the XSLT to '&amp;micro;', right?

    Thanks a lot for your help, Dave.

    Regards,
    Fabien







  • 14.  RE: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 15:29
    Stupid me! (in case others are too :) )
    I have to use StringEscapeUtils.escapeXml, not StringEscapeUtils.escapeHtml....
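
For anyone hitting the same thing, the difference shows up in the XML that gets written. The sketch below uses a made-up cell value, "10 µM < 1 mM", and assumes the string is placed in a table entry as in the earlier examples.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch with a made-up cell value, "10 µM < 1 mM".

     HTML-style escaping gives:  10 &micro;M &lt; 1 mM
     which an XML parser rejects unless a DTD declares "micro".
     If the ampersand is then escaped once more, the browser ends up
     displaying the text "&micro;" instead of the character.

     XML-style escaping only touches the characters XML reserves: -->
<entry xmlns="http://docbook.org/ns/docbook" role="Tab3">10 µM &lt; 1 mM</entry>
```

As far as I recall, older Commons Lang releases also turn characters above 0x7F into numeric references such as &#181; when escaping for XML; an XML parser accepts either form and yields the same character.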







  • 15.  Re: [docbook-apps] Question on CDATA in XML

    Posted 06-25-2010 15:55

    Fabien,

    FYI - "&micro;" is an XML entity, which is valid XML.

    It sounds like you are close to working through your problem. In following this thread, I wondered if you could strip the RTF into a text file and then use XInclude. It may get messy, since it sounds like you have a lot of bits of text, which would mean a lot of files, but that could work.
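
A rough sketch of the XInclude idea, assuming one text file per string; the file name below is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: "fragments/source-179.txt" is a hypothetical file name. -->
<entry xmlns="http://docbook.org/ns/docbook"
       xmlns:xi="http://www.w3.org/2001/XInclude"
       role="Tab3">
  <!-- parse="text" pulls the file in as character data, so any
       "<" or "&" in it is treated as text, not as markup. -->
  <xi:include href="fragments/source-179.txt" parse="text"/>
</entry>
```

XInclude has to be resolved by the parser or a separate step before the stylesheet sees the document, so it needs to be enabled in whatever toolchain runs the transformation.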

    Regards,
    Dean Nelson



