docbook-apps

Syntax highlighting

  • 1.  Syntax highlighting

    Posted 12-06-2013 21:54
    Hello world,

    A fair bit of effort in the DocBook stylesheets goes into parsing,
    decomposing, annotating, and recomposing program listings for the
    purpose of adding line numbers to them. There's also a bunch of work
    that goes into syntax highlighting them.

    Occasionally, this takes a *long* time.

    It appears that modern systems do this in the JavaScript layer on the
    client. They also use tables to render line numbers.

    I'm tempted to move in this direction. Comments?

    As long as I'm airing dirty laundry, I'm also tempted to abandon the
    XSL stylesheets and work instead on a purpose-built HTML+CSS rendering
    for printing.

    I should say that this note is particularly about the XSLT 2.0
    stylesheets that I've been working on, not the "standard" ones.

    Be seeing you,
    norm

    --
    Norman Walsh <ndw@nwalsh.com> | Resist the urge to hurry; it will
    http://www.oasis-open.org/docbook/ | only slow you down--Bruce Eckel
    Chair, DocBook Technical Committee |



  • 2.  Re: [docbook-apps] Syntax highlighting

    Posted 12-07-2013 07:38
    On 06/12/13 21:53, Norman Walsh wrote:
    > Hello world,
    >
    > A fair bit of effort in the DocBook stylesheets goes into parsing,
    > decomposing, annotating, and recomposing program listings for the
    > purpose of adding line numbers to them. There's also a bunch of work
    > that goes into syntax highlighting them.
    >
    > Occasionally, this takes a *long* time.
    >
    > It appears that modern systems do this in the JavaScript layer on the
    > client. They also use tables to render line numbers.
    >
    > I'm tempted to move in this direction. Comments?


    No problem with line numbers. I guess it's the outcome we want
    rather than how it is done?
    Would that work with insertions too? E.g. lines with (6) etc
    which are referred to in commentary?

    >
    > As long as I'm airing dirty laundry, I'm also tempted to abandon the
    > XSL stylesheets and work instead on a purpose-built HTML+CSS rendering
    > for printing.

    I can see you getting 80% with html and CSS...
    My penneth says you'll fall over on the 20% so I'm -1 on this.
    Do you believe you can do it?
    What of re-ordering, toc etc?

    >
    > I should say that this note is particularly about the XSLT 2.0
    > stylesheets that I've been working on, not the "standard" ones.
    >
    > Be seeing you,
    > norm
    >
    Understood.


    regards

    --
    Dave Pawson
    XSLT XSL-FO FAQ.
    http://www.dpawson.co.uk



  • 3.  Re: [docbook-apps] Syntax highlighting

    Posted 12-07-2013 21:26
    On 12/07/2013 01:38 AM, davep wrote:
    > On 06/12/13 21:53, Norman Walsh wrote:
    >> Hello world,
    >>
    >> A fair bit of effort in the DocBook stylesheets goes into parsing,
    >> decomposing, annotating, and recomposing program listings for the
    >> purpose of adding line numbers to them. There's also a bunch of work
    >> that goes into syntax highlighting them.
    >>
    >> Occasionally, this takes a *long* time.
    >>
    >> It appears that modern systems do this in the JavaScript layer on the
    >> client. They also use tables to render line numbers.

    We use the popular syntax highlighter:
    https://github.com/alexgorbatchev/SyntaxHighlighter

    But we had to adapt it to support callouts and markup inside the code
    sample. Without our modifications, the js highlighter would end up
    displaying the html markup from the inline markup and the callouts as if
    it were part of the code sample. Here's an example of our version:

    http://docs.rackspace.com/servers/api/v2/cs-devguide/content/curl_summary_xml.html

    On the downside, it adds weight to the page.
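    As a rough illustration of the problem described above (not the
    modified SyntaxHighlighter code itself), a highlighter can avoid
    showing inline markup as code by splitting the listing into plain
    text plus a list of tags with their text offsets, tokenizing only the
    text, and putting the tags back afterwards. The sketch below covers
    only the splitting step; the names MarkupSplitter and split_markup are
    hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class MarkupSplitter(HTMLParser):
    """Collect the text of a listing and remember where each tag sat."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.text_parts: List[str] = []
        self.tags: List[Tuple[int, str]] = []  # (offset into text, raw tag)
        self._offset = 0

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as written in the input.
        self.tags.append((self._offset, self.get_starttag_text()))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <img .../> are recorded once.
        self.tags.append((self._offset, self.get_starttag_text()))

    def handle_endtag(self, tag):
        self.tags.append((self._offset, f"</{tag}>"))

    def handle_data(self, data):
        self.text_parts.append(data)
        self._offset += len(data)


def split_markup(listing_html: str) -> Tuple[str, List[Tuple[int, str]]]:
    """Return (plain text, [(offset, tag), ...]) for an HTML code listing."""
    parser = MarkupSplitter()
    parser.feed(listing_html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.text_parts), parser.tags


if __name__ == "__main__":
    print(split_markup("print(<b>x</b>)"))
```

    Only the plain text would be handed to the tokenizer; re-inserting
    the recorded tags at their offsets is left out of this sketch.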

    >>
    >> I'm tempted to move in this direction. Comments?
    >
    >
    > No problem with line numbers. I guess it's the outcome we want
    > rather than how it is done?
    > Would that work with insertions too? E.g. lines with (6) etc
    > which are referred to in commentary?

    One caveat about using tables for line numbers. It generally makes it
    impossible to cut and paste the code sample without getting all the line
    numbers. One thing I like about the syntax highlighter above is that
    when you click on the code sample, the callouts go away and are not copied.

    Of course for pdf/print output, copy/paste is less of a concern.

    Regards,
    David



  • 4.  Re: [docbook-apps] Syntax highlighting

    Posted 12-09-2013 14:10
    On Sat, 2013-12-07 at 15:26 -0600, David Cramer wrote:
    > On 12/07/2013 01:38 AM, davep wrote:
    > > On 06/12/13 21:53, Norman Walsh wrote:
    > >> Hello world,
    > >>
    > >> A fair bit of effort in the DocBook stylesheets goes into parsing,
    > >> decomposing, annotating, and recomposing program listings for the
    > >> purpose of adding line numbers to them. There's also a bunch of work
    > >> that goes into syntax highlighting them.
    > >>
    > >> Occasionally, this takes a *long* time.
    > >>
    > >> It appears that modern systems do this in the JavaScript layer on the
    > >> client. They also use tables to render line numbers.
    >
    > We use the popular syntax highlighter:
    > https://github.com/alexgorbatchev/SyntaxHighlighter
    >
    > But we had to adapt it to support callouts and markup inside the code
    > sample. Without our modifications, the js highlighter would end up
    > displaying the html markup from the inline markup and the callouts as if
    > it were part of the code sample. Here's an example of our version:
    >
    > http://docs.rackspace.com/servers/api/v2/cs-devguide/content/curl_summary_xml.html
    >
    > On the downside, it adds weight to the page.

    Yelp uses jQuery.Syntax for syntax highlighting, which we've been pretty
    happy with. I've had no problems with it clobbering markup inside code
    blocks, which is something I deal with pretty often.

    http://www.codeotaku.com/projects/jquery-syntax/index.en

    Yelp does still do line numbers in XSLT though. Also:

    > One caveat about using tables for line numbers. It generally makes it
    > impossible to cut and paste the code sample without getting all the line
    > numbers. One thing I like about the syntax highlighter above is that
    > when you click on the code sample, the callouts go away and are not copied.

    Yelp puts the line numbers in a single pre that is float: left (or right
    in RTL) next to the code block. It's the best solution I've found.
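
    A minimal sketch of that layout, assuming nothing about Yelp's actual
    XSLT output: the numbers go into one pre of their own, floated beside
    the pre that holds the code, so the code can be selected without the
    numbers. The function name render_listing and the class names are
    hypothetical.

```python
from html import escape


def render_listing(code: str, start: int = 1, rtl: bool = False) -> str:
    """Return HTML with the line numbers in their own floated <pre>."""
    lines = code.splitlines()
    # One number per code line, all inside a single <pre>.
    numbers = "\n".join(str(n) for n in range(start, start + len(lines)))
    side = "right" if rtl else "left"
    return (
        '<div class="listing">'
        f'<pre class="linenumbers" style="float: {side}">{numbers}</pre>'
        f'<pre class="code">{escape(code)}</pre>'
        "</div>"
    )


if __name__ == "__main__":
    print(render_listing("x = 1\nprint(x)"))
```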

    --
    Shaun






  • 5.  Re: [docbook-apps] Syntax highlighting

    Posted 12-09-2013 16:39
    On 12/09/2013 08:10 AM, Shaun McCance wrote:
    > Yelp uses jQuery.Syntax for syntax highlighting, which we've been pretty
    > happy with. I've had no problems with it clobbering markup inside code
    > blocks, which is something I deal with pretty often.

    That looks interesting. I experimented by adding some markup to an
    example in the distribution (ex.python.html). Adding bold markup was ignored
    (isn't clobbered but doesn't make text bold either). When I added
    <img src="1.png"/> it broke the highlighting. Perhaps I'm doing it wrong?

    > http://www.codeotaku.com/projects/jquery-syntax/index.en
    >
    > Yelp does still do line numbers in XSLT though. Also:
    >
    >> One caveat about using tables for line numbers. It generally makes it
    >> impossible to cut and paste the code sample without getting all the line
    >> numbers. One thing I like about the syntax highlighter above is that
    >> when you click on the code sample, the callouts go away and are not copied.
    >
    > Yelp puts the line numbers in a single pre that is float: left (or right
    > in RTL) next to the code block. It's the best solution I've found.

    The jquery-syntax examples have line numbers (added by the js). Is there
    a reason not to use that functionality?

    David




  • 6.  Re: [docbook-apps] Syntax highlighting

    Posted 12-09-2013 17:20
    On Mon, 2013-12-09 at 10:38 -0600, David Cramer wrote:
    > On 12/09/2013 08:10 AM, Shaun McCance wrote:
    > > Yelp uses jQuery.Syntax for syntax highlighting, which we've been pretty
    > > happy with. I've had no problems with it clobbering markup inside code
    > > blocks, which is something I deal with pretty often.
    >
    > That looks interesting. I experimented by adding some markup to an
    > example in the distribution (ex.python.html). Adding bold markup was ignored
    > (isn't clobbered but doesn't make text bold either). When I added
    > <img src="1.png"/> it broke the highlighting. Perhaps I'm doing it wrong?

    I have to admit, I've never tested images. And I completely override all
    the CSS and just use jQuery.Syntax as a tokenizer. But I do use spans to
    highlight added text with no problems. YMMV.

    > > http://www.codeotaku.com/projects/jquery-syntax/index.en
    > >
    > > Yelp does still do line numbers in XSLT though. Also:
    > >
    > >> One caveat about using tables for line numbers. It generally makes it
    > >> impossible to cut and paste the code sample without getting all the line
    > >> numbers. One thing I like about the syntax highlighter above is that
    > >> when you click on the code sample, the callouts go away and are not copied.
    > >
    > > Yelp puts the line numbers in a single pre that is float: left (or right
    > > in RTL) next to the code block. It's the best solution I've found.
    >
    > The jquery-syntax examples have line numbers (added by the js). Is there
    > a reason not to use that functionality?

    I already had the XSLT for this in place. I didn't feel like figuring
    out how to support continuation/startinglinenumber. And I don't see that
    doing it in JS has any actual benefits. JS might be the right solution
    for others.
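
    For reference, the numbering logic behind continuation and
    startinglinenumber can be sketched as below. This is an illustration
    only, not the Yelp XSLT: it assumes an explicit startinglinenumber
    wins, that continuation="continues" picks up after the previous
    listing, and that anything else restarts at 1. The Listing class and
    first_line_numbers function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Listing:
    line_count: int
    continuation: str = "restarts"  # "restarts" or "continues"
    starting_line_number: Optional[int] = None


def first_line_numbers(listings: List[Listing]) -> List[int]:
    """Return the number shown on the first line of each listing."""
    starts: List[int] = []
    next_number = 1  # number that a continuing listing would start at
    for listing in listings:
        if listing.starting_line_number is not None:
            first = listing.starting_line_number
        elif listing.continuation == "continues":
            first = next_number
        else:
            first = 1
        starts.append(first)
        next_number = first + listing.line_count
    return starts


if __name__ == "__main__":
    print(first_line_numbers([Listing(3), Listing(2, "continues")]))
```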

    --
    Shaun







  • 7.  Re: [docbook-apps] Syntax highlighting

    Posted 12-07-2013 19:38


    On 12/6/2013 1:53 PM, Norman Walsh wrote:
    > ...
    > As long as I'm airing dirty laundry, I'm also tempted to abandon the
    > XSL stylesheets and work instead on a purpose-built HTML+CSS rendering
    > for printing.

    Hi Norm,

    What composition engines do you have in mind for rendering HTML+CSS to
    print?

    Bob Stayton
    Sagehill Enterprises
    bobs@sagehill.net



  • 8.  Re: [docbook-apps] Syntax highlighting

    Posted 12-07-2013 20:35
    On 6.12.2013 22:53, Norman Walsh wrote:
    > It appears that modern systems do this in the JavaScript layer on the
    > client. They also use tables to render line numbers.
    >
    > I'm tempted to move in this direction. Comments?

    I have used both approaches. The problem with the JavaScript one is that it
    doesn't work for print (neither XSL-FO nor HTML/CSS composition engines
    support JavaScript) and it sometimes introduces a delay in rendering. On the
    other hand, JavaScript syntax highlighters are under more active
    development than the highlighting extensions for XSLT.

    > As long as I'm airing dirty laundry, I'm also tempted to abandon the
    > XSL stylesheets and work instead on a purpose-built HTML+CSS rendering
    > for printing.

    Do you mean completely removing the existing incomplete FO code from the
    project?

    I think that we can keep the code there in case someone has the time
    and interest to improve it further.

    With HTML+CSS composition there is one big problem: there is (at least
    to my knowledge) no free renderer. I'm aware only of PrinceXML and
    AntennaHouse.

    But HTML+CSS printing has gained some traction recently, so it is
    worthwhile to explore this area.

    Jirka

    --
    ------------------------------------------------------------------
    Jirka Kosek e-mail: jirka@kosek.cz http://xmlguru.cz
    ------------------------------------------------------------------
    Professional XML consulting and training services
    DocBook customization, custom XSLT/XSL-FO document processing
    ------------------------------------------------------------------
    OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
    ------------------------------------------------------------------
    Bringing you XML Prague conference http://xmlprague.cz
    ------------------------------------------------------------------




  • 9.  Re: [docbook-apps] Syntax highlighting

    Posted 12-07-2013 22:05

    > 7 dec 2013 kl. 21:35 skrev Jirka Kosek <jirka@kosek.cz>:
    >
    >> On 6.12.2013 22:53, Norman Walsh wrote:
    >> It appears that modern systems do this in the JavaScript layer on the
    >> client. They also use tables to render line numbers.
    >>
    >> I'm tempted to move in this direction. Comments?
    >
    > I have used both approaches. The problem with the JavaScript one is that it
    > doesn't work for print (neither XSL-FO nor HTML/CSS composition engines
    > support JavaScript) and it sometimes introduces a delay in rendering. On the
    > other hand, JavaScript syntax highlighters are under more active
    > development than the highlighting extensions for XSLT.
    >
    >> As long as I'm airing dirty laundry, I'm also tempted to abandon the
    >> XSL stylesheets and work instead on a purpose-built HTML+CSS rendering
    >> for printing.
    >
    > Do you mean completely removing the existing incomplete FO code from the
    > project?
    >
    > I think that we can keep the code there in case someone has the time
    > and interest to improve it further.
    >
    > With HTML+CSS composition there is one big problem: there is (at least
    > to my knowledge) no free renderer. I'm aware only of PrinceXML and
    > AntennaHouse.

    Actually, there is a renderer that can generate PDF from HTML with really good results: wkhtmltopdf (https://code.google.com/p/wkhtmltopdf/). It's basically a wrapper for the WebKit engine used by Safari, Chrome, etc. I'm not sure whether the WebKit they are using right now is bleeding edge. Anyway, it produces quite decent PDFs.

    I would definitely vote for the HTML route. Styling printable output with CSS would make the whole development process much easier and less time-consuming than it is today with XSL.

    /frank

    >
    > But HTML+CSS printing has gained some traction recently, so it is
    > worthwhile to explore this area.
    >
    > Jirka



  • 10.  Re: [docbook-apps] Syntax highlighting

    Posted 12-07-2013 22:22
    On 7.12.2013 23:05, Frank Arensmeier wrote:
    > Actually, there is a renderer that can generate PDF from HTML with really good results: wkhtmltopdf (https://code.google.com/p/wkhtmltopdf/). It's basically a wrapper for the WebKit engine used by Safari, Chrome, etc. I'm not sure whether the WebKit they are using right now is bleeding edge. Anyway, it produces quite decent PDFs.
    >
    > I would definitely vote for the HTML route. Styling printable output with CSS would make the whole development process much easier and less time-consuming than it is today with XSL.

    AFAIK web browsers, including those based on WebKit, don't support the
    CSS3 modules that add basic features necessary for print output, e.g.:

    https://bugs.webkit.org/show_bug.cgi?id=85062

    --
    ------------------------------------------------------------------
    Jirka Kosek e-mail: jirka@kosek.cz http://xmlguru.cz
    ------------------------------------------------------------------
    Professional XML consulting and training services
    DocBook customization, custom XSLT/XSL-FO document processing
    ------------------------------------------------------------------
    OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
    ------------------------------------------------------------------
    Bringing you XML Prague conference http://xmlprague.cz
    ------------------------------------------------------------------




  • 11.  Re: [docbook-apps] Syntax highlighting

    Posted 12-08-2013 03:58
    On 12/7/2013 5:05 PM, Frank Arensmeier wrote:
    > Actually, there is a renderer that is able to generate PDF from HTML with really good results.
    > Wkhtmltopdf (https://code.google.com/p/wkhtmltopdf/).

    There's also htmltolatex (with LaTeX to create the PDF):
    http://htmltolatex.sourceforge.net/
    I'm not sure how it handles character encodings; I would guess that it could be made to preserve
    UTF-8 (rather than substituting special LaTeX names), in which case you could use XeLaTeX to produce
    the PDF.
    --
    Mike Maxwell
    maxwell@umiacs.umd.edu
    "My definition of an interesting universe is
    one that has the capacity to study itself."
    --Stephen Eastmond



  • 12.  RE: [docbook-apps] Syntax highlighting

    Posted 12-08-2013 16:31
    On 2013-12-06 Norman Walsh wrote:
    >
    > I'm also tempted to abandon the XSL stylesheets and
    > work instead on a purpose-built HTML+CSS rendering
    > for printing.
    >

    While I don't plan to upgrade my generation workflow to the XSLT 2.0
    stylesheets in the near future, I am quite curious whether the proposed
    HTML+CSS approach can really cover all common needs:

    1. ToC and Index with page numbers
    2. Bookmarks
    3. Double-sided version (different recto/verso margins, header/footer
    content)
    4. Running headers and footers (differing among title, blank, and
    recto/verso pages)
    5. Absolute positioning of title page graphics or other page elements
    6. Using PDF format for images
    7. Change bars

    Currently I also post-process final PDF files utilizing an intermediate
    format (I can retrieve the page # and Y coord for the specific ID in this
    format).

    Thanks, Jan