docbook-apps

  • 1.  export from MS Excel to DocBook?

    Posted 08-16-2012 07:24
    Hi all,

    I've got a large number of text paragraphs in an Excel spreadsheet.

    I am thinking about converting the contents of the spreadsheet cells to a DocBook file using the XML capabilities of the newer Excel versions.

    My spreadsheet looks like this:

    Heading 1 | | | |
    | Text a | | |
    | Text b | | |
    | | Heading 2 | |
    | | | Text c |
    | | | Text d |

    This should be converted into something like this:




    <para> Text a</para>
    <para> Text b</para>


    <para> Text a</para>
    <para> Text b</para>



    Has anybody any experience with this? Any pointers?

    Best regards

    Robert Bürgel
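
    A minimal sketch of the conversion described in this post, written in Python. It rests on assumptions that the post does not confirm: a cell in an even column (A, C, ...) is treated as a heading that opens a nested section, a cell in an odd column (B, D, ...) is treated as a paragraph, and the element names article, section and title are guesses. The function name rows_to_docbook is hypothetical, and the result is not validated against the DocBook schema.

```python
from xml.etree import ElementTree as ET


def rows_to_docbook(rows):
    """Build nested <section> elements from column-indented rows.

    Assumption: a non-empty cell in an even column (0, 2, ...) is a heading
    that opens a section at depth column // 2; a non-empty cell in an odd
    column is a paragraph inside the innermost open section.
    """
    root = ET.Element("article")
    stack = [root]  # stack[d + 1] is the currently open section at depth d
    for row in rows:
        for col, text in enumerate(row):
            text = text.strip()
            if not text:
                continue
            if col % 2 == 0:
                depth = col // 2
                del stack[depth + 1:]  # close sections at this depth or deeper
                section = ET.SubElement(stack[-1], "section")
                ET.SubElement(section, "title").text = text
                stack.append(section)
            else:
                ET.SubElement(stack[-1], "para").text = text
    return root


if __name__ == "__main__":
    rows = [
        ["Heading 1", "", "", ""],
        ["", "Text a", "", ""],
        ["", "Text b", "", ""],
        ["", "", "Heading 2", ""],
        ["", "", "", "Text c"],
        ["", "", "", "Text d"],
    ]
    print(ET.tostring(rows_to_docbook(rows), encoding="unicode"))
```

    The stack keeps one open section per nesting level, so a later heading in column A would close every deeper section before opening a new one.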



  • 2.  Re: [docbook-apps] export from MS Excel to DocBook?

    Posted 08-16-2012 08:02
    On 16.8.2012 9:23, Robert.Buergel@bmw.de wrote:

    > Has anybody any experience with this? Any pointers?

    XSLT 2.0 has a powerful grouping instruction, xsl:for-each-group. With
    that instruction and the Excel file saved as .xlsx, it's fairly easy to
    produce the output you want. Also, Saxon 9 (an XSLT 2.0 implementation) can
    read the content of .xlsx files directly; there is no need to unpack them
    first. Simply use something like

    doc('jar:table.xslx!!/_rels/.rels')

    to access parts of the XLSX file.

    Jirka

    --
    ------------------------------------------------------------------
    Jirka Kosek e-mail: jirka@kosek.cz http://xmlguru.cz
    ------------------------------------------------------------------
    Professional XML consulting and training services
    DocBook customization, custom XSLT/XSL-FO document processing
    ------------------------------------------------------------------
    OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
    ------------------------------------------------------------------
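
    For readers who are not using Saxon: an .xlsx file is a ZIP package, so a part such as _rels/.rels can also be read with any ZIP library. A minimal sketch in Python using the standard zipfile module; the helper names read_part and list_parts are hypothetical and chosen only for illustration.

```python
import zipfile


def list_parts(xlsx):
    """Return the names of all parts stored in an .xlsx (ZIP) package."""
    with zipfile.ZipFile(xlsx) as package:
        return package.namelist()


def read_part(xlsx, part_name):
    """Return the raw bytes of one part, for example '_rels/.rels'."""
    with zipfile.ZipFile(xlsx) as package:
        return package.read(part_name)
```

    A call such as read_part("table.xlsx", "_rels/.rels") returns the XML of that part as bytes, which can then be handed to an XML parser.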




  • 3.  Re: [docbook-apps] export from MS Excel to DocBook?

    Posted 08-16-2012 10:08
    On 16.08.2012, at 10:02, Jirka Kosek wrote:

    > doc('jar:table.xslx!!/_rels/.rels')

    Why are there two '!' in this Jar URL? Is that a special mode?

    -Christian

    --
    Christian Roth * Phone: +49 (0)89 89 04 32 95
    infinity-loop GmbH * Neideckstr. 25 * 81249 München * Germany
    HRB 136 783 (AG München) * Geschäftsführer: Dr. Stefan Hermann
    Web: http://www.infinity-loop.de








  • 4.  Re: [docbook-apps] export from MS Excel to DocBook?

    Posted 08-16-2012 10:43
    On 16.8.2012 12:07, Christian Roth wrote:
    > On 16.08.2012, at 10:02, Jirka Kosek wrote:
    >
    >> doc('jar:table.xslx!!/_rels/.rels')
    >
    > Why are there two '!' in this Jar URL? Is that a special mode?

    Sorry, it should be just one. Silly typo.

    --
    ------------------------------------------------------------------
    Jirka Kosek e-mail: jirka@kosek.cz http://xmlguru.cz
    ------------------------------------------------------------------
    Professional XML consulting and training services
    DocBook customization, custom XSLT/XSL-FO document processing
    ------------------------------------------------------------------
    OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
    ------------------------------------------------------------------




  • 5.  RE: [docbook-apps] export from MS Excel to DocBook?

    Posted 08-16-2012 15:07

    Not knowing about the xlsx (sic) unpacking facility in Saxon, and having had no success trying to automate Excel internally, I've recently written a simple C++ program to take an Excel spreadsheet file and convert it to XML. From there I'm using XSL to generate the XML I really want.

    As it's a program I've written for work I can't publish it, though it is really quite simple. Access Excel using COM. Scan all worksheets, scan all rows, scan all columns, and write the text in the cells to XML using MSXML. Took about an afternoon to get it working.

    Maybe now I'll see if I can get anything useful from the xlsx file directly using Saxon....

    Unhelpfully,
    Richard.






  • 6.  Re: [docbook-apps] export from MS Excel to DocBook?

    Posted 08-16-2012 19:48
    On 16.8.2012 17:06, Kerry, Richard wrote:

    > Not knowing about the xlsx (sic) unpacking facility in Saxon, and
    > having had no success trying to automate Excel internally, I've
    > recently written a simple C++ program to take an Excel spreadsheet
    > file and convert it to XML. From there I'm using XSL to generate the
    > XML I really want.
    >
    > As it's a program I've written for work I can't publish it, though it
    > is really quite simple. Access Excel using COM. Scan all
    > worksheets, scan all rows, scan all columns, and write the text in
    > the cells to XML using MSXML. Took about an afternoon to get it
    > working.

    Today I would stay away from COM if possible. You need Excel installed in
    order to use it (which can be a problem in a server environment), and it
    is quite slow on large documents.

    > Maybe now I'll see if I can get anything useful from the xlsx file
    > directly using Saxon....

    Don't get discouraged by a first inspection of the OOXML internals. The
    format is pretty convoluted, but once you understand the principles it is
    pretty easy to process if you just need to extract some data or
    automatically fill some data into an existing template.

    Jirka
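
    To illustrate the kind of extraction mentioned above, here is a hedged sketch in Python that pulls string cells out of a worksheet part. It assumes the worksheet lives at xl/worksheets/sheet1.xml (the real location is given by the workbook relationships), that rows and cells carry the r attribute as Excel writes them, and that only shared-string cells matter; numbers and inline strings are skipped. The names read_cells and column_index are hypothetical.

```python
import re
import zipfile
from xml.etree import ElementTree as ET

# Main SpreadsheetML namespace used by worksheet and shared-string parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def column_index(cell_ref):
    """Convert a cell reference such as 'C5' to a zero-based column index (2)."""
    letters = re.match(r"[A-Z]+", cell_ref).group(0)
    index = 0
    for letter in letters:
        index = index * 26 + (ord(letter) - ord("A") + 1)
    return index - 1


def read_cells(xlsx, sheet_part="xl/worksheets/sheet1.xml"):
    """Yield (row_number, column_index, text) for each shared-string cell."""
    with zipfile.ZipFile(xlsx) as package:
        shared = []
        if "xl/sharedStrings.xml" in package.namelist():
            sst = ET.fromstring(package.read("xl/sharedStrings.xml"))
            for si in sst.findall("m:si", NS):
                # A string item holds one <t>, or several inside rich-text runs.
                texts = si.iter("{%s}t" % NS["m"])
                shared.append("".join(t.text or "" for t in texts))
        sheet = ET.fromstring(package.read(sheet_part))
    for row in sheet.iterfind("m:sheetData/m:row", NS):
        for cell in row.findall("m:c", NS):
            value = cell.find("m:v", NS)
            # t="s" marks a cell whose <v> is an index into the shared strings.
            if cell.get("t") == "s" and value is not None:
                yield (int(row.get("r")),
                       column_index(cell.get("r")),
                       shared[int(value.text)])
```

    Cell text is stored once in the shared-strings part and referenced by index from the worksheet, which is the main principle to grasp before the rest of the format starts to make sense.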





  • 7.  RE: [docbook-apps] export from MS Excel to DocBook?

    Posted 08-16-2012 15:31
    I did something like this recently to take some spreadsheets (or rather tab-delimited data from them) and format the data in LaTeX. I'm not a programmer, so my skills at doing something like this are pretty crude. I used this bit of Lua code to pull data from a tab-delimited text file produced by Excel and read it into a Lua table. In your case, you could use the LuaXML library to write data from the table to a DocBook file as you see fit.

    -- Split a string on a Lua pattern; returns (or appends to) a table of fields.
    function string:split( inSplitPattern, outResults )
      outResults = outResults or { }
      local theStart = 1
      local theSplitStart, theSplitEnd = string.find( self, inSplitPattern, theStart )
      while theSplitStart do
        table.insert( outResults, string.sub( self, theStart, theSplitStart-1 ) )
        theStart = theSplitEnd + 1
        theSplitStart, theSplitEnd = string.find( self, inSplitPattern, theStart )
      end
      -- whatever follows the last separator is the final field
      table.insert( outResults, string.sub( self, theStart ) )
      return outResults
    end

    -- table to hold data: one entry per line, each a table of tab-separated fields
    local data = {}

    -- arg[2] is the second command-line argument, as in the original post;
    -- use arg[1] if the file name is passed as the first argument
    local file = assert(io.open(arg[2], "r"), "Error reading file")

    -- file:lines() stops at end of file, so an empty file is handled too
    for line in file:lines() do
      table.insert(data, line:split("\t"))
    end
    file:close()

    Again, not elegant, but it worked out for me.

    -David
    <dgoss@mueller-inc.com>
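
    For comparison, the same tab-delimited approach can be sketched in Python with the standard csv module, which also copes with cells that Excel may wrap in double quotes. The function name read_tab_delimited is hypothetical, and the default encoding is an assumption: the actual encoding depends on how the file was saved from Excel.

```python
import csv


def read_tab_delimited(path, encoding="utf-8"):
    """Read a tab-delimited text export into a list of rows (lists of strings)."""
    # newline="" lets the csv module handle line breaks inside quoted cells.
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.reader(handle, delimiter="\t"))
```

    Each row comes back as a list with one string per column, empty cells included, so the column position of a heading or paragraph is preserved. The rows can then be written out as DocBook elements with any XML library.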