docbook-apps

  • 1.  Re: [docbook-apps] Stripping comments

    Posted 03-30-2007 17:30
    Here's a quick perl solution that doesn't read everything into
    memory and seems to handle some of the edge cases. Try it out on a
    few things to verify that everything is okay before completely
    trusting it, though. :)

    Copy the lines between '------------' into a file (say
    strip_xml_comments.pl).

    (if on Unix do this step first)
    chmod 755 strip_xml_comments.pl

    Make a backup copy of any and all files that you'll be using. (The
    script should work fine as is, but it's *MUCH* better to be safe than
    sorry. :)

    Now you should be able to run the script on a copy of your input file.

    strip_xml_comments.pl my_xml_input_file.xml

    The script will make a backup copy of its own with '.orig' at the
    end of the name. (Please don't just rely on this feature -- make your
    own backup.)

    Verify that everything looks okay and integrate it into your
    application stream.

    Here's the script:

    ----------------------
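    #!/usr/bin/perl -w -i.orig

    #
    # NB: Delete the '.orig' portion if backup copies are not desired
    #

    #
    # Delete XML comments.
    #


    #
    # Go through every file given on the command line
    #
    $in_comment= 0;
    while( <> ) {

        #
        # Inside a multi-line comment: drop everything up to the
        # closing delimiter, or ignore the whole line if the comment
        # does not end here
        #
        if( $in_comment ) {
            next unless s/.*?-->//;
            $in_comment= 0;
        }

        #
        # Match inline comments
        #
        s {
            <!--    # Match the opening delimiter.
            .*?     # Match a minimal number of characters.
            -->     # Match the closing delimiter.
        } []gsx;

        #
        # Match multi-line comments: an opening delimiter left on the
        # line at this point has no closing delimiter on the same line
        #
        if( s/<!--.*// ) {
            $in_comment= 1;
        }

        print; # Print everything left on the current line
    }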

    ----------------------

    Note that the code is a simple modification of one of the examples
    from the perlre man page (http://perldoc.perl.org/perlre.html).
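
    For trying the pattern out in isolation, here is a minimal sketch
    (not part of the script above). The helper name strip_comments and
    the sample text are made up for illustration. Unlike the script, it
    holds the whole document in one string, so the /s modifier lets a
    single substitution cover comments that span several lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove XML comments from a document that is
# held entirely in one string.
sub strip_comments {
    my ($xml) = @_;
    $xml =~ s {
        <!--    # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        -->     # Match the closing delimiter.
    } []gsx;
    return $xml;
}

# Made-up sample: one inline comment and one two-line comment.
my $sample = "<para>keep<!-- secret --> this</para>\n"
           . "<!-- spans\n   two lines -->\n"
           . "<para>and this</para>\n";

print strip_comments($sample);
```

    Like the script, this is a plain text substitution rather than an
    XML parse, so anything that merely looks like a comment (inside a
    CDATA section, for example) would be removed as well.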


    Hopefully this will suit your purposes!


    kells

    >
    > ----- Original Message -----
    > From: "Paul Moloney" <paul_moloney@hotmail.com>
    > To: <docbook-apps@lists.oasis-open.org>
    > Sent: Thursday, March 29, 2007 6:45 AM
    > Subject: [docbook-apps] Stripping comments
    >
    >
    > >
    > > One task I have is to package our source XML files for use by
    > > integrators; one thing I'd like to do is first strip the comments
    > > from these files, as they may contain sensitive information.
    > >
    > > I was thinking that this could be done by processing each file
    > > through Saxon using a stylesheet which strips out comments and
    > > outputs the XML again. But rather than risk reinventing the wheel,
    > > I was wondering if anyone out there has implemented a DocBook
    > > comment stripper in their build process?
    > >
    > > Thanks,
    > >
    > > P.
    > > --
    > > View this message in context:
    > > http://www.nabble.com/Stripping-comments-tf3486783.html#a9734912
    > > Sent from the docbook apps mailing list archive at Nabble.com.
    >



  • 2.  Re: [docbook-apps] Stripping comments

    Posted 04-02-2007 09:46

    Thanks for the help; will try this out and let you know how it goes...

    P.