docbook-apps

  • 1.  Unicode characters in epub

    Posted 01-31-2012 22:39
    Hello,

    I successfully generated an epub file with the epub stylesheets and tested
    it on various e-readers (or emulators of e-readers). I have some annoying
    problems with Unicode characters though and wonder what others do or
    recommend.

    1. My book is about C++. Unfortunately "C++" is not a word - so e-readers
    seem to break "C++" wherever they like. A line could end with "C+" or "C",
    and the plus sign(s) is on the next line. I turned "C++" into
    "C&#xFEFF;+&#xFEFF;+" (which is already crazy as I don't know how often I
    refer to C++ in my book). However this had some unfortunate side effects: If
    &#xFEFF; is used in the book title or titles which appear in the table of
    contents, the Sony Reader displays rectangles (not in the body text though).
    If I use &#xFEFF; somewhere else like in "--&#xFEFF;option" in the body text
    (to avoid that a command line option is broken after the double minus), the
    Sony Reader displays something like "--`option" (and still breaks after the
    double minus). I don't know whether this is only a problem with the Sony
    Reader. But if in doubt I prefer line breaks to having some readers see
    rectangles or other funny characters everywhere.

    2. Some e-readers like the Sony Reader and the Kobo Touch don't break long
    words. If you have a book about C++, you can have very long paths to header
    files or very long macros. The Kindle does the right thing and puts a line
    break into a word which you can't read anymore otherwise. I tried different
    CSS properties like word-wrap and overflow-wrap but to no avail. Is there
    any trick to make e-readers break words by all means if they are too long?

    3. I use a table with three columns in my book which is already difficult to
    display on a narrow e-reader. If there are some long words, e-readers can
    mess up completely (because of 2.). So I added &#xAD; here and there to
    insert soft hyphens. The Sony Reader, Kobo Touch and Adobe Digital Editions
    do break the words now where I put &#xAD; - but they don't display a hyphen!
    Adobe Digital Editions does display a hyphen in the table of contents if I
    add &#xAD; to a chapter title - although the chapter title doesn't need to
    be and isn't broken in the table of contents. Only the Kindle seems to do
    the right thing.

    My conclusion is that it's better not to try to beautify an epub with
    Unicode characters? I think I'll use &#xAD; only where it's absolutely
    required to break words (like in a table with three columns) because I know
    that some parts of the text will not be displayed at all otherwise.
    Elsewhere it's probably better to blame the e-reader? ;)

    Boris
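
A minimal Python sketch of automating the soft-hyphen workaround from points 2 and 3, rather than placing each U+00AD by hand. It is an illustration only: the function name `add_break_points`, the 20-character threshold, and the choice of separators are assumptions, not part of the DocBook stylesheets. Going by the reports above, readers are likely to differ on whether they show a hyphen at the break.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN

def add_break_points(text, min_length=20, break_char=SOFT_HYPHEN):
    """Insert break_char after '/', '_', '.' and ':' inside long tokens.

    Hypothetical helper: only whitespace-delimited tokens of at least
    min_length characters are touched, so ordinary words stay as they are.
    """
    def soften(match):
        token = match.group(0)
        if len(token) < min_length:
            return token
        # Zero-width match: just after a separator and just before a
        # non-separator character, so nothing is appended at the token's end
        # and a run such as "::" gets a single break point after it.
        return re.sub(r"(?<=[/_.:])(?=[^/_.:])", break_char, token)

    return re.sub(r"\S+", soften, text)


if __name__ == "__main__":
    path = "boost/asio/detail/impl/socket_ops.ipp"
    # repr() makes the otherwise invisible soft hyphens visible as \xad
    print(repr(add_break_points(path)))
```

The function is meant to run once over text nodes before packaging. It is not idempotent, so a second pass would add duplicate break characters.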





  • 2.  Re: [docbook-apps] Unicode characters in epub

    Posted 02-01-2012 13:21
    On Tue, January 31, 2012 10:39 pm, Boris Schäling wrote:
    ...
    > 1. My book is about C++. Unfortunately "C++" is not a word - so e-readers
    > seem to break "C++" wherever they like. A line could end with "C+" or "C",
    > and the plus sign(s) is on the next line. I turned "C++" into
    > "C&#xFEFF;+&#xFEFF;+" (which is already crazy as I don't know how often I
    > refer to C++ in my book). However this had some unfortunate side effects:
    > If
    > &#xFEFF; is used in the book title or titles which appear in the table of
    > contents, the Sony Reader displays rectangles (not in the body text
    > though).

    &#xFEFF; has a dual role as Zero Width No-Break Space and as the BOM.

    Unicode 3.2 added &#x2060;, WORD JOINER, which is just a word joiner. [1]

    The Unicode Standard says that you are supposed to use &#x2060; in new
    text, and that applications are supposed to support word joining with
    either &#x2060; or &#xFEFF;.

    Maybe, just maybe, your EPUB readers will do better with &#x2060; than
    they do with &#xFEFF;.

    Regards,


    Tony Graham tgraham@mentea.net
    Consultant http://www.mentea.net
    Mentea 13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
    -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
    XML, XSL-FO and XSLT consulting, training and programming

    [1] Page 5 (or 524) of http://www.unicode.org/versions/Unicode6.0.0/ch16.pdf
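
The switch from U+FEFF to U+2060 suggested above could be scripted instead of edited by hand. The following Python sketch assumes the book text is available as a decoded string. The helper names `prefer_word_joiner`, `glue` and `protect_tokens` are made up for illustration, and whether a given e-reader renders U+2060 correctly still has to be tested on the device.

```python
WORD_JOINER = "\u2060"  # U+2060 WORD JOINER
ZWNBSP = "\ufeff"       # U+FEFF ZERO WIDTH NO-BREAK SPACE, also used as BOM

def prefer_word_joiner(text):
    """Replace U+FEFF with U+2060, except for a leading BOM."""
    if text.startswith(ZWNBSP):
        # A U+FEFF at the very start is taken to be a byte order mark
        # and is left alone (an assumption of this sketch).
        return ZWNBSP + text[1:].replace(ZWNBSP, WORD_JOINER)
    return text.replace(ZWNBSP, WORD_JOINER)

def glue(token):
    """Put a word joiner between each pair of adjacent characters."""
    return WORD_JOINER.join(token)

def protect_tokens(text, tokens=("C++",)):
    """Replace each listed token with its glued form."""
    for token in tokens:
        text = text.replace(token, glue(token))
    return text


if __name__ == "__main__":
    # repr() shows the invisible joiners as \u2060
    print(repr(protect_tokens("A book about C++")))
```

Running `protect_tokens` over the source once avoids typing the character reference at every mention of C++, while `prefer_word_joiner` converts text that already contains U+FEFF.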




  • 3.  RE: [docbook-apps] Unicode characters in epub

    Posted 02-01-2012 23:21


    >