docbook-apps

  • 1.  UI strings vs manual strings ?

    Posted 12-05-2022 23:06
    What's the best way in a DocBook centered process to ensure that the list of terms used in a software UI is (semi-automatically?) taken into account in the DocBook sources that describe that software?

    Problem at hand:

    - a Java application with ~2k UI strings (not all user-facing), in a Bundle.properties file
    - a ~80K-word DocBook manual

    It is not trivial to keep track of the whole string set (searches, etc.)

    Also, the l10n process takes place on the DocBook sources, not on the HTML output, so tricks like <link linkend="..." endterm="..."/> don't work because translators don't see the target terms.

    I'm left with having to rewrite the strings explicitly and that's a pain, and also adds risks of mistakes in translations.

    --
    Jean-Christophe Helary @jchelary@emacs.ch
    https://traductaire-libre.org
    https://mac4translators.blogspot.com
    https://sr.ht/~brandelune/omegat-as-a-book/




  • 2.  Re: [docbook-apps] UI strings vs manual strings ?

    Posted 12-06-2022 12:45
    On 05/12/2022 23:05, Jean-Christophe Helary wrote:
    > What's the best way in a DocBook centered process to ensure that the
    > list of terms used in a software UI is (semi-automatically?) taken
    > into account in the DocBook sources that describe that software?

    I haven't had to do this, but since no-one else has responded yet...

    > Problem at hand:
    >
    > - a Java application with ~2k UI strings (not all users facing), in
    > a Bundle.properties file

    Java also has an XML format for properties files.

    > - a ~80K words DocBook manual
    >
    > It is not trivial to keep track of the whole string set (searches,
    > etc.)
    >
    > Also, the l10n process takes place on the DocBook sources, not on
    > the HTML output, so tricks like <link linkend="..." endterm="..."/> don't work
    > because translators don't see the target terms.

    Before translation, replace each <link> with the replacement text from
    the XML properties file wrapped in a well-known element that still
    carries the identifier for the properties file entry.

    After translation, if necessary, convert the well-known elements back
    into <link> and also do something to handle the strings that have been
    translated differently in different places.

    Once you have the properties file for a second language, you could
    insert the translated strings in place of <link> when preparing for
    translation. Alternatively, or as well, you could set up your
    computer-aided translation tool to not translate the well-known elements
    for the strings and insert the translated strings after everything else
    is translated.

    > I'm left with having to rewrite the strings explicitly and that's a
    > pain, and also adds risks of mistakes in translations.

    The more that you can automate, the better.

    Regards,


    Tony Graham.
    --
    Senior Architect
    XML Division
    Antenna House, Inc.
    ----
    Skerries, Ireland
    tgraham@antenna.co.jp



  • 3.  Re: [docbook-apps] UI strings vs manual strings ?

    Posted 12-06-2022 13:05
    Hello,
    I would transform bundle.properties into a DocBook document (article, book, section, whatever).
    Each line of the file would correspond to something like:
    <simpara><guilabel xml:id="messageId">My message</guilabel></simpara>
    Wrapping each guilabel in its own simpara is not strictly necessary: it is just there so that the file can be read as a DocBook document.

    In the document, you include the message, something like:
    <para>You should see <xi:include href="bundle.properties.xml" xpointer="messageId"/> after clicking on the button.</para>
    The French, English, or German version of the document will take advantage of the corresponding translated version of bundle.properties.xml.

    As long as no message ID starts with a digit (xml:id must be an NCName), you are OK.
    With an XSLT 2.0 processor, it might even be possible to transform bundle.properties into XML directly.
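
    A minimal Python sketch of the conversion described above (Python instead of XSLT, and not code posted in this thread). It assumes a simplified .properties syntax: key=value or key:value lines, # and ! comments, no line continuations and no \uXXXX escapes. The function names, the wrapping section and its "ui-strings" id are hypothetical, and the NCName test is an ASCII-only approximation.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# ASCII-only approximation of the NCName rule that xml:id values must follow.
NCNAME = re.compile(r"^[A-Za-z_][A-Za-z0-9._-]*$")
# Simplified entry syntax: key, then "=" or ":", then the value.
ENTRY = re.compile(r"([^=:\s]+)\s*[=:]\s*(.*)")


def parse_properties(text):
    """Parse a simplified subset of the Java .properties syntax."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue  # blank line or comment
        match = ENTRY.match(line)
        if match:
            entries[match.group(1)] = match.group(2)
    return entries


def to_docbook(entries):
    """Emit one <simpara><guilabel xml:id="key">value</guilabel></simpara> per entry."""
    lines = [
        '<section xmlns="http://docbook.org/ns/docbook" version="5.0" xml:id="ui-strings">',
        "  <title>UI strings</title>",
    ]
    for key, value in entries.items():
        if not NCNAME.match(key):
            raise ValueError(f"key {key!r} cannot be used as an xml:id")
        lines.append(
            f"  <simpara><guilabel xml:id={quoteattr(key)}>{escape(value)}</guilabel></simpara>"
        )
    lines.append("</section>")
    return "\n".join(lines)
```

    Running to_docbook(parse_properties(text)) once per language bundle would give one includable file per language. Keys that are not valid xml:id values raise an error rather than being silently renamed, so the IDs in the manual always match the IDs in the bundle.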

    Regards,
    Florimond


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: docbook-apps-unsubscribe@lists.oasis-open.org
    For additional commands, e-mail: docbook-apps-help@lists.oasis-open.org




  • 4.  RE: [docbook-apps] UI strings vs manual strings ?

    Posted 12-06-2022 20:14
    On 05/12/2022 23:05, Jean-Christophe Helary wrote:
    > What's the best way in a DocBook centered process to ensure that the
    > list of terms used in a software UI is (semi-automatically?) taken
    > into account in the DocBook sources that describe that software?

    In your document you can use <guilabel> and the other gui* elements,
    which indicate that the content must match the GUI label. You can then
    instruct the localization agency to follow this rule.
    But there is no way to avoid human error, so this still has to be checked
    manually, which is inefficient.

    I've seen an interesting approach where every guilabel had a dedicated
    attribute storing a termbase ID. While a guilabel value was present, it was
    just informative (for the author, to understand the context). The actual
    value was taken from the termbase when generating outputs.

    So if the GUI labels are linked to the same termbase, this system ensures
    your document will never diverge from the UI. Moreover, the same holds for
    all translated documents.

    The hardest step is consolidating such a termbase and establishing processes
    on DEV and DOC sides so both departments use the termbase as a single source
    of truth.

    Jan




  • 5.  Re: [docbook-apps] UI strings vs manual strings ?

    Posted 12-07-2022 00:10
    Thank you all for the replies so far.

    Let me reply in one mail.

    > On Dec 6, 2022, at 21:44, Tony Graham <tgraham@antenna.co.jp> wrote:
    >
    >> Problem at hand:
    >> - a Java application with ~2k UI strings (not all users facing), in
    >> a Bundle.properties file
    >
    > Java also has an XML format for properties files.

    Interesting. It could be part of a solution (esp. considering Florimond's reply).

    >> - a ~80K words DocBook manual
    >> It is not trivial to keep track of the whole string set (searches, etc.)
    >> Also, the l10n process takes place on the DocBook sources, not on
    >> the HTML output, so tricks like <link linkend="..." endterm="..."/> don't work because translators don't see the target terms.
    >
    > Before translation, replace each <link> with the replacement text from
    > the XML properties file wrapped in a well-known element that still
    > carries the identifier for the properties file entry.
    >
    > After translation, if necessary, convert the well-known elements back
    > into <link> and also do something to handle the strings that have been
    > translated differently in different places.

    The problem is that it's not possible to do that for a lot of languages. There are inflected forms that transform the text of the "endterm" part and the translation targets 3 dozen languages, including BiDi documents.

    That process would add another layer of transformation+verification.

    Or maybe I missed something?

    > Once you have the properties file for a second language, you could
    > insert the translated strings in place of <link> when preparing for
    > translation. Alternatively, or as well, you could set up your
    > computer-aided translation tool to not translate the well-known elements
    > for the strings and insert the translated strings after everything else
    > is translated.

    It looks feasible but only with a small set of target languages.

    >> I'm left with having to rewrite the strings explicitly and that's a pain, and also adds risks of mistakes in translations.
    >
    > The more that you can automate, the better.

    Hence the question ;-)


    > On Dec 6, 2022, at 22:04, Alemps Florimond <ntuflorimond@yahoo.com> wrote:
    >
    > Hello,
    >
    > I would transform the bundle.properties in a document (article, book or section whatever)
    > Each line of the file corresponds to something like:
    > <simpara><guilabel xml:id="messageId">My message</guilabel></simpara>
    >
    > One element simpara for one guilabel is useless : it is just to make it readable in a DocBook parse.

    Interesting.

    Considering that Java properties can also be expressed as XML there could be some automation here.

    > In the document, you include the message - something like :
    > <para>You should see <xi:include href="bundle.properties.xml" xpointer="messageId"/> after clicking on the button.</para>
    >
    > The French, English, German version of the document will take advantage of the corresponding translated version of bundle.properties.xml

    Why only those 3 languages?

    My understanding of xi:include is that it is not required to be resolved before the actual documentation build process.

    Which means that the document to translate (and the way it is displayed in the tool) is actually

    > <para>You should see <arbitrary link shortcut> after clicking on the button.</para>

    Which is not different from what we have now with <link linkend="..." endterm="..."/>.

    > As far as no id message starts with a number (NC Name for xml:id) you are ok.
    > With an XSLT 2.0 processor, it might even be possible to transform the bundle.properties in XML.

    It looks like Java properties can be expressed as XML natively (see above) so there is something to explore here.

    > On Dec 7, 2022, at 5:13, Jan Tosovsky <j.tosovsky@email.cz> wrote:
    >
    > On 05/12/2022 23:05, Jean-Christophe Helary wrote:
    >> What's the best way in a DocBook centered process to ensure that the
    >> list of terms used in a software UI is (semi-automatically?) taken
    >> into account in the DocBook sources that describe that software?
    >
    > In your document you can use <guilabel> and other <gui* related elements
    > which can indicate the content must match the GUI label. You can then
    > instruct the localization agency to follow this rule.
    > But there is no way to avoid human error so this still has to be checked
    > manually which is inefficient.

    The problem is not instructions, the problem is to lower the burden of the translators by explicitly displaying the strings in the DocBook sources.

    Creating a normative glossary from the UI strings first could be an option, but the strings contain Windows/Linux mnemonic (&) characters, so we'd need to remove them to create that glossary, and that would add another step (which can be automated, I guess).
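
    A rough Python sketch of that extra step. The marker conventions are assumptions, not something confirmed in this thread: a single "&" is taken to mark the mnemonic character, "&&" to stand for a literal ampersand, and a trailing "(&F)" group to be the East Asian style of attaching a mnemonic. The function names are hypothetical.

```python
import re


def strip_mnemonic(label):
    """Return the label as a user would read it, without mnemonic markers."""
    # Assumed convention: a parenthesised mnemonic such as "ファイル(&F)".
    label = re.sub(r"\s*\(&\w\)", "", label)
    # Assumed convention: "&x" marks x as the mnemonic, "&&" is a literal "&".
    return re.sub(r"&(.)", r"\1", label)


def build_glossary(entries):
    """Map each bundle key to its cleaned label."""
    return {key: strip_mnemonic(value) for key, value in entries.items()}
```

    The cleaned values could then be exported as a glossary for the translation tool, keeping the bundle key next to each term so that a glossary entry can be traced back to its UI string.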

    Full disclosure: the manual is for OmegaT, a free software solution for translators that supports DocBook out of the box, and Java properties too. I am the project leader and also in charge of the manual. I did a close-to-full rewrite of it this summer/fall to prepare for our next release, but I know that the solution I chose (link linkend endterm) is not optimal, because the link contents/target is not available for the inflected modifications required in some languages. (I also happen to be a translation company, so I understand those issues quite well, but this was my first time on the DocBook authoring side; the last time I wrote the manual it was in HTML.)

    As mentioned above in the reply to Tony, the issue with some strings is that they must be explicitly available for translation because some languages need to modify them (grammatical inflections, etc.)

    > I've seen an interesting approach where any guilabel had a dedicated
    > attribute storing a termbase ID. While a guilabel value was present, it was
    > just informative (for the author, to understand the context). The actual
    > value was taken from the termbase during generating outputs.
    >
    > So if GUI labels are linked to the same termbase, this system ensures your
    > document will never diverge. Moreover, it is ensured also to all translated
    > documents.

    I'm not sure I understand how the termbase is linked to the document.

    > The hardest step is consolidating such a termbase and establishing processes
    > on DEV and DOC sides so both departments use the termbase as a single source
    > of truth.

    :-)







  • 6.  Re: [docbook-apps] UI strings vs manual strings ?

    Posted 12-13-2022 10:28
    What result are you looking for?

    Are you treating one language (say, English) as the main language (which
    has empty elements for value lookups) and the other languages as end
    products (which have all text filled in), where you'd use OmegaT's
    translation memory to keep translations consistent across revisions?

    Or do you want the other languages to be structurally equivalent to the
    main version (apart from inline elements moved around because of
    sentence structure), where elements containing text are turned back into
    empty elements?

    Regards,


    Tony Graham.



  • 7.  Re: [docbook-apps] UI strings vs manual strings ?

    Posted 12-16-2022 13:32


    > On Dec 13, 2022, at 19:28, Tony Graham <tgraham@antenna.co.jp> wrote:
    >
    > What result are you looking for?

    I am looking for an authoring process where software UI strings can
    easily be handled in the documentation.

    I'm imagining that there would be an editor that uses a UI strings
    "library" as reference and calls its contents when required in the doc
    during the build process.

    What would be the best way to achieve that in a DocBook centered process?

    > Are you treating one language (say, English) as the main language (which
    > has empty elements for value lookups) and the other languages as end
    > products (which have all text filled in), where you'd use OmegaT's
    > translation memory to keep translations consistent across revisions?

    I'm not sure I understand the above question, even though I've been
    using OmegaT almost daily for the past 20 years.

    > Or do you want the other languages to be structurally equivalent to the
    > main version (apart from inline elements moved around because of
    > sentence structure), where elements containing text are turned back into
    > empty elements?

    I don't understand the second part "where elements containing text are
    turned back into empty elements?".

    Jean-Christophe





  • 8.  RE: [External Email] Re: [docbook-apps] UI strings vs manual strings ?

    Posted 12-20-2022 13:39
    Hi all,

    the issue of keeping UI strings and documentation in sync sounds familiar, and I'd like to let you know how we've been doing it for the last 12 years or so. It is not perfect, but it is robust enough to publish several dozen documents of up to 1300 pages in two (formerly four) languages as PDF and webhelp.

    The software we are writing documentation for is using dozens of XML files for its configuration and user interface. Each UI string comes with an ID that is more or less unique, and we can therefore reuse it in our documentation. However, almost every configuration file has a slightly different content model to the one before and therefore we have to pre-process everything before we can actually use it.

    In this pre-processing step, a script pulls together all the UI strings from all the different configurations into a "normalized" XML file. This step also pulls in the translated versions of the strings. You end up with a rather big XML file ("alltext.xml"), in which you have the ID, the original string and its translations in one place. If a translation is missing, we add a placeholder text.

    On the DocBook side, we are using guilabel/guimenu/guimenuitem to tag UI text. By our convention, each of these elements must have a @remap attribute with its value set to one of the IDs you can find in the software configuration files (or rather in that big XML file from the previous step). Referencing the ID is the important bit for the automation. For readability, the Technical Writer would also use the UI string as content for guilabel/guimenu/guimenuitem. However, this isn't strictly necessary, as the element's content will be replaced at a later stage.

    Our publishing process is automated via Ant/Jenkins. Once all XIncludes have been resolved, we run some XSLT on the resulting temporary file and replace the UI strings with the most recent versions from "alltext.xml". And that's more or less it: an ID-based lookup and string replacement.
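
    To make the ID-based lookup concrete, here is a small Python sketch. It stands in for the XSLT step and is not the actual stylesheet; the layout of "alltext.xml" was not posted, so the <strings>/<string id>/<text xml:lang> structure and the function names below are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DB = "http://docbook.org/ns/docbook"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
GUI_TAGS = {f"{{{DB}}}{name}" for name in ("guilabel", "guimenu", "guimenuitem")}

# Serialize DocBook elements with a default namespace instead of ns0: prefixes.
ET.register_namespace("", DB)


def load_lookup(alltext_xml, lang):
    """Build {id: text} for one language from the assumed alltext.xml layout:
    <strings><string id="..."><text xml:lang="..">...</text></string></strings>
    """
    table = {}
    for string in ET.fromstring(alltext_xml).iter("string"):
        for text in string.findall("text"):
            if text.get(XML_LANG) == lang:
                table[string.get("id")] = text.text or ""
    return table


def apply_remap(docbook_xml, lookup):
    """Replace the content of gui* elements with the string named by @remap.

    Returns the new document and the list of @remap values that had no
    entry in the lookup table (their content is left untouched).
    """
    root = ET.fromstring(docbook_xml)
    missing = []
    for elem in root.iter():
        if elem.tag not in GUI_TAGS:
            continue
        key = elem.get("remap")
        if key in lookup:
            for child in list(elem):
                elem.remove(child)  # drop inline markup inside the label
            elem.text = lookup[key]
        elif key is not None:
            missing.append(key)
    return ET.tostring(root, encoding="unicode"), missing
```

    Calling apply_remap(doc, load_lookup(alltext, "en")) on the XInclude-resolved file would give the English build; the returned list of unknown IDs could be used to fail the build or to insert placeholder text.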

    To assist our Technical Writers, we also provide modified versions of the software configuration. These configs can be used to run the software in a way that displays the internal ID alongside the UI text. By doing this, the Technical Writer is provided with sufficient context to find the right ID to be used for @remap. It also helps to avoid confusion where the configuration files have several copies of the same UI string but with different ID values.

    Since the UI strings are translated separately and keep their original ID, the remapping also works when we are publishing in other languages (our primary language is German, and we translate into English. This process has been used in the past to translate into French, Italian and Russian as well).

    A word on the authoring process: We are using oXygen XML and have put together some Schematron rules that flag when a remap attribute is missing or when the content of a guilabel is different from the current value in the lookup file. This was done as a proof of concept and isn't required for authoring, but we encourage using it. We are also experimenting with Schematron quick fixes to replace UI strings where necessary. This is also experimental and not a feature we are using all the time, but it comes in handy every now and then.
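
    The two checks described above can also be sketched outside Schematron. The Python function below is only an illustration of the rules, not the rule set itself; the function name, the shape of the lookup table (a plain dict of ID to current string) and the message wording are assumptions.

```python
import xml.etree.ElementTree as ET

DB = "http://docbook.org/ns/docbook"
GUI_TAGS = {f"{{{DB}}}{name}" for name in ("guilabel", "guimenu", "guimenuitem")}


def check_gui_strings(docbook_xml, lookup):
    """Report gui* elements whose @remap is missing, unknown, or out of date.

    Returns a list of (element name, current content, problem) tuples.
    """
    problems = []
    for elem in ET.fromstring(docbook_xml).iter():
        if elem.tag not in GUI_TAGS:
            continue
        name = elem.tag.split("}", 1)[1]
        content = "".join(elem.itertext()).strip()
        key = elem.get("remap")
        if key is None:
            problems.append((name, content, "missing remap attribute"))
        elif key not in lookup:
            problems.append((name, content, f"unknown ID {key!r}"))
        elif content != lookup[key]:
            problems.append((name, content, f"expected {lookup[key]!r}"))
    return problems
```

    A check like this could run in the build as a non-blocking report, which matches the "encouraged but not required" status described above.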

    We do most of our translation in-house (we used OmegaT in the past, but not anymore, sorry!). The documentation and the UI strings are translated separately (by the same Translator) into two separate translation memories. When translating the documentation, all guilabel/guimenu/guimenuitem elements are set to be non-translatable. However, our Translator sees the element content and thus has all the context information they need. Once they export their files to the target language, the UI strings will still be in German. The German text is then replaced via a combination of @lang and @remap when publishing to the respective output format.

    For those cases where UI text does need to be translated/localized, a Technical Writer can set a @translate attribute to "yes" on guilabel elements and override the replacement. The TMS will unlock those elements and the Translator can modify their content.

    I am aware that this process might be problematic with certain languages and/or if the source material is not well maintained, and probably for many other reasons, too. We had trouble in the past with UI text that was split into several individual strings or when placeholders were being used. With regard to mnemonics and shortcuts: those are indeed annoying. Our specific problem is that the software configuration wasn't designed to be translation friendly, and we basically have to strip things away before going into translation. I do believe that this problem could be solved if we put some effort into redesigning the configuration files. Screenshots also matter and need manual updates when the UI text changes.

    Bottom line is: We manage to keep our documentation and user interface synchronized in terms of UI strings. It took a while to get to this level and we still haven't solved every issue but for the most part, it works. On top of that, the Technical Writers and our Dev team are constantly sharing information. Developers also need to consult the Technical Writers and Translator before adding or changing any UI text - that way we have more control over terminology and consistency.

    I hope that helps and you get some ideas out of this.

    Regards
    Patrick


    Patrick Riffel
    Team Leader Information Development
    p.riffel@quadient.com
    +49 711 806098-0
    Hauptstätter Straße 70
    70178 Stuttgart
    www.quadient.com

    Quadient DOPiX Germany GmbH is a company registered with the Stuttgart District Court (Amtsgericht Stuttgart).
    Registration number: HRB 17655
    Registered office: Hauptstätter Straße 70, 70178 Stuttgart
    Managing directors: Andreas Schwegler, Duncan Spanner, Laurent du Passage