Hi,
I'm replying on behalf of Mihai Niță, who wrote the Plural, Gender and Select module.
= = = = =
> Simple example of ICU message: Anna has invited {g, plural, one {one guest} other {a lot of guests}} to her party.
Messages like that are generally considered bad i18n, because in many languages the parts "outside the decision" must in fact change as well (to agree in gender / number / case).
So there is no intention to support ICU messages where the decision sits inside the main flow of the sentence.
In fact the MessageFormat v2 spec (very close to release) no longer provides any way to do that.
The translator does have to translate more, but they are free to change word order, account for agreement, etc.
In general the l10n tools use some kind of fuzzy matching, which alleviates some of the burden of translating more.
The payoff is less complexity and higher quality.
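To make the "decision outside" shape concrete, here is a minimal Python sketch. It is not code from the module or from any tool; the helper names find_matching_brace and expand_plural are made up for illustration. It assumes a single, non-nested plural decision and ignores ICU apostrophe quoting and offsets. It hoists the decision so that every case becomes a complete sentence, which is what the translator would then see:

```python
def find_matching_brace(text: str, start: int) -> int:
    """Return the index of the '}' that closes the '{' at text[start]."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces")


def expand_plural(message: str) -> tuple[str, dict[str, str]]:
    """Hoist the first inline plural decision so each case is a full sentence.

    Returns (variable_name, {case_name: full_message}).
    Simplification: no apostrophe quoting, offsets, or nested decisions.
    """
    start = message.index("{")
    end = find_matching_brace(message, start)
    prefix, suffix = message[:start], message[end + 1:]
    inner = message[start + 1:end]  # e.g. "g, plural, one {...} other {...}"
    variable, kind, body = (part.strip() for part in inner.split(",", 2))
    if kind != "plural":
        raise ValueError("only plural decisions are handled in this sketch")

    cases = {}
    pos = 0
    while pos < len(body):
        open_brace = body.find("{", pos)
        if open_brace == -1:
            break
        close_brace = find_matching_brace(body, open_brace)
        case_name = body[pos:open_brace].strip()
        # Each case carries the shared text, so it can be reordered freely.
        cases[case_name] = prefix + body[open_brace + 1:close_brace] + suffix
        pos = close_brace + 1
    return variable, cases


variable, cases = expand_plural(
    "Anna has invited {g, plural, one {one guest} other {a lot of guests}} to her party."
)
for name, text in cases.items():
    print(f"({name}) {text}")
```

Each resulting case is a self-contained sentence, so the translator can change agreement or word order in one case without touching the other.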
> I hope you can see the problem. The way how we overcame this problem is that we would create the following segments for a given example:
> Anna has invited {1} to her party.
> (one) one guest
> (other) a lot of guests
This might be very clunky if things must move around, and it is in fact somewhat equivalent to concatenation.
An example:
You deleted {count, plural, one {# file} other {# files}}, very big!
The Romanian translation is
one = Ați șters # fișier, foarte mare!
other = Ați șters # fișiere, foarte mari!
Note the difference in "mare" / "mari" (singular / plural).
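For what it is worth, putting two full-sentence translations like these back into a single ICU message is mechanical once the decision wraps the whole sentence. A minimal Python sketch, with a hypothetical helper name merge_cases, and assuming the translated text contains no braces or apostrophes that would need ICU escaping:

```python
def merge_cases(variable: str, cases: dict[str, str]) -> str:
    """Build an ICU plural message whose decision wraps whole sentences.

    Assumption: case texts need no ICU escaping (no braces, no apostrophes).
    """
    body = " ".join(f"{name} {{{text}}}" for name, text in cases.items())
    return f"{{{variable}, plural, {body}}}"


merged = merge_cases(
    "count",
    {
        "one": "Ați șters # fișier, foarte mare!",
        "other": "Ați șters # fișiere, foarte mari!",
    },
)
print(merged)
```

No common prefix or suffix has to be guessed here, because nothing is shared between the cases.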
If we create these segments:
You deleted {1}, very big!
(one) # file
(other) # files
then the translator would have to do this:
Ați șters {1}!
(one) # fișier, foarte mare
(other) # fișiere, foarte mari
So they would basically translate "# file" as "# fișier, foarte mare" ("one file, very big"). The translator would also have to know that they can take "pieces" from the main message and move them into the one/other segments.
Worse still, suppose there is another English message saying:
You deleted {1}, very small!
(one) # file
(other) # files
Now the TM leveraging will be wrong, because the "one file, very big" translation stored for "# file" can be reused here, where "very small" is meant.
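The leveraging problem can be reproduced with a toy translation memory. This is only a sketch under stated assumptions: the TM is a plain dict with exact matching on the source text (real tools also use fuzzy matching and context), the function name leverage is made up, and the Romanian for "very small" ("foarte mic") is my own rendering, added for illustration.

```python
def leverage(tm: dict[str, str], source: str, fresh_translation: str) -> str:
    """Exact-match TM lookup; on a miss, store and return the fresh translation."""
    return tm.setdefault(source, fresh_translation)


# Fragment segmentation: the shared sentence and the cases are separate segments.
fragment_tm: dict[str, str] = {}
leverage(fragment_tm, "You deleted {1}, very big!", "Ați șters {1}!")
leverage(fragment_tm, "# file", "# fișier, foarte mare")

# Second message: the translator would want "# fișier, foarte mic" here,
# but the TM already holds an exact match for "# file".
main = leverage(fragment_tm, "You deleted {1}, very small!", "Ați șters {1}!")
piece = leverage(fragment_tm, "# file", "# fișier, foarte mic")
fragment_result = main.replace("{1}", piece)

# Full-sentence segmentation: the two sources differ, so there is no false hit.
full_tm: dict[str, str] = {}
leverage(full_tm, "You deleted # file, very big!", "Ați șters # fișier, foarte mare!")
full_result = leverage(
    full_tm, "You deleted # file, very small!", "Ați șters # fișier, foarte mic!"
)

print(fragment_result)  # says "very big" for the "very small" message
print(full_result)
```

With fragments, the identical source "# file" hides the fact that its translation absorbed text from the surrounding sentence; with full sentences, the difference is visible in the source itself.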
TLDR: decisions inside the message are intentionally not supported, as that is a bad i18n practice. The price is translating more, but with better quality and less complexity.
------------------------------
--
Rodolfo M. Raya
rmraya@maxprograms.com
Maxprograms
https://www.maxprograms.com
------------------------------
Original Message:
Sent: 10-10-2024 17:43
From: husarcik
Subject: Comment on XLIFF v2.2 CSD01: General impression
Dear TC,
I'm submitting this feedback on behalf of Tomáš Beluský, who's waiting for his account to be approved:
Translators might need to translate the same thing over and over, because some parts are repeated in all cases. Of course, the tools themselves can make translators' lives easier by adding some help. But because of this it might be impossible to merge: you want to generate the same format, but since the cases can be translated freely it is nearly impossible for tools to do that.
Simple example of an ICU message:
Anna has invited {g, plural, one {one guest} other {a lot of guests}} to her party.
Translated XLIFF for Slovak language (for simplicity "few" and "many" cases are omitted):
<unit id="seg1" pgs:switch="plural:g">
<segment id="seg1" pgs:case="one">
<source>Anna has invited one guest to her party.</source>
<target>Anna pozvala jedného hosťa na svoju párty.</target>
</segment>
<segment id="seg1" pgs:case="other">
<source>Anna has invited a lot of guests to her party.</source>
<target>Anna pozvala mnoho hostí na svoju párty.</target>
</segment>
</unit>
A tool should merge translations into the following:
Anna pozvala {g, plural, one {jedného hosťa} other {mnoho hostí}} na svoju párty.
I hope you can see the problem. The way how we overcame this problem is that we would create the following segments for a given example:
- Anna has invited {1} to her party.
- (one) one guest
- (other) a lot of guests
Thanks to this, the translator doesn't have to translate the same thing over and over, and we have no issue with the merging. I understand that because of this the translator has to keep context across all related segments to translate correctly, and is a bit limited as he/she can't reshuffle words between segments, but as I mentioned it has other big advantages. Of course, for some formats like Android the proposed XLIFF representation is just fine, as each case has its own segment and shares nothing with the other forms, unlike in ICU.
Would it make sense to support both representations - hierarchical and non-hierarchical?
Best regards,
Jano