OASIS XML Localisation Interchange File Format (XLIFF) TC

  • 1.  Implementation of XLIFF 2.1 - ITS module

    Posted 08-12-2016 09:51
    Hi all,

    I started an ITS module implementation relying on my generic ITS processor. See the processed files here:
    https://github.com/fsasaki/its20-extractor/tree/master/sample/xliff21sample

    external-rules.xml contains the rules, currently only for text analytics. inputfile.xml is an XLIFF 2.1 input file, currently with ITS Text Analytics information. The output is provided as a list of XPath expressions in nodelist-with-its-information.xml and as inline annotations in output-inline-annotation.xml.

    The output shows one issue which we had discussed before; see below, taken from output-inline-annotation.xml:

    <source>
      <itsAnn xmlns=""/>
      <sm id="sm1" type="itsm:generic" itsm:taClassRef="http://nerd.eurecom.fr/ontology#Place" itsm:taIdentRef="http://dbpedia.org/resource/Arizona">
        <itsAnn xmlns="">
          <elem>
            <taClassRefPointer xmlns:xlf2="urn:oasis:names:tc:xliff:document:2.0" xmlns:its="http://www.w3.org/2005/11/its" xmlns:datc="http://example.com/datacats" itsm:taClassRef="http://nerd.eurecom.fr/ontology#Place"/>
            <taIdentRefPointer xmlns:xlf2="urn:oasis:names:tc:xliff:document:2.0" xmlns:its="http://www.w3.org/2005/11/its" xmlns:datc="http://example.com/datacats" itsm:taIdentRef="http://dbpedia.org/resource/Arizona"/>
          </elem>
        </itsAnn>
      </sm>Arizona<em startRef="sm1">
        <itsAnn xmlns=""/>
      </em>
    </source>

    With the ITS rules file, "sm" is annotated as carrying the text analytics information. But it is actually the content between sm and em that should be annotated. I don't know how to resolve this.

    Maybe we should add to the ITS module a constraint that extends general ITS processors: if the selected element is an XLIFF sm, apply the ITS information to the span that ends at the em corresponding to that sm via the startRef attribute. This would be a small burden on ITS processors, but it would greatly simplify the creation of the ITS/XLIFF rules file.

    Thoughts?

    Best, Felix
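    To make the proposed constraint concrete, here is a minimal Python sketch of how a processor might resolve an sm to the content it covers by looking for the em whose startRef matches. It is not taken from its20-extractor; the function name annotated_spans, the sample sentence, and the itsm namespace URI are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XLF = "urn:oasis:names:tc:xliff:document:2.0"
# Assumption: namespace URI bound to the itsm prefix (not stated in the thread).
ITSM = "urn:oasis:names:tc:xliff:itsm:2.1"
SM = f"{{{XLF}}}sm"
EM = f"{{{XLF}}}em"

# Hypothetical input: the Arizona example embedded in a made-up sentence.
SAMPLE = (
    f'<source xmlns="{XLF}" xmlns:itsm="{ITSM}">I visited '
    '<sm id="sm1" type="itsm:generic" '
    'itsm:taClassRef="http://nerd.eurecom.fr/ontology#Place" '
    'itsm:taIdentRef="http://dbpedia.org/resource/Arizona"/>'
    'Arizona<em startRef="sm1"/> last year.</source>'
)


def annotated_spans(source):
    """Map each sm id to the text between that sm and its em, plus its itsm attributes."""
    spans = {}
    open_ids = []  # ids of sm markers whose matching em has not been seen yet
    for child in source:
        if child.tag == SM:
            its = {
                key.split("}")[1]: value
                for key, value in child.attrib.items()
                if key.startswith(f"{{{ITSM}}}")
            }
            spans[child.get("id")] = {"text": "", "its": its}
            open_ids.append(child.get("id"))
        elif child.tag == EM:
            # The em closes the sm named by its startRef attribute.
            if child.get("startRef") in open_ids:
                open_ids.remove(child.get("startRef"))
        else:
            # Text inside any other inline element belongs to all open spans.
            inner = "".join(child.itertext())
            for sm_id in open_ids:
                spans[sm_id]["text"] += inner
        # Text following this child belongs to every span that is still open.
        for sm_id in open_ids:
            spans[sm_id]["text"] += child.tail or ""
    return spans


spans = annotated_spans(ET.fromstring(SAMPLE))
print(spans["sm1"]["text"], spans["sm1"]["its"])
```

    The point of the sketch is that the pairing logic lives in the processor, so the rules file would only need to select sm and would not have to express "everything up to the matching em" in XPath.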


  • 2.  Re: Implementation of XLIFF 2.1 - ITS module

    Posted 08-31-2016 12:10
    Hi all again,

    I am looking for feedback on this topic before I can continue with the ITS rules file. Following David's suggestion, I have also posted this to the ITS IG list; see the thread at https://lists.w3.org/Archives/Public/public-i18n-its-ig/2016Aug/thread.html and the latest mail at https://lists.w3.org/Archives/Public/public-i18n-its-ig/2016Aug/0009.html

    Best, Felix


  • 3.  Re: Implementation of XLIFF 2.1 - ITS module

    Posted 09-04-2016 08:40
    Hi all again,

    Since I did not receive any feedback here, I will follow the suggestion made in the separate thread on the ITS IG list (linked in my previous message): develop a rules file that covers only "mrk" but not "sm" / "em". If anyone disagrees with this, early input would help to avoid reworking the file.

    Best, Felix
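    For comparison, a short Python sketch of why restricting the rules to mrk avoids the problem: mrk wraps the annotated content, so the selected element and the annotated text coincide. The function name mrk_annotations, the sample sentence, and the itsm namespace URI are assumptions for illustration, not part of the actual rules file.

```python
import xml.etree.ElementTree as ET

XLF = "urn:oasis:names:tc:xliff:document:2.0"
# Assumption: namespace URI bound to the itsm prefix (not stated in the thread).
ITSM = "urn:oasis:names:tc:xliff:itsm:2.1"

# Hypothetical input: the Arizona annotation expressed with mrk instead of sm/em.
SAMPLE_MRK = (
    f'<source xmlns="{XLF}" xmlns:itsm="{ITSM}">I visited '
    '<mrk id="m1" type="itsm:generic" '
    'itsm:taClassRef="http://nerd.eurecom.fr/ontology#Place" '
    'itsm:taIdentRef="http://dbpedia.org/resource/Arizona">Arizona</mrk>'
    ' last year.</source>'
)


def mrk_annotations(source):
    """Return (annotated text, itsm attributes) for every mrk carrying itsm attributes."""
    found = []
    for mrk in source.iter(f"{{{XLF}}}mrk"):
        its = {
            key.split("}")[1]: value
            for key, value in mrk.attrib.items()
            if key.startswith(f"{{{ITSM}}}")
        }
        if its:
            # The annotated content is simply the text inside the mrk element.
            found.append(("".join(mrk.itertext()), its))
    return found


print(mrk_annotations(ET.fromstring(SAMPLE_MRK)))
```

    No pairing of start and end markers is needed here, which is what makes an mrk-only rules file workable with an unmodified ITS processor.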