OASIS Universal Business Language (UBL) TC

  • 1.  Issue: two-letter or three-letter language codes?

    Posted 03-10-2011 15:41
    Hello UBL TC and developers, A question has arisen with regard to language codes. It appears that UN/CEFACT uses the ISO three-letter lowercase language codes, whereas UBL currently includes a list of ISO two-letter uppercase language codes (both are ISO standards as far as we know). We're not going to be inclined to change something in 2.1 that already shipped in 2.0, so the question is: does anyone know of a good reason why UBL should switch to the three-letter lowercase form? We need to have this issue resolved by next week, so if anyone has an opinion, please let us know ASAP. Jon


  • 2.  Re: [ubl] Issue: two-letter or three-letter language codes?

    Posted 03-10-2011 15:48
    I think that this would be a major version change breaking compatibility. I see no good reason for such a change. Regards Mikkel


  • 3.  Re: [ubl] Issue: two-letter or three-letter language codes?

    Posted 03-11-2011 00:21
    The most commonly used syntax for language codes is specified in RFC 5646, which is an Internet Best Current Practice. A language tag is expressed as a sequence of subtags separated by a dash "-". The first subtag is the primary subtag and is mandatory. It is the shortest ISO 639 code (i.e. a two- or three-character code, where the three-character code is used only if the language has no entry in the two-character code list). It is recommended to use lowercase for languages to distinguish them from countries (e.g. IT=Italy; it=Italian).
    Additional subtags are used to specify the script and the region, for example:
    - "en-US" represents English ('en') as used in the United States ('US').
    - "sr-Latn-RS" represents Serbian ('sr') written using Latin script ('Latn') as used in Serbia ('RS').
    Coming back to UBL, a two-character code is probably OK in most cases (unless you need to refer to a language that has no two-character code, which is uncommon). Using upper case is not recommended, but it is not forbidden by ISO 639-1. Andrea
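
The tag structure described in this post can be sketched in a few lines of Python. This is a minimal illustration, not a complete RFC 5646 parser: it handles only the language, script, and region subtags mentioned above, and the names `LanguageTag` and `parse_language_tag` are invented for this example.

```python
import re
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str            # primary subtag, e.g. "en" or "sr"
    script: Optional[str]    # optional four-letter script, e.g. "Latn"
    region: Optional[str]    # optional region, e.g. "US" or "RS"


# Simplified grammar (an assumption of this sketch): a 2-3 letter language,
# an optional 4-letter script, and an optional region that is either
# 2 letters or 3 digits. Variants, extensions, and private-use subtags
# defined by RFC 5646 are deliberately not handled.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def parse_language_tag(tag: str) -> LanguageTag:
    """Split a simple language tag into its subtags."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unsupported language tag: {tag!r}")
    # Groups that did not take part in the match come back as None.
    return LanguageTag(match["language"], match["script"], match["region"])
```

For instance, `parse_language_tag("sr-Latn-RS")` yields the three subtags separately, while a bare `"it"` yields a language with no script or region.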


  • 4.  Re: [ubl] Issue: two-letter or three-letter language codes?

    Posted 03-11-2011 10:26
    Hello, I apologize, as I provided an answer about countries and not languages. The concept described by Andrea is correct, but it is an extended concept associated with a "Locale", which is a precise indication of a language, its variant, and the country where it is used. The plain language code list is provided by ISO 639-2, and the three-letter language code is more precise and complete than the two-letter version. http://www.loc.gov/standards/iso639-2/php/English_list.php Hope this helps Roberto
    --
    JAVEST by Roberto Cisternino
    Document Engineering Services Ltd. - Alliance Member
    UBL Italian Localization SubCommittee (ITLSC), co-Chair
    UBL Online Community editorial board member (ubl.xml.org)
    Italian UBL Advisor
    Roberto Cisternino
    mobile: +39 328 2148123
    skype: roberto.cisternino.ubl-itlsc
    [UBL Technical Committee] http://www.oasis-open.org/committees/ubl
    [UBL Online Community] http://ubl.xml.org
    [UBL International Conferences] http://www.ublconference.org
    [UBL Italian Localization Subcommittee] http://www.oasis-open.org/committees/ubl-itlsc
    [Iniziativa divulgativa UBL Italia] http://www.ubl-italia.org
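
To make the relationship between the two code lists concrete, here is a small Python sketch that converts between the uppercase two-letter form UBL ships and the lowercase three-letter form. The four-entry table is a hand-picked sample for illustration only (the real ISO 639 lists are far larger, and many three-letter codes have no two-letter counterpart); the function names are assumptions of this example.

```python
# Hand-picked sample of ISO 639-1 -> ISO 639-2 pairs; NOT the full lists.
ALPHA2_TO_ALPHA3 = {"en": "eng", "it": "ita", "sr": "srp", "da": "dan"}
ALPHA3_TO_ALPHA2 = {three: two for two, three in ALPHA2_TO_ALPHA3.items()}


def to_alpha3(code: str) -> str:
    """Return the lowercase three-letter code for a 2- or 3-letter code."""
    normalized = code.strip().lower()   # accept "EN", "en", " En "
    if normalized in ALPHA3_TO_ALPHA2:  # already a known three-letter code
        return normalized
    try:
        return ALPHA2_TO_ALPHA3[normalized]
    except KeyError:
        raise ValueError(f"unknown language code: {code!r}") from None


def to_alpha2(code: str) -> str:
    """Return the lowercase two-letter code for a 2- or 3-letter code."""
    normalized = code.strip().lower()
    if normalized in ALPHA2_TO_ALPHA3:  # already a known two-letter code
        return normalized
    try:
        return ALPHA3_TO_ALPHA2[normalized]
    except KeyError:
        raise ValueError(f"unknown language code: {code!r}") from None
```

A receiver that expects three-letter codes could call `to_alpha3("EN")` on an uppercase two-letter value without requiring any change to the sending side.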


  • 5.  Re: [ubl-dev] Re: [ubl] Issue: two-letter or three-letter language codes?

    Posted 03-11-2011 10:33
    To offer a meet-in-the-middle solution, we could provide in the root of the UBL document a new BIE for a default language or, better, a full locale (so not a BIE but an ABIE). It would be similar to the default currency for a UBL document. Implementers would be able to use any ISO language code, and we could just provide a default code list (an external, decoupled genericode list). At least there would be a way to assert that the document is expressed in a specific locale/language. Roberto
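
The fallback behaviour this proposal implies can be sketched as follows. Everything here is hypothetical: no document-level default language is defined by the thread, and the names `Document`, `TextValue`, `default_language`, and `effective_language` are invented purely to illustrate the idea of a per-element language overriding a document-wide default.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TextValue:
    content: str
    # Models a per-element language attribute; None means "not stated".
    language_id: Optional[str] = None


@dataclass
class Document:
    # Hypothetical document-level default, analogous to a default currency.
    default_language: Optional[str] = None
    notes: List[TextValue] = field(default_factory=list)


def effective_language(doc: Document, text: TextValue) -> Optional[str]:
    """Prefer the element's own language, else fall back to the default."""
    if text.language_id is not None:
        return text.language_id
    return doc.default_language


doc = Document(
    default_language="it-IT",
    notes=[TextValue("Consegna urgente"), TextValue("Urgent delivery", "en")],
)
languages = [effective_language(doc, note) for note in doc.notes]
```

In this sketch the first note inherits the document default and the second keeps its explicit language, so implementers would only need to annotate the exceptions.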


  • 6.  Re: [ubl-dev] Re: [ubl] Issue: two-letter or three-letter language codes?

    Posted 03-11-2011 10:41
    If the decision to introduce new elements is taken, I suggest considering RFC 5646 as the basis, as it is the standard used for software localization and allows transliterations and language variants (such as US English) to be expressed. Andrea


  • 7.  RE: [ubl] Issue: two-letter or three-letter language codes?

    Posted 03-11-2011 17:38
    No reason I can think of to use one over the other. The xml:lang definition references RFC 4646, which does not promote using one over the other, but does state that the separation of the language code from the country code must use a dash. Please note that most language codes are represented in lower case, although technically they are not case sensitive. So, "requiring" the use of two characters (upper or lower) is wrong. Zarella
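
Because the tags are not case sensitive, a consumer can compare them case-insensitively and, if desired, rewrite them in the conventional casing (lowercase language, title-case script, uppercase region) recommended by RFC 5646, the successor of RFC 4646. The Python sketch below is a simplification: it ignores the special handling of subtags that follow a singleton, and the function names are assumptions of this example.

```python
def same_language_tag(a: str, b: str) -> bool:
    """Compare two language tags without regard to letter case."""
    return a.lower() == b.lower()


def conventional_case(tag: str) -> str:
    """Rewrite a tag in the conventional casing (simplified rules)."""
    parts = tag.split("-")
    result = [parts[0].lower()]              # primary language: lowercase
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            result.append(part.title())      # script, e.g. "Latn"
        elif len(part) == 2 and part.isalpha():
            result.append(part.upper())      # region, e.g. "US"
        else:
            result.append(part.lower())      # anything else: lowercase
    return "-".join(result)
```

With this approach an uppercase value such as `"EN"` and a lowercase `"en"` are treated as the same language, so the choice of case in a code list need not affect interoperability.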


  • 8.  Re: Issue: two-letter or three-letter language codes?

    Posted 03-16-2011 13:38
    Hello UBL TC and developers, Since no compelling reason has been offered for switching to three-letter language codes, we will continue using the two-letter versions in UBL 2.1. Many thanks to everyone who gave input on this question. Jon