I am ok with this as you described.
From: "Dr. David Filip" <David.Filip@ul.ie>
To: "Schurig, Joachim" <Joachim.Schurig@lionbridge.com>, "Estreen, Fredrik" <Fredrik.Estreen@lionbridge.com>, Helena S Chapman/San Jose/IBM@IBMUS, Tom Comerford <tom@supratext.com>, Ryan King <ryanki@microsoft.com>, "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>
Date: 04/09/2013 07:07 AM
Subject: Re: [xliff] R37: Revised Validations Module proposal
Sent by: <xliff@lists.oasis-open.org>
Thanks Joachim,
I believe we agree (see also inline).
So the consensus should be something like this:
No storage default. NFC as the comparison default in all three
cases (core and both modules), with both modules allowing an override
through a dedicated attribute.
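A minimal sketch of how a processor might resolve this rule, assuming a module's override attribute takes the values "none", "nfc", or "nfd" mentioned elsewhere in this thread. The function names and the `module_attr` parameter are hypothetical, not taken from the spec.

```python
import unicodedata

# Comparison default from the consensus above; nothing here touches storage.
DEFAULT_COMPARISON_FORM = "nfc"


def effective_form(module_attr=None):
    """Return the normalization form to use for comparison.

    module_attr is the (hypothetical) value of a module's override
    attribute; None means the attribute is absent, so the default applies.
    """
    form = DEFAULT_COMPARISON_FORM if module_attr is None else module_attr.lower()
    if form not in ("none", "nfc", "nfd"):
        raise ValueError(f"unsupported normalization: {module_attr!r}")
    return form


def contents_equal(a, b, module_attr=None):
    """Compare two strings under the effective normalization form."""
    form = effective_form(module_attr)
    if form == "none":
        return a == b  # raw code point comparison
    return unicodedata.normalize(form.upper(), a) == unicodedata.normalize(form.upper(), b)
```

With no attribute present, `contents_equal("caf\u00e9", "cafe\u0301")` compares under NFC. Passing `"none"` switches to a raw code point comparison.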
@Helena, @Fredrik, others,
Please shout by Wednesday, end of your day, if this does not
seem OK. Otherwise I'd like to ask Fredrik and Ryan to implement this in
their modules by the end of their day Thursday, so that Tom and I can finalize
the spec for the meeting next week.
@Tom, will you be able to modify the schema by Monday if this
is done by Friday?
Thanks
dF
Dr. David Filip
=======================
LRC CNGL LT-Web CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie
On Tue, Apr 9, 2013 at 10:38 AM, Schurig, Joachim <Joachim.Schurig@lionbridge.com> wrote:
I opt for no normalization
on core – quote from my other mail today:
I actually think there
should not be an attribute about normalization in the XLIFF core. In my
opinion it makes sense to have it in length restrictions (because they
refer to the storage format) and to a lesser extent in validation, but
not in core. If you expect or need a specific normalization in your document,
apply it.
+1
Actually, even the normalization
attributes in the size restriction module do not require the content be
stored in that normalization
+1
– which would also be
difficult, as the same content could have e.g. NFC applied for its storage size
and NFD for general size restrictions. It’s only used to know
which size calculation to apply.
+1
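The point above, that the normalization attribute selects a size calculation without changing what is stored, can be sketched as follows. This is only an illustration: the helper name `size_in` and the choice of UTF-8 bytes as the unit are assumptions, not part of the size restriction module.

```python
import unicodedata


def size_in(text, form, encoding="utf-8"):
    """Return the byte size of text measured under the given form.

    The normalized copy is used for measurement only; the caller's
    stored string is never modified.
    """
    measured = text if form == "none" else unicodedata.normalize(form.upper(), text)
    return len(measured.encode(encoding))


stored = "cafe\u0301"  # stored in decomposed form: "e" + combining acute accent

nfc_size = size_in(stored, "nfc")  # measured as if composed
nfd_size = size_in(stored, "nfd")  # measured as if decomposed
```

Here the same stored content yields two different sizes depending on which calculation is requested, while `stored` itself keeps its original form.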
So I think we both agree.
Regards,
Joachim
From: Dr. David Filip [mailto:David.Filip@ul.ie]
Sent: Tuesday, April 9, 2013 01:41
To: Estreen, Fredrik
Cc: Helena S Chapman; Kevin O'Donnell; Schurig, Joachim; Ryan King; xliff@lists.oasis-open.org; Yves Savourel
Subject: Re: [xliff] R37: Revised Validations Module proposal
All, following the TC discussions last week, I believe
the following needs to be answered.
Does the core need a default or even enforced storage normalization?
I believe the answer is no and we are OK with specifying
a default for content comparison purposes that can be overridden in
the validation module and in the size restriction module and nowhere else
in the spec as it stands now AFAIK. In case these two introduce an explicit
attribute that allows for the override, the default should be NFC in both
cases, same as in 2.6.8.
If we decided that it is not enough to have normalization
defaults for comparison purposes ONLY, we could introduce an optional normalization
attribute in core that could live on any of the structural elements, from
<file> down to <source> and <target>, there would be
inheritance and the default/inherited would be assumed (MUST for processors)
where nothing is specified/inherited.
The default could be either "none" or "NFC"
In case we go for the core attribute, I believe the default
should be "none" for everything (including storage) except comparison
purposes that would cover section 2.6.8 and both modules.
Please note that I am NOT actually proposing to have the
core attribute, I am just trying to accelerate the discussion by charting
all viable options.
Please indicate which option seems preferable to you, or
whether you see any other viable options.
Thanks and regards
dF
On Tue, Apr 2, 2013 at 3:59 PM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:
Hi All,
Just back from vacation and
catching up on email. Reading the current spec I think it could make sense
to simply rely on section “2.6.8 Content Comparison” in the core specification
for what (if any) normalization to apply for validation comparisons. It
stipulates that NFC is used to compare the equality of content.
Best regards,
Fredrik Estreen
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: April 2, 2013 14:49
To: Helena S Chapman
Cc: Kevin O'Donnell; Schurig, Joachim; Ryan King; xliff@lists.oasis-open.org; Yves Savourel
Subject: Re: [xliff] R37: Revised Validations Module proposal
IMHO validation is a different use case from size restriction.
It makes sense to have "none" as the default there, but not here.
See inline.
On Wed, Mar 27, 2013 at 6:02 PM, Helena S Chapman <hchapman@us.ibm.com> wrote:
Unfortunately, I don't entirely agree.
I understand what you are suggesting with "language neutral rules".
However, Unicode normalization should often be applied even when there
are no language rules associated with it.
+1
For instance, when someone sends an English
document which contains two distinctly different characters "?"
vs "?", without normalization
and the correct intention, how would the tools/processes know what to do
with these two characters? When should they treat them as two different characters
and when as the same? "none" for normalization
would tell them to treat these as different, and if it's "NFD"
or "NFC", these two would be the same (though not identical).
This is going to be increasingly common because:
+1
1. We are more likely to receive true multilingual static content these
days. And, we don't have to look far for an example. In Canada, most content
has to be available both in English and French at the same time.
2. When we deal with multimedia type content, the use of more than one
language within the same context is even more frequent. In my own household,
a combination of Mandarin, Taiwanese, Japanese, and Hebrew is often mixed
in with English.
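To make the two-character scenario above concrete, here is a small sketch. The specific pair (ANGSTROM SIGN vs. LATIN CAPITAL LETTER A WITH RING ABOVE) is chosen purely for illustration and is an assumption. It is not necessarily the pair meant in the message above.

```python
import unicodedata

# Two distinct code points that render alike (illustrative choice).
angstrom_sign = "\u212b"
a_with_ring = "\u00c5"

# Their Unicode names confirm they are different characters.
names = (unicodedata.name(angstrom_sign), unicodedata.name(a_with_ring))

# "none": compare the raw code points.
equal_raw = angstrom_sign == a_with_ring

# "NFC" / "NFD": compare after canonical normalization.
equal_nfc = unicodedata.normalize("NFC", angstrom_sign) == unicodedata.normalize("NFC", a_with_ring)
equal_nfd = unicodedata.normalize("NFD", angstrom_sign) == unicodedata.normalize("NFD", a_with_ring)
```

Under "none" the two are treated as different, while under either NFC or NFD they compare as the same, even though the stored characters are not identical.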
I am actually curious if the spoken language content interchange is out
of scope of XLIFF in general? What happens when we embed this into an interactive
format? Do we give our community the guideline that if one is working with
translation requests that are not limited to written languages, don't use
XLIFF for interchange?
To include voice content we would need to re-charter. I
think it is the next frontier and worth discussing. I just would not put it
on the front burner right now, with 2.0 being prepared for the first public
review.
From: "Kevin O'Donnell" <kevinod@microsoft.com>
To: "Dr. David Filip" <David.Filip@ul.ie>, Helena S Chapman/San Jose/IBM@IBMUS
Cc: "Schurig, Joachim" <Joachim.Schurig@lionbridge.com>, Ryan King <ryanki@microsoft.com>, "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>, Yves Savourel <ysavourel@enlaso.com>
Date: 03/21/2013 04:30 PM
Subject: RE: [xliff] R37: Revised Validations Module proposal
Sent by: <xliff@lists.oasis-open.org>
Hi David,
This is a fair point. I’d be interested in hearing from tool providers
about their preference here also.
I see two scenarios that are relevant here:
1. Global (language-neutral) rules: these rules
are the most common here at Microsoft and do not differentiate per language
(e.g. formatting rules). Therefore, rules of this nature would not require or benefit
from normalization.
2. Language-specific rules: these rules, by
default, may benefit from normalization, which would indeed help reduce false
positives if present.
In my thinking, keeping “none” as default keeps the module simple and
avoids unnecessary overhead by the processing agent when not required.
If the XLIFF creator is implementing language-specific rules, then they
have an onus to specify their preferred normalization approach.
If we can surmise the likely prevalence of scenario 1 vs. scenario 2, that
may also indicate the likely best default setting here.
Other feedback/thoughts appreciated.
Thanks,
Kevin.
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Dr. David Filip
Sent: Thursday, March 21, 2013 12:08 PM
To: Helena S Chapman
Cc: Schurig, Joachim; Ryan King; xliff@lists.oasis-open.org; Yves Savourel
Subject: Re: [xliff] R37: Revised Validations Module proposal
Guys, after a brief spot check I just forwarded Asanka's minutes from Tuesday
to the list.
While there was no formal conclusion, the discussion tended toward including
normalization types along the lines of the size restriction module.
I think there was no doubt that we should provide a vehicle for conveying
the normalization type required.
There did not seem to be a clear consensus on the default value though.
I personally think that the default should NOT be "none". This
option for the default seems vague and obscure to me. It leaves the processor
to guess, based on tribal knowledge, what to do to avoid producing tons
of false positives. I thought that we wanted to be friendly to naive implementers.
:-)
Cheers
dF
On Thu, Mar 21, 2013 at 6:44 PM, Helena S Chapman <hchapman@us.ibm.com> wrote:
Your summary matches my own recollection of the discussion.
From: "Schurig, Joachim" <Joachim.Schurig@lionbridge.com>
To: Ryan King <ryanki@microsoft.com>, Helena S Chapman/San Jose/IBM@IBMUS
Cc: "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>, Yves Savourel <ysavourel@enlaso.com>
Date: 03/20/2013 07:29 PM
Subject: RE: [xliff] R37: Revised Validations Module proposal
Hi Ryan,
while yours was my initial position as well, I do not think that it was
the outcome of the discussion in the TC. We already mention
the normalization approach in the size restriction module, so it would
make sense to include it here, too, and I think this was the conclusion
on the Tuesday call. You could leave the default as “none” and declare
that this leaves it to the processing agent how to deal with the situation,
but if either of the “nfd” or “nfc” values is set, it should lead to more
specific behavior. Could this be an acceptable solution to all parties?
Cheers,
Joachim
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org] On Behalf Of Ryan King
Sent: Wednesday, March 20, 2013 17:58
To: Helena S Chapman
Cc: xliff@lists.oasis-open.org; Yves Savourel
Subject: RE: [xliff] R37: Revised Validations Module proposal
Yes, Helena, thanks for checking that with me. We did discuss it and feel
that the processing agent should be responsible for normalization of text,
so we will explicitly state that in the module.
Thanks,
Ryan
From: Helena S Chapman
Sent: 3/20/2013 4:43 AM
To: Ryan King
Cc: xliff@lists.oasis-open.org; Yves Savourel
Subject: RE: [xliff] R37: Revised Validations Module proposal
Did Kevin convey the comments about normalization to you? How do we expect
to deal with that in the spec?
From: Ryan King <ryanki@microsoft.com>
To: Yves Savourel <ysavourel@enlaso.com>, "xliff@lists.oasis-open.org" <xliff@lists.oasis-open.org>
Date: 03/20/2013 02:41 AM
Subject: RE: [xliff] R37: Revised Validations Module proposal
Sent by: <xliff@lists.oasis-open.org>
Hi Yves, all,
We suggest that for mustLoc, we enclose the source and replacement target
values in parentheses, like so: mustLoc="(World) (Welt)"
If for any reason a parenthesis needs to be translated, as a brace
for example, we could escape it like so: mustLoc="((World)) ({Welt})"
Since we are generalizing dblSpace to occurrences, we could do
something similar there as well: occurrences="( ) (3)"
For example, where 3 pipes need to occur in the target for whatever reason.
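One way to read the parenthesized syntax suggested above, sketched under several assumptions: the value always has the form "(left) (right)", the outermost parentheses are delimiters while inner ones are literal, mustLoc means "if the source contains the left term, the target must contain the right term", and occurrences means an exact count in the target. All function names are hypothetical, and none of these semantics are confirmed by the proposal.

```python
import re

# Assumed grammar: "(left) (right)", outer parentheses are delimiters.
PAIR = re.compile(r"^\((.*)\) \((.*)\)$", re.DOTALL)


def parse_pair(value):
    """Split an attribute value of the form "(left) (right)"."""
    match = PAIR.match(value)
    if match is None:
        raise ValueError(f"not in '(left) (right)' form: {value!r}")
    return match.group(1), match.group(2)


def check_must_loc(value, source, target):
    """Pass if the source lacks the left term or the target has the right term."""
    src_term, tgt_term = parse_pair(value)
    return src_term not in source or tgt_term in target


def check_occurrences(value, target):
    """Pass if the left text occurs exactly the given number of times (assumed)."""
    text, count = parse_pair(value)
    return target.count(text) == int(count)
```

For instance, `parse_pair("((World)) ({Welt})")` yields the literal terms `(World)` and `{Welt}`, which matches the escaping idea described above. A value that itself contains the sequence `) (` would be ambiguous under this reading.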
Further comments or suggestions welcome.
-----Original Message-----
From: Ryan King
Sent: Tuesday, March 19, 2013 10:59 PM
To: 'Yves Savourel'; xliff@lists.oasis-open.org
Subject: RE: [xliff] R37: Revised Validations Module proposal
Thanks, Yves, for the feedback. All valid and good comments, which we will
incorporate into the spec. As for the question on the mustLoc separator,
Kevin and I are discussing it and will suggest something shortly.
Thanks,
ryan
-----Original Message-----
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Yves Savourel
Sent: Tuesday, March 19, 2013 5:01 AM
To: xliff@lists.oasis-open.org
Subject: RE: [xliff] R37: Revised Validations Module proposal
Hi Ryan, all,
Some feedback on the Validation proposal (nothing major, just possible
suggestions):
-- strbegins and strEnds attributes:
Maybe names such as startsWith and endsWith may be a bit more descriptive
of the function?
-- dblSpace:
This seems to be a very specific check. Maybe it can be generalized a bit
without making it very different? For example, instead of dblSpace="3"
we could do occurrence=" 3" (or a better name than 'occurrence').
This would allow checking for more than double spaces.
-- mustLoc:
How do you represent the ' ' if it needs to be in the left part of the
value?
-- existsInSource, disabled:
So far I think XLIFF is using yes/no for Boolean values rather than true/false.
Maybe we could be consistent?
cheers,
-yves