Hi Yves,

I have a couple of questions and comments on the proposal.

1) Data type of score (similarity) and quality:
* Is there any reason why the score should be an integer? In our case it has always been a real number ranging from 0 to 100.00. You may ask what the benefit of real numbers is: our scoring logic is very sophisticated, and we want to sort suggestions correctly (99.9 is definitely preferred to 99). Real numbers may also be better for interoperability, since they are a superset of the integers.

2) Score and quality?
* I understand the rationale for having two attributes. However, our scoring logic considers many factors, including similarity, quality, content domains, content types, etc. In our case the score is a combined score, so we can list the suggestions clearly in order of our preference.
* Therefore, similarity is not appropriate for our case. I suggest having match-score as the main attribute, and allowing two more attributes (similarity, quality) for tools that want them. If all of this would increase confusion rather than help (two attributes are perfect, but three are too many?), then my suggestion is to name the first attribute score.

3) content-type, content-domain, match-type
* Because of cross-file/cross-type leverage, we need to deliver the content type (xml, html, properties, etc.) and the content domain. Do you think origin can be used for that purpose?
* type requires a clearly defined list of values. For MT suggestions, translators should post-edit instead of translate, and CAT tools may have specific features for MT suggestions. Therefore, XLIFF documents should use the same value of the type attribute for MT suggestions.

Regards,
Jung

On 27/02/2012 12:45, Yves Savourel wrote:

Hi Rodolfo, all,

1) Change the name of score to similarity. That would be clearer.

Done.

2) Define an optional module for storing the metadata associated with a match.

Yes, I think such metadata could be re-used for other features, for example QA annotations, etc.
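To make the two proposals above concrete, a match carrying both attributes plus an optional metadata holder might look like the sketch below. This is purely illustrative: the element and attribute names (matches, match, metadata, meta) are placeholders, not the agreed XLIFF syntax.

```xml
<!-- Hypothetical sketch only; names are not the agreed XLIFF syntax -->
<matches>
  <match similarity="80" quality="60" type="tm" origin="ProjectTM">
    <!-- Optional per-match metadata, re-usable for QA annotations etc. -->
    <metadata>
      <meta type="content-domain">automotive</meta>
      <meta type="content-type">html</meta>
    </metadata>
    <source>Press the brake pedal.</source>
    <target>Appuyez sur la pédale de frein.</target>
  </match>
</matches>
```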
Perhaps we would need to provide some directions for handling the combination of score/similarity with quality. It may be hard for a user to select the best match from two matches that have these properties:

a) similarity=60, quality=90
b) similarity=80, quality=60

That would be something useful. But based on discussions I've seen around use cases like Microsoft Translator's MatchDegree (similarity) and Rating (quality), I'm not sure there would be a single answer. It often ends up being a user preference that needs to be decided at usage time.

This also raises the question: should we have a processing expectation that user agents preserve the order of the matches? And should we have specific processing expectations about how new matches are added?

My guess is that we probably want to keep this simple: XLIFF provides the structure to hold the information, but tools do what they want with it. For example, a processing expectation that the matches must be re-written in the same order wouldn't work for a tool whose task is precisely to apply some ranking to the matches.

Cheers,
-yves
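Since there is no single right answer to the similarity=60/quality=90 vs. similarity=80/quality=60 dilemma, a consuming tool could expose the trade-off as a user preference and rank matches itself, as Yves suggests. Below is a minimal sketch of one such policy (a simple weighted blend; the function name, weighting scheme, and data shape are all assumptions for illustration, not part of any proposal):

```python
def rank_matches(matches, similarity_weight=0.5):
    """Sort matches by a user-weighted blend of similarity and quality.

    similarity_weight is a user preference in [0, 1]; the remaining
    weight goes to quality. This is just one possible ranking policy.
    """
    def combined(m):
        return (similarity_weight * m["similarity"]
                + (1.0 - similarity_weight) * m["quality"])
    # Highest combined score first.
    return sorted(matches, key=combined, reverse=True)

matches = [
    {"id": "a", "similarity": 60, "quality": 90},
    {"id": "b", "similarity": 80, "quality": 60},
]

# Equal weighting prefers (a): 75 vs. 70.
print([m["id"] for m in rank_matches(matches)])
# A user who trusts similarity more flips the order: 66 vs. 76.
print([m["id"] for m in rank_matches(matches, similarity_weight=0.8)])
```

This keeps the format simple, exactly as proposed: XLIFF only stores similarity and quality, and the user-time decision lives in the tool.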
-- 
Jung Nicholas Ryoo
Principal Software Engineer
Phone: +35318031918
Fax: +35318031918
Oracle WPTG Infrastructure
ORACLE Ireland, Block P5, Eastpoint Business Park, Dublin 3