Yves, thanks for your comments. I think groups of users would require similar functionality, and I think we are talking about similar objectives.

For tools which generate XLIFF files from original source files (like a development group creating XLIFF files for translation), I would think that those tools would need to:

1. Read the original source file format to strip out the translatable text to be added to the XLIFF file.
   - A <trans-unit> would be a block of similar text. I once created some internal guidelines for what the content of a <source> element should be. The XLIFF 1.2 specification states, "The <source> element is used to delimit a unit of text that could be a paragraph, a title, a menu item, a caption, etc." What is considered a unit of text will differ based on the non-XLIFF file format being handled. Other ways to think of a unit of text contained in the <source> element of a <trans-unit> element are:
   - A block of text which is formatted the same way, like the HTML text associated with a paragraph <p> tag, preformatted text in a <pre> section, or the text for a list item <li> tag.
   - A block of text which the end user will see separately from other text, like a report title, an error message, a menu item, or tooltip text.
   - A piece of text which has its own identifier and can be separately retrieved by the program's code.
   - A block of text which is separated by one or more blank lines, which implies the start of a new paragraph.

2. Skip creating the <target> element (other than possibly copying the <source> text into it), because this environment only works with the source files.

3. Understand inline codes in the content. This should be a requirement. I had sent the following to Arle a while back on this topic, with the idea that he would bring it up to the inline tag committee:

I think the intent of XLIFF is to provide a structured way to represent translatable text from any file format in an easy-to-process, format-neutral way. A translator is concerned with translating text in a simple-to-understand format. Requiring translators to understand multiple file formats creates inefficiencies and reduces quality. The key, in my opinion, is that a translator should be able to translate the content of an XLIFF file without having to understand the syntax or characteristics of the original file format.

Consider this example of different ways to represent a replacement variable:

  User %s has been deleted.
  User %1 has been deleted.
  User %1$s has been deleted.
  User {0} has been deleted.
  User &user; has been deleted.
  User #user# has been deleted.
  User %USER has been deleted.
  User :user. has been deleted.
  User [user] has been deleted.

Now, in XLIFF: should the translator see all of the above as different segments, which would require them to know how a variable is represented in the base file format and would result in different translation memory entries? For example, "[user]" may be a replacement variable in one file format but translatable text in another. Or should the XLIFF be something like:

  User <x id="1"/> has been deleted.

or at least:

  User <ph id="1">%s</ph> has been deleted.

(A fuller fragment is sketched at the end of this note.)

I think that XLIFF inline elements should be used for:

1. Non-translatable items which are embedded in the translatable text and are unique to that file format:
   - All inline HTML and XML elements.
   - Replacement variables.
   - Special formatting codes.

2. Non-translatable items which, if added, removed, or modified, will affect the function of the product:
   - HTML <a> tags for links.
   - Replacement variables.
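To make this concrete, here is a minimal, made-up XLIFF 1.2 fragment showing both options for the "%s" case above (the file name, datatype, and ctype values are just placeholders for the sketch):

  <?xml version="1.0" encoding="UTF-8"?>
  <xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
    <file original="messages.properties" source-language="en" datatype="plaintext">
      <body>
        <!-- option 1: the variable is hidden behind an empty placeholder -->
        <trans-unit id="1">
          <source>User <x id="1" ctype="x-variable"/> has been deleted.</source>
        </trans-unit>
        <!-- option 2: the variable stays in the file, wrapped in a <ph> element -->
        <trans-unit id="2">
          <source>User <ph id="1" ctype="x-variable">%s</ph> has been deleted.</source>
        </trans-unit>
      </body>
    </file>
  </xliff>

In either form the translator sees the same placeholder whether the original file used %s, {0}, or &user;, and the translation memory gets one entry instead of nine.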
David
Corporate Globalization Tool Development
EMail: waltersd@us.ibm.com
Phone: (507) 253-7278, T/L: 553-7278, Fax: (507) 253-1721
CHKPII: http://w3-03.ibm.com/globalization/page/2011
TM file formats: http://w3-03.ibm.com/globalization/page/2083
TM markups: http://w3-03.ibm.com/globalization/page/2071

From: Yves Savourel <ysavourel@translate.com>
To: <xliff@lists.oasis-open.org>
Date: 04/08/2011 07:11 AM
Subject: RE: [xliff] XLIFF 2.0 Core

Hi David,

> It is my understanding that XLIFF can be used in these situations:
> 1. A format that product development can create to provide
> translatable text to be translated.
> 2. A format that can be used within a tool set to manage the
> translation of the content.
> 3. A format that one tool set can export which would then be
> imported into another set of tools.

I'm not sure #2 applies. If they keep their data within a single toolset, there is little advantage in using anything but some proprietary format, it would seem. The only reason would be if they want to be ready for #3.

> In my opinion, the Core could be different for each situation.
> ...

It's an interesting way to look at the core. But I think it quickly boils down to sets of functionalities more than users. For example, some users of case #1 may need to provide comments in the extracted documents, while others may not, simply because their file format has no comment facility. Same for segmentation: some original formats may favor pre-segmented entries (I know of one such case), while others (most) would simply generate unit-level content. So, depending on various factors, the same category of users may need different features.

I think the core can be defined in relation to the implementations: what is the minimal sub-set of XLIFF features that a tool that reads XLIFF, makes modifications to it, and writes it back should support. The more features we can get away with requiring, the better, but we have to remain realistic.

The main, minimal operation a tool is likely to perform on an XLIFF file is to change the translation. That means it should probably be able to do something like the following (a rough sketch of such a unit is shown after the list):
- make the distinction between different segments if the content of a unit is segmented
- read the source, and detect if it should/can be translated
- create the target element if it's not present, or detect the state of the existing translation
- understand inline codes in the content
- update a possible status flag related to the translation
- preserve any construct it does not understand
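Just to make that list concrete, here is a rough, invented sketch in XLIFF 1.2 markup (since 2.0 markup does not exist yet); the ids, the French target, and the x:review extension element are made up purely for the illustration:

  <trans-unit id="42" translate="yes">
    <!-- source containing inline codes the tool has to understand -->
    <source>Click <ph id="1">&lt;b&gt;</ph>Save<ph id="2">&lt;/b&gt;</ph> to continue.</source>
    <!-- target created or updated by the tool, with a status flag to maintain -->
    <target state="translated">Cliquez sur <ph id="1">&lt;b&gt;</ph>Enregistrer<ph id="2">&lt;/b&gt;</ph> pour continuer.</target>
    <!-- a construct from another namespace the tool does not understand,
         which it must preserve when writing the file back -->
    <x:review xmlns:x="urn:example:review" status="pending"/>
  </trans-unit>

Segmented content would add a <seg-source> with <mrk mtype="seg"> markers in 1.2 terms, but the idea is the same: the minimal tool only has to tell the segments apart, fill in or update the target, and leave everything else untouched.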
Among those actions, already, some may or may not be considered core. For example, some tools simply do not deal with inline codes. Does that mean inline codes should not be part of the core? Or does it mean 2.0 should not recognize such a tool as compliant? Personally, I think marked-up formats are so prevalent today that even software-string-oriented tools should be able to handle inline codes, but that is something we would need to specify in the conformance clauses. It may also be different depending on the type of tool: for instance, I don't think we can force producer tools to generate inline codes, but we may want to force consumer ones to understand them.

Another example is the translation status. Should a tool be obligated to update it? If the answer is yes, then such a flag should be part of the core. If not, then it should be outside of the core.

My current thoughts are that if we can get away with a core that includes the features in the list above, we would already be doing well. Today many tools don't even do that. The additional features could then be grouped in logical modules. If we manage to make them small and well defined, tools could implement them step by step.

In some cases it won't be easy to define what the processing expectations for a module should be. Notes/Comments, for example: does supporting a Notes/Comments module mean a tool should be able to do all, or only some, of the following actions (a sketch follows the list)?
- read notes associated with the unit/segment and present them to the user agent
- allow the user agent to edit existing notes (or notes belonging to some categories)
- allow the user agent to create new notes (or notes belonging to some categories)
- allow the user agent to remove existing notes (or notes belonging to some categories)
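Again only as a made-up illustration in 1.2-style markup (the from, annotates, and priority values are invented), the data behind those four actions is little more than a couple of <note> elements on the unit:

  <trans-unit id="7">
    <source>Delete the user?</source>
    <target state="needs-review-translation">Supprimer l'utilisateur ?</target>
    <!-- notes a Notes/Comments module would read, create, edit, or remove -->
    <note from="developer" annotates="source" priority="2">Shown in a confirmation dialog; keep it short.</note>
    <note from="reviewer" annotates="target">Please double-check the wording.</note>
  </trans-unit>

The open question is which of those operations a tool claiming support for such a module would have to implement.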
My guess is that we will probably end up with mandatory and optional expectations.

But I'm getting away from the subject: defining the core.

Cheers,
-ys