Hi Yves,
thanks for this, it is really helpful. I, too, am worried about the CTRM
becoming too complicated. As I said in the last meeting, this is intended as
a strawman that puts forward an array of possible solutions; I am happy to
restrict them along the lines you suggested.
I introduced the simpleItem in the latest draft; it was inspired by
the use cases Chase described in the XLIFF OMOS call last week.
I think the biggest issue the design needs to solve is that CTRM
doesn't have control over the XLIFF Core content. Storing small
changes with a small footprint is therefore always in danger of becoming
useless when the core is modified by an Agent unaware of CTRM.
So I heard two conflicting requirements:
1) Let me store small changes with a small footprint, I don't want to store
the whole segment if I made a tiny text change or similar
2) Don't let things get stored in different ways in CTRM
I am reacting to some specifics of your feedback inline below.
On Monday, I will create another version of the CTRM 2.1 proposal based on
this feedback and the reflections I expressed here and inline below.
Cheers and thanks
dF
Dr. David Filip
===========
OASIS XLIFF OMOS TC Chair
OASIS XLIFF TC Secretary, Editor, Liaison Officer
Spokes Research Fellow
ADAPT Centre
KDEG, Trinity College Dublin
Mobile: +420-777-218-122
On Sun, Oct 2, 2016 at 3:54 AM, Yves <yves@opentag.com> wrote:
> Hi David, all,
>
>
>
> It seems to me the CTR module is becoming rather complicated.
>
> Here are a few things I've noted after a quick look at the latest draft:
>
>
>
>
>
> === Issues ===
>
>
>
> --- Do we really want to allow agents to track the content of individual
> inline codes?
>
> That can be done by tracking the parent's content itself (in a much
> simpler way).
>
>
>
dF: As I said above, I am happy not to track individual inlines, but
tracking structural parents will bring a lot of unwanted redundancy.
>
>
> --- You can have revisions that apply to different elements but be
> inconsistent:
>
>
>
> For example, you can have:
>
>
>
> <revisions appliesTo="target">
>
> ...
>
> </revisions>
>
> <revisions appliesTo="segment">
>
> ...
>
> </revisions>
>
>
>
> With both tracking the same content (since segment is a superset of
> target), but there is no way to safely make sense of their history (e.g.
> datetime is optional so one may not know the order of the changes, the
> currentVersion of each could be contradictory, etc.). It'd be impossible to
> really use across different tools, which makes the existence of a common
> module pointless.
>
> The same issue arises with revisions on specific inline codes along with
> revisions on the source/target content.
>
>
>
> One of the things we wanted to achieve with XLIFF 2.x is avoid having
> different ways to do the same thing. CTR2.1 has many ways to do the same
> thing.
>
>
>
>
>
> --- Currently a <revision> can have more than 1 <item> with the same
> property value. Which means you can have N different changes for the same
> data at the same time.
>
Good catch, happy to put a uniqueness requirement (Constraint) on that.
>
>
>
>
> --- I'm unsure how attributes work in the case of tracking a
> segment/ignorable content.
>
>
>
> For example, you may have a revision of the state attribute of the segment
> s1, but have also an item tracking the segments for the unit where that
> segment s1 is in; and that segment may have a different state. When a tool
> looks for the history of the state values for s1 what does it do? Look just
> at the revisions for appliesTo='target' + property='state' or also take
> into account the state attribute in the item for appliesTo='segment' +
> property='content'?
>
>
>
>
>
> --- Having <originalData> inside <item> seems a bad idea: the content of
> <item> should be the same content as /<target>/<segment>/etc. It
> should probably be at the <revision> level.
>
OK to have them at <revision>.
I thought that the highest possible level of item was the same as unit, but
you're right that <item> is complicated enough w/o original data.
>
>
>
>
> === Thoughts ===
>
>
>
> Some ideas:
>
>
>
> - Make datetime a required attribute. A history without date/time is a lot
> less useful, and datetime is easy to set for any tool.
>
+1 to that
>
>
>
>
> - Get rid of currentVersion: If "the most current version of a revision"
> means "the latest", then a required datetime takes care of this, without
> having to maintain an extra attribute (and a bunch of PRs). Or maybe I'm
> missing the point of this attribute.
>
Fine with me. There is little value in those extra PRs, and the REQUIRED
datetime attribute is more reliable.
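To illustrate why a REQUIRED datetime makes currentVersion redundant, here
is a rough Python sketch that picks the latest <revision> purely by its
datetime. The element and attribute names follow the draft, but the sample
is simplified and not namespaced, and parse_dt and latest_revision are just
illustrative helper names, not anything defined by the module:

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Simplified, non-namespaced sample; real CTR markup lives in the module namespace.
SAMPLE = """
<revisions appliesTo="target">
  <revision author="a" datetime="2016-10-01T12:00:00Z">
    <item property="content">first</item>
  </revision>
  <revision author="b" datetime="2016-10-02T03:54:00Z">
    <item property="content">second</item>
  </revision>
</revisions>
"""

def parse_dt(value):
    # datetime.fromisoformat() before Python 3.11 rejects a trailing "Z",
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def latest_revision(revisions):
    # Assumes every <revision> carries a datetime with a time zone.
    return max(revisions.findall("revision"),
               key=lambda r: parse_dt(r.get("datetime")))

latest = latest_revision(ET.fromstring(SAMPLE))
print(latest.get("author"), latest.get("datetime"))
```

A <revision> without datetime makes this fail outright, which is the
argument for making the attribute REQUIRED rather than maintaining
currentVersion alongside it.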
>
>
>
>
> - Let's not allow to track individual inline codes. I don't think anyone
> has made that requirement. It would also make things complicated for
> interoperability since you would have different ways to track them
> (individually or in content).
>
Again, fine with me. I just want to highlight that the current draft
version is a restriction compared with the 2.0 ctr, where you can track ANY
XLIFF-defined element.
I am happy to take the individual codes out of the enumeration of the
trackable elements.
I just want to make everyone aware that this will require larger
portions of text to be stored as revisions, even for minor changes.
>
>
>
> - Add a constraint saying the property values of <item> must be unique
> within a given <revision>.
>
+1
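For what it's worth, a rough Python sketch of how a validator could check
this constraint. The sample markup is simplified and not namespaced, and
duplicate_item_properties is just an illustrative name, not part of any
existing tool:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, non-namespaced sample that violates the proposed constraint:
# two <item> elements share property="content" within one <revision>.
SAMPLE = """
<revision author="a" datetime="2016-10-02T03:54:00Z">
  <item property="content">old text</item>
  <item property="content">other old text</item>
  <item property="state">translated</item>
</revision>
"""

def duplicate_item_properties(revision):
    """Return the property values used by more than one <item> child."""
    counts = Counter(item.get("property", "")
                     for item in revision.findall("item"))
    return sorted(prop for prop, n in counts.items() if n > 1)

duplicates = duplicate_item_properties(ET.fromstring(SAMPLE))
print(duplicates)
```

An empty list would mean the <revision> satisfies the constraint; anything
else names the property values that were recorded more than once.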
>
>
>
>
> - It seems we are having too many ways to track the same thing. And/or we
> try to do too much.
>
>
>
> The only obstacle, as far as I know, that prevents us from using inline in
> <item> is that we say <em/> must have its corresponding <sm/>.
>
> But maybe we've been focusing too much (once again) on the XLIFF markup.
>
>
>
> The more general issue with content in CTR entries is that we take it out
> of context (from a markup viewpoint). But the actual data after parsing is
> what we really need to store. So, if we have this:
>
>
>
> <target>...data text</target>
>
>
>
> We can store it like this:
>
>
>
> <item property='content'><mrk id='m1' translate='no'>...data</mrk>
> text</item>
>
I don't think this is a good idea. The stuff between the start of <target>
and the unhandled isolated <em/> is actually text too, not untranslatable
data. Remember that we store data in <originalData> and we don't allow
mixing it with inline content.
>
>
> The other aspect of CTR is that I don't think we can expect all the
> constraints the normal unit content has to apply in the CTR elements. We
> will have duplicate ID values, etc.
>
I think you can have unit-like identity constraints on <item>, and probably
within each <revision> element. As discussed before, the module uniqueness
scope is separate from the core <unit> uniqueness scope. Not sure which of
the above you mean?
>
>
>
>
> - As far as the different types of <item> content: I'm not sure we need
> all the possibilities the draft currently has. Do we have requirements for
> all of them?
>
I think that there are only two clear requirements:
tracking simple target revisions
and tracking note changes.
I am sure there would also be value in tracking segmentation changes.
The Ocelot requirements seem to be informed by the XLIFF 1.2 usage of
<alt-trans> for <trans-unit><target> changes.
In 1.2, <alt-trans> has two allowed data types: the full <trans-unit>
and <target> only.
Since <unit> is our logical unit in 2.0, and its data model is more
complicated than <trans-unit> because it handles segmentation too, I think
we don't really have an option.
I tried to cater for simple needs with the simplified <simpleItem>, which
would basically allow only one of the 4 types of <item> content.
>
>
>
>
> Cheers,
>
> -yves
>
>
>
>
>