UBL TC Face to Face

When:  Apr 30, 2003
Associated with  UBL Naming and Design Rules SC
Description:

==========
Agenda:

==========
Minutes:

AM Discussion
1. Discussion of containership
2. Discussion of the ATG Guiding Principles for Naming and Design Rules
3. Naming of Business Documents

1. Containership

Wednesday Morning 4/30/03 Minutes

We started with an overview of the containership issue from Arofan. There are two different perspectives on this; we start with Arofan's. Arofan would like to see us use containers to help keep the documents more organized: add a Header and Body, and then use the ListOf containers.

Discussion Points:
GS: I would like to recommend using Header and Body instead of LineItems. We would be having more than one body.
EG: How could you have more than one body?
AG: Here are the rules I am proposing: every document has a header. The rest of the information goes into a Body section plus Lists.
EG: This doesn't seem consistent with what you had proposed before. Have the metadata at the top within the document, then you have a body or data section.
AG: I am saying that there is the header and the body; the body is really anything that has a cardinality of 1..n, contained within a List element. This would start with the OrderLine, for example, and that would essentially be your Body.
HS: Please let's not call it Header; there are too many projects with that name and it would be confusing.
EG: It is hard to generalize because you don't know what kind of documents you will be dealing with.
AG: Do people worry about the number of levels within these containers? Should this be a consideration?
EG: I think that we should be able to deal with these, that tools should be able to handle it.
GS: We have 7 levels currently in the Order document. For example, take PaymentMeans: you could possibly need 4 or 10 of these. We could restrict the containership levels for some of these. We could restrict the number you need in a 0..n to maybe 10.
GS: Could we add the rule: put lists only around mandatory things like 1..n, not 0..n?
EG: The rule would be that you have the container when you have more than one, not the other way around; the rule applies to the schema, not the instance.
AG: PaymentMeansList, that is 0..1,
GS: I agree we need the containers, but I would like to show an example. 1..n and 0..n are the places we need to look at. Repeating things that are not qualified.
AG: How do we come up with a rule to show that?
GS: Things that are repeated a lot will be easy to put together.
AG: We have a current list of 69 items that have a 0..n cardinality; none of these seem to need a list container.

Proposal:
1. All documents shall have a container for metadata, which precedes the body of the document and is named "Head" _____________. (anything but header)
2. All elements with a cardinality of 1..n (and lacking a qualifying structure) must be contained by a list container named "(name of repeating element)List", which has a cardinality of 1..1.

GS: I feel the qualifier is important, and using that as a way to tell whether or not they have or need a
AG: We are going to have to explain what goes into the head at some point.
EG: I don't think we need to explain that yet. Once we agree on this, then we can spend years going through that.
AG: I have one last question about the naming: do we call it OrderHead or Head? I may have two things with different context within the same namespace.
EG: Consensus seems to be Head, not OrderHead. Define it generically enough.
Group agrees that we don't have to qualify the name? No one disagreed.
Reference EMail: containership.

2. Discussion of the ATG Guiding Principles for Naming and Design Rules

Processing Instructions
EG: I better make sure I am not using SOAP, but PIs are always used with agreement anyway. The schemas we are constructing do not use PIs. Why would we want to tell anyone not to use them?
MC: These are rules for LCSC to follow.
AG: Let's say UBL will not use PIs.
EG: I don't have a problem with that. But what worries me is if these get interpreted by users.
MC: What about "Core UBL schemas will not use PIs"?
EG: That is fine.
AG: Are we talking about schemas or business instances?
EG: For schemas it does not make any sense.
MC: We don't want to control what people do with their instances.

Mixed content
LS: None of our structure uses it.
EG: By agreeing to have XHTML in the documentation we have agreed to use mixed content in that area.
AG: This does not refer to the documentation, it refers to the business context.
EG: I disagree with you. "Mixed content should not be used" is what we agreed in the NDR Document v.23.

Attribute Groups
GS: We have not defined any yet.

ID/IDREF
AG: The content of business documents is referenced in other business documents; using KEY/KEYREFs is different.

XSD prefix
AG: It must be used to refer to the XSD namespace.
EG: The namespace specification is clear that you can use any namespace you want.

Decisions Made:
Using this document we made our own agreements at the London UBL Face to Face.

ATG2 - Agreed Upon Naming and Design Rules
UBL decisions are marked at the end of every point.
a) Processing Instructions MUST NOT be used - CORE UBL SCHEMA MUST NOT CONTAIN PROCESSING INSTRUCTIONS.
b) The Nillability attribute MUST NOT be used - AGREED
c) Wildcards MUST NOT be used - AGREED
d) Two schemas shall be developed for each standard. One schema shall be a run-time schema devoid of documentation. One schema shall be a fully annotated schema that employs XHTML for the annotations. - AGREED
e) Mixed Content MUST NOT be used - AGREED, excluding documentation. We have agreed on Should Not instead of Must Not.
f) Built-in Simple Types SHOULD be used where possible - NOT APPLICABLE
g) Simple Type restriction MAY be used wherever possible - AGREED (LEAVE OUT "WHEREVER POSSIBLE").
h) Union technique MAY be used to merge datatypes - NOT APPLICABLE, Should Not be used. (Codelists are excluded).
i) Complex Types MUST be named - AGREED
j) The absence of a construct or data MUST NOT carry meaning - AGREED
k) Substitution groups MUST NOT be used - AGREED
l) Attribute Groups MAY be used - AGREED
m) ID/IDREF MUST NOT be used - AGREED
n) The XSD prefix MUST be used. xmlns:xsd=http://www.w3.org/2001/XMLSchema - AGREED, The prefix "xsd" MUST be used when referring to the XSD namespace.
o) The XSI prefix SHALL be used where appropriate - AGREED, SEE WORDING PER N ABOVE.
p) Abstract Complex Types MAY be used. - AGREED FOR UR SCHEMA.
q) (not finalized) Complex Type extension SHOULD be used where appropriate -
r) (not finalized) The 'final' attribute shall be used to control
s) (not finalized) The 'block' attribute shall be used to control
t) Complex Type restriction SHOULD be used
u) The 'final' attribute SHALL be used to control
v) The 'block' attribute SHALL be used to control
w) Key/KeyRef MAY be used - AGREED, not finalised for our purposes. We still need to work through and define rules.
x) Notations MUST NOT be used - AGREED
y) UpperCamelCase (UCC) MUST be used for naming elements and types - AGREED
z) lowerCamelCase (lCC) MUST be used for naming attributes - AGREED

3. Naming of Business Documents - Discussion led by Gunther

We have a lot of different ways of naming business documents. What kind of terms must be placed in those names? Consistent naming is needed for exchanging documents between trading partners. We need 3 terms. RosettaNet uses 3 terms for the definition of the business documents.
MC: Is the way we name our documents dependent on our decision on versioning namespaces?
AG: It should not be.
MC: If we don't version our namespaces then we have to version our document namespace.
MC: Should we also version our document namespace?
EG: There is a proposal that chairs of OASIS committees prepared for file naming in OASIS.
MC: Do we need to look at that?
EG: Probably. It is fairly reasonable.
MHC: Let's look at that.
SP: This came up in LCSC as a question too.
GS: Should we use the tripartite structure, substantives, verbs, and in which order?
SP: There is a UN codelist that gives us a list of document types that we would like to be able to link to.
EG: Let's look at "Proposed Rules for OASIS document File Naming", OASIS Working Draft 02, 18 February 2003.
SP: Some of the aspects of these will be applicable, but the semantics of the names is another aspect as well.
EG: There are some general principles which we don't have a problem with.

PM Discussions

Local vs Global
MC: We should revisit our last decision and our grading of pros and cons. ATG did their pros and cons of Garden of Eden vs Venetian Blind.
EM: The costs in terms of Java binding of local vs global. Basically, Joe Fialli, spec lead of the JAXB spec for Java data bindings, was very careful not to prefer any one schema design over another. His team did experiments with SAML and UBL. For any schemas that have global elements, because these have a lot more power than local ones, it does generate more code. A global element could potentially serve as a root element for a schema, serve as the head of a substitution group, or be content you could fill into wildcard content somewhere else. One of our use cases was for tweaking existing content models. Also it is useful to reference a whole element, not just a type, for building new content elements. The extra code developed would be a trivial Java interface. A global element can be referenced from a foreign content model, which is interesting for our use case of building new document types out of existing unchanged UBL parts (not for our other use case of tweaking content models). This is in addition to your list of other global element features above. The problem is that you could not reuse the same element name. If you have done your design to resolve clashes then the generated code does not care. It is not disturbed by the fact that you artificially ensured there were no name clashes.
The cost is that you now have an interface for that element. What if we know there are elements we don't want to reuse? He said, "I could imagine a configuration that would not generate interfaces for those elements." Eve said you would not actually want to omit the interface. In other words, if we really want these elements to be reusable, the interface would still want to be generated, so we don't save anything with this idea.
AG: If you are generating the code from scratch there is not a lot of cost.
MG: In support of Gunther's paper. It seems to me we are trimming our fingernails with a chainsaw. I don't know why everything is global regardless of context and location. More reasonable would be to have global types and, if you need a global element, invent a separate schema and import it.
EM: That would make for anomalous results.
MG: It does not mean they are not reusable.
GS: My biggest issue is that you define a global element for every component. If you do a direct binding to any programming language, you have to define a variable for every tag name. This is not very maintainable.
EM: If the code maintenance problems are the issues, I think we really need to go through my comments.
EM: Let's go through my comments on Gunther's paper of March 12 2003. What about people who have a problem with things that are not in the instance? Non-type-aware processing is problematic for them.

1. The assertion that it would be hard to find out which child elements must get the object class qualifier
EM: It is a modelling question, as modelling work will distinguish them as necessary. It is true that you have to do "cynical" modelling. There is a responsibility to resolve clashes at the modelling stage. It would be more efficient if we did not have to do that. Is that cost too high?
GS: I agree with you. If you have many different business requirements you don't have the normal form. The Address within BuyerParty differs from the Address in SellerParty.
How do you name the Address of the SellerParty? Should we name it SellerPartyAddress?
EM: Good question. You have to look at the context in which you are defining the thing. The modelling process has not filled in the context drivers very successfully to date. Chances are they are two instances of some generic thing that does not yet have an expression in the library.
MG: I agree with Gunther's conclusion but it is not the strongest argument. On the one hand we don't want to go through a rigmarole to decide what is local and global. But if we are deciding when to truncate names, then we are doing something similar in the modelling stage. So we are not reducing the complexity or the chance of error.
EM: The reason why I am really not happy to decide on an element-by-element basis is that the inconsistency you would present to a developer would be enormous. However, if it was whole bunches of elements, for example all leaf nodes going one way, that would be different.
MG: I believe everything should be local. If we want to provide global elements I would have a separate schema file that has global elements for all the global types.
EM: I am sympathetic to that as we have not been very good at weighing use cases.
MG: The alternative to me is too much burden on the modellers.
GS: It is not a good idea to decide on an element-by-element basis. We are using the CCTS as the basis. It is easy to extract the tag names of each local element out of these dictionary entry names.
EM: There is a deterministic way that we could proceed with local elements. Matt's suggestion that you have parallel global elements is not practicable timewise.
MG: Why not just qualify everything in the global element names?

2. Long tag names
EM: The model should remove redundancies. It does not make things so artificially different, e.g. PartyName.
GS: The problem is we are removing two names, PartyName (has rep term Details) and ShippingContactName (has rep term Text).
EM: What do we want to do when things are structurally different but semantically the same? That is really what we are trying to address. I can see that local names work in those cases.
GS: Take Address. We globally declare Address. Another modeller extending the library sees that this Address cannot be used in the aggregation of BuyerParty because the structure is different. So he makes PartyAddress. Another modeller will need a SupplierPartyAddress and will generate that. If you define a different structure based on Address you will need a new name, and lots of exceptional rules will be required to define this name. The only way is to use the dictionary entry name. These are very long and will have redundancy.
EM: I don't buy that argument. To create different elements for different structures does not bother me. The names of the elements might occasionally be longer. A tiny bit of adjustment might be needed in the binding, but this is because we have not been modelling in a very tight way. I don't think type-aware processing is hurt by global elements, because they are bound to the same types they would have been anyway. Type-aware processing is not going by element names.
EM: This is a straw man on top of a straw man.
GS: I have looked at some SAP examples where we need more qualifiers to identify specific structures.
EM: This is true for some names but others won't need that level of qualification.
GS: Let's look at Identifier.
EM: It is possible that if we compromise we can resolve this for leaf nodes. Types won't get that long if they are distinguished structurally.
GS: You will have the same long names if you use aggregations. Address is a huge aggregation with lots of differences.
EG: That is what we have the extension mechanism for.
EM: If you add in industries and business context variations there is an infinite number of possibilities. Let's solve the problem for the stock UBL library first.
EG: What I was saying is that for a whole industry, that is what Context Methodology applies to. It is a red herring to say that tag names grow with context. The whole point of Context Methodology is to do this correctly.
GS: I agree with Eduardo that CM has a role; therefore it makes sense to use globally declared elements but without qualifiers, because qualifiers are based on context, which makes tag names very long.
EM: Make it local whenever you have a qualifier.
EG: There is another issue of role context within the tag names. How does that play with Gunther's proposal?
GS: Remove role from the tag name.
EG: It is fairly late to do that at this point.
GS: It is the only possibility to get consistency.
EM: It is intriguing. The modelling is not very context-focussed right now. It would be a good exercise to try to apply these things as explicit contexts. I am dubious that it is right to draw a global/local line based on context or lack of it. Qualifiers are not exclusively used for context differences; look at HouseNumber. You could just call it Number, but for human understandability you want to qualify it.
GS: HouseNumber is a property term.
EM: OK, I can buy that. We should define qualifiers based on context.
MG: How do you solve name clashes?
GS: One possibility would be to define, for each context driver, an attribute, for example a global attribute group, and put the contexts into the attribute group.
EM: I thought you were talking about making qualified elements local? The rest would be global. That would be a cheap way of generating global elements out of our system.
MC: I am looking at the spreadsheet. If we strip off the qualifiers we would kill reuse. Are you suggesting that we strip off the object and property qualifiers?
GS: I am saying most of the elements in the spreadsheet are completely wrong. Look at StartDateTime on L629 (OP7 reusable) in the spreadsheet. Start is not a qualifier, it is a property term.
The qualifier is defined by the context drivers.
AG: This is coming to the fore in LCSC. I agree you could do what Gunther proposes without breaking anything.
GS: I have seen with BBIEs that I can remove 80% of qualifiers because they are part of the property term. StartTime is a complete property term. That means you put Start in front of DateTime.
AG: You shift the qualifiers into the property term so you don't lose them.
GS: ValidityPeriod is another example.
AG: If we do this we would get shorter tag names.
GS: We need to look at how we can handle everything. We have a lot of property qualifiers that are really specific, e.g. Shipping, Settlement, Seller. OrderLineMinimumQuantity becomes LineMinimumQuantity and this can become MinimumQuantity.
EG: This should make global more palatable.
GS: How do we handle all the qualifiers?
NW: We would use the property qualifier as an association to link two aggregates. The qualifier is part of the property term.

3. Redundancy
EM: The dictionary entry name is a full name. Are we considering giving this full name to any element? No, absolutely not. Our truncation rules will result, in most cases, in names that won't be the dictionary entry name.
GS: If you have a different structure for the same name, you need rules to define these names, and the only way to do this is the dictionary entry name.
EM: You are talking about a naming design process. Bill Burcham demonstrated that there is not a great deal of overlap.
AG: The whole point of the exercise is to agree on structures and assign them meanings. If there is a structural difference then it should be semantically different.
GS: I agree it would be a different type, but you don't have to have a different name.
EG: Given two types A and B that are completely different, you could still have two local elements, both called Address, one of type A, the other of type B. If the differences between types A and B are small that would be OK; what if the differences are enormous? When is the difference enough to use different names?
GS: It is easier to change types than attributes in classes for implementation.
EG: In that case, in every definition of the elements that make up an element we would always have elements called A, B and C bound to different types because it is easier.
AG: I am not sure you have won that point with local elements. Developers will implement what their customers want.
GS: They will implement what is easiest for them.
AG: Make the link between a Venetian Blind approach and your argument.
MG: I am reconsidering my stand.
EG: I would agree to a compromise. Most elements should perhaps be global, if there was a good criterion to indicate which would be which.
MC: A step further would be criteria to ensure uniformity and consistency of the local elements. The biggest problem, within certain circumstances, is the ability to harmonize across a large number of business areas using local elements.
EM: You can build it, but in an XML library types are nowhere to be seen. Type information will not be in the instances. You would have to require a lot of system-specific stuff.
MG: As a compromise, what I suggest is to reference the types inside local elements and define global elements separately. But don't reference the global types within the content models.
EG: It is like throwing a scrap.
EM: I actually agree it is counterintuitive to have everything global. The reason I am tempted to do it is because of the second use case, and we have not substantiated it very well. The whole point of the library is to be able to do that.
GS: We can do it 2 ways. One way is to say the approach does not matter. The instance must be the same.
One group uses local and the other global. Another way is to use a mixture of global and local. All aggregations will be globally declared and all Basic Information Entities would be local. Frank Van Damme told me that the BBIEs are not reusable and therefore must be local.
MC: I want to respond to what Gunther said. The point about BBIE reuse in CCTS raised by Frank was not completely agreed by CCTS.
GS: I am also saying the basis of every tag and type name will be the dictionary entry name. In our NDR specification we have the rules on how to generate tag names from dictionary entry names.
EM: I could certainly support a global aggregate and local basic approach. The cost is disambiguation.

3. Inconsistency in names
EM: A legitimate concern. If we are looking for pure consistency we are going too far. If there is semantic or structural variation then perhaps it is right to have naming variation. Your literal XML instance should express differences that exist. We are really just arguing about the degree of difference.
GS: How can you define the tag names? How do we define address for another aggregation? Do we define BuyerAddress or Address?
EM: You should have a BuyerAddressType that is derived from AddressType.
MC: LCSC has gone too far in their application of qualifier terms and should revisit the use of them. They should build a controlled vocabulary of qualifier terms. I am not in agreement that we should drop property qualifiers from the tag name. I would agree not to include the qualifiers in the object class.
AG: In this discussion, Mark, what we do internally is one thing; you can't know in advance what people who use it will do.
EM: I think this is good, i.e. to rationalize our use of qualifiers to help with our local vs global decision.
EG: I think a compromise is better than the extremes.
MG: I would summarize the 3 problems with global elements as follows:
1. Tag names are longer and more unwieldy as compared to local elements.
2. It seems conceptually not to be correct.
3. It imposes additional burden on automatic processors if tag names are created just for reasons of consistency but aren't used. This can create a lot of unnecessary overhead.
My solution: in content models of all elements, use local elements that reference a type and lop off all qualifiers, and then have global elements defined for all of those types in a separate file, using the fully qualified name with all the qualifiers.
EG: It would create instances that are totally different from each other; the names of the tags would be different depending on whether you use local or global. Validation would be difficult.
MC: Everything I have heard so far does not give me a technical reason why our global decision is broken. There are lots of subjective ones. It is clear that there is consensus that some compromise is the way to go. If LCSC revisits the qualifier issue, a lot of the problems with long tag names go away. That, in conjunction with the truncation rules, i.e. not including object class qualifiers in all cases, will significantly reduce the tag names. It seems to me that there are a limited number of circumstances that lend themselves to use as local elements in a way that can ensure consistency, e.g. code and ID. I could support that sort of a modification.
EM: I am somewhat sympathetic to Mark's call; however, I have come to believe that the use cases we have been pointing out, i.e. non-type-aware instances, are perhaps not as compelling. We have been talking about XPaths being clean with local elements. You have to start from the root for local names to get clean XPaths. This could be a really long XPath. If implementers like Gunther tell me that there are implementation problems with global then I have to appreciate that.
GS: The biggest con of global elements is the maintenance.
MG: In response to Mark: as far as the lack of a killer objection goes, there are some that have been mooted.
Super long tag names could be a complete showstopper, but that is not a fair criterion. Eve made a good point: Gunther and I both programme this stuff and that needs to be taken into account. Regarding her point, do we need global elements at all? I think we should get rid of that grab bag, and the grab bag of UBL types should suffice.
EG: You mean the types should be used for reuse.
MG: Yes.
MC: I disagree. If you are trying to build a reusable library stored in a registry, you can't store local elements in a registry and have them be semantically clear. Trying to search a registry is not possible if you go local. The tag name size argument is disingenuous. I cannot enforce harmonization. If I declare a type in one schema and another one in another schema with slightly different names and they have the same semantic value, I can't tell you that you can't use one type because the other one exists. You would get a proliferation of types.
AG: One of the most common things people will do with UBL is to contextualize it, and they will need a library of types with standard names. I think global elements are vital for extension and customization.
EM: I don't think global elements help you with this. Global types are what you need for extension and restriction.
AG: A lot of what I do is reassembly and we need global elements. Most XML developers are working from instances and they could become confused. We need global elements and I have changed my mind on local. I agree with Mark.
SK: In terms of the length of tags I agree with Mark.
GS: As I see it, the biggest advantage of XML is the ease of implementation. You don't harmonize just elements but also the structure. You find the structure in the types only. It is enough to harmonize types. For direct implementation local is the best.
NW: Our use is to pick up from the UBL library and assemble into our own messages. We want to be able to reuse the core library.
We certainly wish to remove most of the use of qualifiers and this is acceptable to NDR. The only time we would use a qualifier is when it is a context driver.
MG: I definitely feel this was a helpful discussion. The fundamental point of contention is whether people will understand the distinction between types and elements. It is possible to harmonize with local elements and global types. People are not used to doing it and it could lead to confusion. We are not ready to make a decision and need to work through some more examples.
JB: My chief concerns are getting this thing done and adopted. First of all, readability is important and it is not going to be read just by experts. For adoption you have to design for a much lower level of expertise than what people are assuming; just look at HTML. For a lot of people reuse means copy and paste. I don't know what those ramifications are. The single most important factor in adoption is training. From the standpoint of time, we are out of time. We won't debate this longer than it takes to get this decided procedurally. From the standpoint of getting the job done, let's make the smallest adoptable change to what we have already.
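
Illustrative sketch (not part of the recorded discussion): the XPath point raised under Local vs Global, that local element names only become unambiguous when the path starts from the root, can be shown with a few lines of Python using the standard library. The element names Order, BuyerParty, SellerParty and Address are taken from the discussion; the child structures (Street, City, PostBox) are invented here purely as an assumption to give the two Address elements different content.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: the two Address elements share a tag name but have
# different (invented) structures, as they could with local element declarations.
INSTANCE = """
<Order>
  <BuyerParty>
    <Address><Street>1 Main St</Street><City>London</City></Address>
  </BuyerParty>
  <SellerParty>
    <Address><PostBox>42</PostBox></Address>
  </SellerParty>
</Order>
"""

root = ET.fromstring(INSTANCE)


def child_tags(element):
    """Return the tag names of an element's direct children."""
    return [child.tag for child in element]


# Selecting by tag name alone matches both structures.
all_addresses = root.findall(".//Address")

# A path that starts at the root element selects exactly one structure each.
buyer_address = root.find("BuyerParty/Address")
seller_address = root.find("SellerParty/Address")

print(len(all_addresses))
print(child_tags(buyer_address))
print(child_tags(seller_address))
```

A processor that is not type-aware sees only the tag names, so in this sketch it has to carry the parent path along to know which Address structure it is holding. In a real UBL document that path could be much longer.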

==========
Attendance: