CTI STIX Subcommittee

RE: [cti-users] MTI Binding

    Posted 10-01-2015 15:43
Resent to the CTI-STIX list as I was not on that list.

From: Cory Casanave
Sent: Thursday, October 01, 2015 11:08 AM
To: 'Wunder, John A.'
Cc: Jordan, Bret; cti-stix@lists.oasis-open.org; cti-users@lists.oasis-open.org
Subject: RE: [cti-users] MTI Binding

John,

With respect to RDF in JSON, logical data models and other options, I will respond here but also look at updating the wiki. Sorry in advance for the long message, but I think it is an important point.

JSON has come from an environment of “server applications” supplying data to their “client applications”, where the client applications tended to be coupled and implemented in JavaScript. The use has, of course, broadened, but that is the foundation and what it is very good at. What makes it “easy” is:

· There is a well-defined relationship between the client and server applications, usually under control of the same entity.
· The server application is primarily in control of what the user will see through the client and how they interact.
· There is a “dominant decomposition” of the data because it is serving a specific, restricted set of use cases that the data structure and applications are tuned for. A strict data hierarchy works just fine. (Look up “dominant decomposition”; there is a lot of good information on the topic.)
· Data is coming from a single source and can be “bundled” for the next step in the application's workflow. There is not much need to reference data from other sources or across interactions.
· The semantics and restrictions of the data are understood within the small team(s) implementing this “client server” relationship; fancy schema or semantic representations are not needed.
· The data source is the complete authority, at least for the client application.
· Things don’t change much, and when they do it is under a controlled revision on “both ends”.
· The application technology is tuned to the data structure: JavaScript directly reflects JSON.

A good example may be the “weather channel” application on your phone and web browser. It is all managed by the weather channel developers (and perhaps their partners) for users (a specific stakeholder group) to get weather information (specific information for a purpose) for a region (the dominant decomposition). I don’t know if they use JSON, but it would be a natural choice. This set of clients is served by servers designed for the above purpose.

RDF and the “semantic web” stack have been designed with a very different set of assumptions:

· Data providers and data consumers are independent and from different organizations, countries and communities.
· Data providers and data consumers are independently managed.
· Data providers have no idea what data consumers will use the data for; the consumer is more in control of what they consume and how they use it.
· There are numerous use cases, purposes and viewpoints being served; there is no dominant decomposition.
· Data may come from multiple sources, and the consumer may follow links to get more information, perhaps from the same or different sources. No static, fixed “bundles” are practical.
· Due to the distributed community, the data semantics, relations and restrictions must be clearly communicated in a machine-readable form.
· Things change all the time and at different rates.
· No data source is complete; clients may use multiple sources.
· Any number of technology stacks will be used for both data providers and consumers.

An example could be the position and path of all airliners, worldwide.

This difference in design intent results in some specific differences in the technology:

· RDF (and similar structures) are “data graphs”: information points to information without a dominant decomposition.
· JSON is a strict hierarchy, essentially nested name/value pairs.
· RDF has as its core a type system with ways to describe those types.
· JSON has no type system beyond a few primitive value kinds (strings, numbers, booleans, null). There is an assumption that “everyone knows what the tags mean”.
· RDF depends on URIs to reference data; this works within a “document” and across the web. This is where the “Linked data” term comes from (note: linked data may or may not be “open”).
· JSON has no reference system at all; you can invent ways to encode references (local or remote) in strings, but they are ad hoc and tend to be untyped.
· RDF is a data model with multiple syntax representations (XML, JSON, Turtle, etc.).
· JSON is a data syntax.

Here is the rub: Programming any application for a more general, more distributed, less “dominant”, less managed and less coupled environment is going to be harder than coding for the coupled, dominant, managed and technology-tuned one. Changing the syntax is not going to change that. Encoding the RDF model in JSON does allow a simpler syntax (than RDF-XML or, I think, current STIX) and does allow it to be consumed more easily in many clients, but the developer will still have to cope with references, distribution and “creating their viewpoint” in the application rather than having it handed to them. The flexibility has this cost, and the community has to decide if and how to handle it.

As I have suggested earlier, the best case is to make sure the description of your information (as understood by stakeholders) is represented in precise, high-level, machine-readable models that will work with different decompositions and different syntaxes. If this is not the “single source of the truth” for what your data means, you will be stuck in a technology, even if it is RDF.
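
A minimal sketch of the contrast drawn above between a nested JSON hierarchy and a graph of typed, URI-addressed nodes carried in JSON syntax. Everything in it is an assumption made for illustration: the field names (`campaign`, `indicators`, `partOf`), the `example.org` URIs and the helper functions are hypothetical and are not taken from STIX or any published schema. The `@id` and `@type` keys follow the JSON-LD naming convention, but the code only uses plain Python and does no real JSON-LD or RDF processing.

```python
import json

# Hypothetical, simplified records; names and URIs are illustrative only.

# 1. Plain nested JSON: one fixed hierarchy (campaign -> indicators).
nested = {
    "campaign": "example-campaign",
    "indicators": [
        {"name": "bad-domain", "value": "evil.example.com"},
        {"name": "bad-ip", "value": "192.0.2.10"},
    ],
}

# Reading it means walking the hierarchy from the top down.
nested_values = [item["value"] for item in nested["indicators"]]

# 2. The same facts as a flat graph: every node has a URI ("@id") and a
# type ("@type"), and links between nodes are explicit references.
graph = [
    {"@id": "http://example.org/campaign/1", "@type": "Campaign",
     "name": "example-campaign"},
    {"@id": "http://example.org/indicator/1", "@type": "Indicator",
     "value": "evil.example.com",
     "partOf": {"@id": "http://example.org/campaign/1"}},
    {"@id": "http://example.org/indicator/2", "@type": "Indicator",
     "value": "192.0.2.10",
     "partOf": {"@id": "http://example.org/campaign/1"}},
]


def to_triples(nodes):
    """Flatten nodes into (subject, predicate, object) triples."""
    triples = []
    for node in nodes:
        subject = node["@id"]
        for key, value in node.items():
            if key == "@id":
                continue
            # A dict holding "@id" is a reference to another node.
            obj = value["@id"] if isinstance(value, dict) else value
            triples.append((subject, key, obj))
    return triples


def subjects_linked_to(triples, predicate, target):
    """Return the subjects that point at `target` through `predicate`."""
    return [s for s, p, o in triples if p == predicate and o == target]


triples = to_triples(graph)
indicators = subjects_linked_to(
    triples, "partOf", "http://example.org/campaign/1"
)
print(json.dumps(indicators, indent=2))
```

In the nested form the consumer gets the producer's viewpoint for free. In the graph form the consumer has to resolve references and assemble its own view, here "all indicators that belong to a campaign", which is the extra work described above.
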

If there is going to be one “required” syntax, it had best be one that can reflect this general model well, serve diverse communities, support different technology stacks and be friendly to differing decompositions (no dominant decomposition). Of course, it then has to be as easy to understand and implement as is possible under these constraints.

Where such general structures are encoded in XML, it becomes complex. This is a combination of the need for generality and the limits of XML Schema. But don’t blame XML for complexity that is inherent in the generality of CTI. The same complaint is leveled at other general XML formats, like NIEM.

RDF in JSON syntax provides the type system and the reference system, and allows for a structured composition but does not require it; it is more friendly to this general structure than XML Schema. This seems like a good option. It would be a very good option if generated from a high-level model that would serve to bind all the technologies.

Regards,
Cory Casanave

From: Wunder, John A. [mailto:jwunder@mitre.org]
Sent: Thursday, October 01, 2015 9:18 AM
To: Cory Casanave
Cc: Jordan, Bret; cti-stix@lists.oasis-open.org; cti-users@lists.oasis-open.org
Subject: Re: [cti-users] MTI Binding

Can you elaborate a little, Cory? What are the advantages of RDF in JSON vs. either native JSON, native XML, or RDF in XML? What are the disadvantages?

If you could fill it out on the wiki that would be awesome, but if not then e-mail is fine too.

John

https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

On Sep 30, 2015, at 8:20 PM, Cory Casanave <cory-c@modeldriven.com> wrote:

What about RDF in JSON? This then has a well-defined schema.

From: cti-users@lists.oasis-open.org [mailto:cti-users@lists.oasis-open.org] On Behalf Of Jordan, Bret
Sent: Wednesday, September 30, 2015 6:56 PM
To: cti-stix@lists.oasis-open.org; cti-users@lists.oasis-open.org
Subject: [cti-users] MTI Binding

From the comments so far on the github wiki [1], the consensus right now from the community is for JSON to be used as the MTI (mandatory to implement) binding for STIX. For those that agree or disagree or have a different opinion, please update at least the final Conclusions section with your opinion.

[1] https://github.com/STIXProject/schemas/wiki/MTI-Format-Analysis

Thanks,

Bret

Bret Jordan CISSP
Director of Security Architecture and Standards
Office of the CTO
Blue Coat Systems
PGP Fingerprint: 63B4 FC53 680A 6B7D 1447  F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that can not be unscrambled is an egg."