Jason,
I do not disagree that conceptually this is a large change, but it is a positive one. Considering that making this update will simplify the model and that the cyber observable types and properties will remain
consistent, this is not as large of a change as you are stating. There are almost always code updates when integrating with a new version of a software library, subsystem, or standard.
In my view, the issue is that we need a standard that supports both communities, and currently it only supports one. Such an issue justifies making a change. I agree with you that we need to find a middle-ground
consensus. As I understand it, event-based systems have the following basic high-level requirements:
1) Represent the types of objects and relationships necessary to communicate their information.
2) Reduced data-transfer overhead.
3) Reduced creation and parsing overhead.
4) Capture event data that occurs at a specific time from a specific device.
Event-based organizations like the status quo because they can capture event-based data within observed-data objects in smaller JSON files that are fairly easy to create and parse. They are not dealing with complex
cyber intelligence knowledge. State-based systems have a greater interest in the relationships between data, which allow for correlation and pivoting. This is something they cannot do using the current model. Requirements 2 and 3, while still important to state-based
systems, are not as critical. The time at which things occur also varies more heavily in state-based systems.
We therefore need the ability to optimize the transfer of STIX data in a way that supports both extremes and all of the systems that combine event-based and state-based behavior. Event-based systems focus on optimizing
for size and speed while sacrificing the complexity of the knowledge being represented (through relationships and time data), while state-based systems optimize for knowledge representation while taking a hit on size and speed. There will of course be systems that
have both sets of requirements, or a subset of them.
I believe we can meet the needs of these two audiences by a modified version of option 1. Note, this will still require an update to code, but any compromise will require work by all parties.
Option 1 transitions cyber observables to TLOs, allowing state-based systems to meet their main requirement of capturing more complex cyber threat intelligence, but it hurts requirements 2 and 3 for event-based
systems. Note, if the system requires correlation across cyber observable objects related to observed-data, then option 1 is orders of magnitude faster than option 7 for parsing. Each cyber observable must now include a version, created-date and modified-date,
expanding the size of the JSON and increasing the number of parameters attached to each cyber observable.
The compromise:
If we allow TLOs to inherit (unless overridden) created, modified and version from the bundle, the size and complexity of the JSON is similar to the
complexity of today's cyber observables. Since event-based systems send out data in a consistent version and at a single point in time, the version and timestamps should remain consistent across objects. (Remember, cyber observable objects currently perform
the same inheritance from the observed-data objects they are encapsulated in.) TLOs can also still include version, created and modified within the objects (overriding the properties in the bundle), allowing state-based systems to represent the complexity
they require.
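A minimal sketch of the inheritance rule described above, in Python. It assumes a bundle that carries its own `created`, `modified` and `version` properties, which is the proposal in this email and not part of the published bundle definition. The key names follow the wording above, and the helper name `resolve_inherited` is hypothetical.

```python
import copy

# Properties a top-level object may inherit from its bundle under the
# proposal above. The key names are assumptions taken from the wording of
# this email, not from the published specification.
INHERITABLE = ("created", "modified", "version")


def resolve_inherited(bundle):
    """Return a copy of each object with inherited properties filled in.

    An object that sets a property itself keeps its own value (override);
    otherwise the bundle-level value, if any, is used.
    """
    defaults = {key: bundle[key] for key in INHERITABLE if key in bundle}
    resolved = []
    for obj in bundle.get("objects", []):
        merged = copy.deepcopy(obj)
        for key, value in defaults.items():
            # setdefault only writes when the object lacks the property.
            merged.setdefault(key, value)
        resolved.append(merged)
    return resolved


example_bundle = {
    "type": "bundle",
    "id": "bundle--00000000-0000-4000-8000-000000000001",
    "created": "2018-11-01T12:00:00.000Z",
    "modified": "2018-11-01T12:00:00.000Z",
    "version": "2.1",
    "objects": [
        # Event-based producer: no per-object values, everything inherited.
        {"type": "ipv4-addr",
         "id": "ipv4-addr--00000000-0000-4000-8000-000000000002",
         "value": "198.51.100.7"},
        # State-based producer: overrides "modified" on this object only.
        {"type": "domain-name",
         "id": "domain-name--00000000-0000-4000-8000-000000000003",
         "value": "example.com",
         "modified": "2018-11-02T09:30:00.000Z"},
    ],
}

resolved_objects = resolve_inherited(example_bundle)
```

In this sketch the event-based object stays as small on the wire as an embedded cyber observable is today, while the second object shows the override path that a state-based producer would use.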
This change should be very simple to implement for event-based systems. A skilled developer who writes JSON parsers and writers should be able to implement this update in a couple of days. If we are still worried
about adoption issues, we can write libraries to transition from the current format to the new format. These libraries will be lossy, since not all the relationships in the new format are possible in the current format, but they should be fine for STIX documents
that only have observed-data in them.
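As a rough illustration of what such a transition library might do, the Python sketch below lifts the observables embedded in one current-format observed-data object into separate top-level objects. The target layout (an `object_refs` list on observed-data, an `id` on every observable) is an assumption about the new format, and the function name `lift_observed_data` and the `id_factory` parameter are hypothetical. It covers only the current-to-new direction and only rewrites references held in top-level properties.

```python
import uuid


def lift_observed_data(observed, id_factory=None):
    """Convert one current-format observed-data object to the sketched new form.

    Current format: observables live in observed["objects"], keyed by local
    strings such as "0", and refer to each other through those keys in
    properties ending in "_ref" or "_refs".
    Sketched new format (an assumption): each observable becomes a top-level
    object with its own id, and observed-data lists them in "object_refs".
    """
    if id_factory is None:
        def id_factory(obj_type):
            return f"{obj_type}--{uuid.uuid4()}"

    embedded = observed.get("objects", {})
    # First pass: give every embedded observable a global identifier.
    id_map = {key: id_factory(obj["type"]) for key, obj in embedded.items()}

    lifted = []
    for key, obj in embedded.items():
        new_obj = {"type": obj["type"], "id": id_map[key]}
        for prop, value in obj.items():
            if prop == "type":
                continue
            # Second pass: rewrite local keys to the new identifiers.
            # References nested inside other structures are not handled.
            if prop.endswith("_refs"):
                value = [id_map[ref] for ref in value]
            elif prop.endswith("_ref"):
                value = id_map[value]
            new_obj[prop] = value
        lifted.append(new_obj)

    new_observed = {k: v for k, v in observed.items() if k != "objects"}
    new_observed["object_refs"] = [id_map[key] for key in embedded]
    return [new_observed] + lifted


current = {
    "type": "observed-data",
    "id": "observed-data--00000000-0000-4000-8000-000000000010",
    "created": "2018-10-31T10:00:00.000Z",
    "modified": "2018-10-31T10:00:00.000Z",
    "first_observed": "2018-10-31T10:00:00Z",
    "last_observed": "2018-10-31T10:00:00Z",
    "number_observed": 1,
    "objects": {
        "0": {"type": "domain-name", "value": "example.com",
              "resolves_to_refs": ["1"]},
        "1": {"type": "ipv4-addr", "value": "198.51.100.7"},
    },
}

# Deterministic placeholder ids for the example only; they are not valid
# UUID-based identifiers. Real code would use the default factory.
counter = iter(range(1, 100))
converted = lift_observed_data(
    current, id_factory=lambda t: f"{t}--example-{next(counter)}")
```

A document that contains only observed-data carries over in this direction without dropping anything, since every embedded reference has a direct equivalent. The reverse direction is where a converter would have to discard relationships that the current format cannot hold.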
Please let me know everyone's thoughts. I hope this provides the compromise necessary to move us forward.
-Gary
From: Jason Keirstead <Jason.Keirstead@ca.ibm.com>
Date: Thursday, November 1, 2018 at 8:02 AM
To: Bret Jordan <Bret_Jordan@symantec.com>
Cc: Allan Thomson <athomson@lookingglasscyber.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>, Gary Katz <gary.katz@FireEye.com>, "Kelley, Sarah E." <skelley@mitre.org>
Subject: Re: [cti] Re: [EXT] RE: [cti] Option 1 vs. Option 7 Powerpoint Example
Bret, the change is not as simple as you are describing. The movement of all SCO objects to top-level objects is an extremely significant change, and will cause a large amount
of disruption to anyone who has implemented STIX in their software. It's an entire re-think of how data has to be consumed, processed, and stored.
Statements like "we must fail fast" should not be casually thrown about here. We must be extremely cognizant of the fact that STIX 2.X is not a "paper standard". It is in use in the wild. Millions
upon millions of dollars and years of effort across industry have been expended adding support for it to products. The TC cannot simply pull the rug out from under industry whenever it wants - frankly, it does not have any ability to dictate what
industry does and does not do. All of the vendors who have invested in STIX 2.X are unlikely to want to "fail fast" and throw all that work and money away, in order to support something that does not immediately matter for their current use cases.
One only has to look at what occurred at the W3C in 2004 onwards to see how a standards body can be reduced to insignificance when vendors decide that it is not worth implementing something an SDO
came up with that they do not want to support. This is a very real possibility if something like this occurs at this stage.
I implore the TC to seek to reach middle-ground consensus on this issue and actively explore backwards-compatible options.
-
Jason Keirstead
Lead Architect - IBM Security Connect
www.ibm.com/security "Things may come to those who wait, but only the things left by those who hustle." - Unknown
From: Bret Jordan <Bret_Jordan@symantec.com>
To: Gary Jay Katz <gary.katz@FireEye.com>, "Kelley, Sarah E." <skelley@mitre.org>, Allan Thomson <athomson@lookingglasscyber.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Date: 10/31/2018 11:27 PM
Subject: Re: [cti] Re: [EXT] RE: [cti] Option 1 vs. Option 7 Powerpoint Example
Sent by: <cti@lists.oasis-open.org>
It is also important to note that under Option 1 prime, Observed Data is staying around. It will just have a list of embedded relationships (like the report object) instead of containing the cyber observable objects themselves.
Bret
From: Gary Jay Katz <gary.katz@FireEye.com>
Sent: Wednesday, October 31, 2018 2:07:54 PM
To: Bret Jordan; Kelley, Sarah E.; Allan Thomson; cti@lists.oasis-open.org
Subject: Re: [cti] Re: [EXT] RE: [cti] Option 1 vs. Option 7 Powerpoint Example
Allan,
I would echo both Sarah and Bret's comments. I thought your email did a nice job of laying out and making clear this differentiation in use cases. I believe both state-based and event-based can be represented in Option
1, but as has been noted, it means more data being transferred for event-based scenarios. I do not believe we have seen other solutions beyond those listed by Bret, but I am interested in discussions on how to best meet both use cases.
-Gary
From: <cti@lists.oasis-open.org> on behalf of Bret Jordan <Bret_Jordan@symantec.com>
Date: Wednesday, October 31, 2018 at 3:23 PM
To: "Kelley, Sarah E." <skelley@mitre.org>, Allan Thomson <athomson@lookingglasscyber.com>, Gary Katz <gary.katz@FireEye.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: [cti] Re: [EXT] RE: [cti] Option 1 vs. Option 7 Powerpoint Example
All,
When I look at it, the problem I see / hear from Gary / Jeff / Sean / Sarah is that internal relationships on the observable container do not really work for what people need. Thus having external relationships and all their goodness
is what people need.
You can do that in one of three ways.
a) Make cyber observables top level objects (option 1 prime from previous discussions)
b) Provide some sort of deep referencing inside of Observed Data (people have consistently shot down this idea)
c) Try and pull out the relationships that really need to be external and leave the rest. (A combination of option 7 with some tweaks that John Wunder has brought up)
So options a, b, and c are technically all possible, though option b, where you do deep referencing inside of an Observed Data, is just awful and will probably cause no end of pain.
Bret
From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Kelley, Sarah E. <skelley@mitre.org>
Sent: Wednesday, October 31, 2018 12:04:09 PM
To: Allan Thomson; Gary Jay Katz; cti@lists.oasis-open.org
Subject: [EXT] RE: [cti] Option 1 vs. Option 7 Powerpoint Example
Allan (and all),
I think this is a really profound realization. I have been coming at this with a state-based idea, as in "give me everything you know about X". Having worked in a SOC, I also realize the use cases for event-based data. I,
for one, would be curious about your possible ideas for being able to represent both.
Thanks,
Sarah Kelley
Lead Cybersecurity Engineer, T8B2
Defensive Operations
The MITRE Corporation
703-983-6242
skelley@mitre.org
From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> On Behalf Of Allan Thomson
Sent: Wednesday, October 31, 2018 10:44 AM
To: Gary Jay Katz <gary.katz@FireEye.com>; cti@lists.oasis-open.org
Subject: Re: [cti] Option 1 vs. Option 7 Powerpoint Example
Gary, thanks for sharing.
One of the things that I've realized as part of reviewing the use cases is the differences in how we talk about things.
I've come to the conclusion that we are talking about 2 different aspects of our problem set.
Event-based
Vs
State-based
From my perspective, Option 1 is really representing a state of entities and connectedness between those entities after multiple events have occurred.
Option 7 (current observed-data model) represents discrete individual events that would occur over time.
This would be similar to having a state machine defined (i.e., the resultant intel model) and then individual events (intel events) that cause you to update the state model.
Think of the intel model as the campaigns, actors, email addresses, IPs, etc.
Think of the events as changes to those intel objects (i.e. observed data model).
Conflating the 2 of these is not the solution.
The question is whether we are defining STIX to communicate event-based model or a state-based model.
I think we should consider the possibility that both are valid things to do, and therefore we should consider how to approach using STIX to clearly articulate when we are:
a) Sending discrete events that have been observed at a specific time, and any metadata associated with that event
b) Sending a state model that represents the collective intelligence and associated relationships across that state built up over time
I think if we recognize that both models require something different and factor that into our STIX data model discussion then we might find a way to solve both.
I have some ideas but this email is already too long.
Allan
From: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org> on behalf of Gary Jay Katz <gary.katz@FireEye.com>
Date: Wednesday, October 31, 2018 at 6:20 AM
To: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: [cti] Option 1 vs. Option 7 Powerpoint Example
Thank you to everyone for taking time to discuss Option 1 and Option 7. As usual, Jane did an excellent job capturing the discussion, including screenshots from the presentation. John-Mark requested that I resend the slides
from yesterday's discussion with any updates, which I believe is valuable as it will allow us to continue the discussion over email. As an update, I did include an optional Observed Data object in Option 1. The inclusion of an Observed Data object would
show that the producer directly observed the email with an attachment vs. indirectly having that information (e.g., gathered the information from external reporting).
The purpose of this example is to show a very reasonable use case for a cyber security analyst and discuss how that data can be represented in the STIX standard using either Option 1 or Option 7. I have not created JSON versions
of the example in both Option 1 and Option 7 form. My assumption would be, to Allan's point, that the Option 1 version is more verbose, although only slightly. This does mean that the data size of the document is larger and, to earlier points, in other use
cases this difference can be even larger. This example, though, highlights an even larger issue. Option 7 does not allow some common, useful relationships to be represented within the format. Showing that a file found in an email,
on analysis, beacons to a C2 that resolved to a specific domain is not possible in Option 7. The receiver must infer this information from 3 disjointed objects.
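To make the email, file, C2 and domain chain concrete, here is a small Python sketch of how a receiver could pivot across explicit relationship objects when observables are top-level objects. The identifiers and the `relationship_type` values (`contains`, `beacons-to`, `resolved-to`) are hypothetical placeholders, not vocabulary from the specification, and `build_index` and `pivot` are illustrative helper names.

```python
from collections import defaultdict, deque

# Hypothetical relationship objects for the use case above. The ids and
# relationship_type values are placeholders chosen for illustration.
relationships = [
    {"type": "relationship", "source_ref": "email-message--example-1",
     "relationship_type": "contains", "target_ref": "file--example-1"},
    {"type": "relationship", "source_ref": "file--example-1",
     "relationship_type": "beacons-to", "target_ref": "ipv4-addr--example-1"},
    {"type": "relationship", "source_ref": "ipv4-addr--example-1",
     "relationship_type": "resolved-to", "target_ref": "domain-name--example-1"},
]


def build_index(rels):
    """Index relationships by source_ref for constant-time lookup."""
    index = defaultdict(list)
    for rel in rels:
        index[rel["source_ref"]].append(
            (rel["relationship_type"], rel["target_ref"]))
    return index


def pivot(start_id, index):
    """Walk outgoing relationships breadth-first from start_id.

    Returns (source, relationship_type, target) hops, visiting each
    target at most once so cycles cannot loop forever.
    """
    hops = []
    seen = {start_id}
    frontier = deque([start_id])
    while frontier:
        current = frontier.popleft()
        for rel_type, target in index.get(current, []):
            if target not in seen:
                seen.add(target)
                hops.append((current, rel_type, target))
                frontier.append(target)
    return hops


chain = pivot("email-message--example-1", build_index(relationships))
for source, rel_type, target in chain:
    print(f"{source} --{rel_type}--> {target}")
```

With the relationships stated explicitly, the receiver follows the chain from the email to the domain directly. Without them, it would have to correlate the three separate objects on its own.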
Our greatest risk to adoption is not asking companies and organizations to update their STIX implementations to support Option 1, or the increase in data size for certain use cases. Our greatest risk is losing the trust of the
user base. One day, far in the future (if we do our jobs well), analysts will not even be aware of STIX being used in the background to transfer their data. Today, though, they are paying attention. They will be asked by their leadership to look at the standard
and provide their opinion on how valuable it is to adopt STIX, and analysts will not understand why they can't represent that a file found in an email has a C2 beacon that resolves to a domain (or something similar). The answer "just trust us, the receiver
is going to auto-correlate that information back together" probably won't fly.
Some of these issues were masked by the limited use cases possible in STIX 2.0 and 2.1. As the standard evolves to support Malware, Infrastructure and Incident objects, these issues will become very pronounced. We will continue
to put band-aids on the standard as a result of the deficiency (e.g., see the malware proposal submitted by Jeff Mates and me earlier this year). Option 1 will resolve these deficiencies. Will it take work and effort? Yes, but that work and effort will only
continue to grow the longer we wait.
-Gary
Some Metrics on the two implementations of the use cases:
Option 1:
8 Objects (1 optional) (2 SDOs, 6 SCOs)
5 Embedded refs (3 optional)
6 Relationships (6 SROs)
Option 7:
15 Objects* (6 SDOs, 9 cyber observables)
5 Embedded refs (2 within Malware not shown)
2 Relationships (2 SROs). Note: some relationships in the example cannot be represented in this option.
* Cyber observables are not full objects in this option. They must therefore be embedded in an SDO, but they are lighter objects that take less text to represent.
From: <cti@lists.oasis-open.org> on behalf of Jane Ginn <jg@ctin.us>
Date: Tuesday, October 30, 2018 at 6:10 PM
To: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: [cti] Groups - Weekly Working Call - Notes uploaded
Submitter's message
CTI TC:
Here is the PDF of the notes from the Working Call. I included the figures in this version.
Best regards,
-- Ms. Jane Ginn
Document Name : Weekly Working Call - Notes
Description : Discussed Option 1 and Option 7 for Cyber Observables
Submitter : Ms. Jane Ginn
Group : OASIS Cyber Threat Intelligence (CTI) TC
Folder : Meeting Notes
Date submitted : 2018-10-30 15:10:05
This email and any attachments thereto may contain private, confidential, and/or privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by
others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.