Hey Gary, all, I’d like to follow up on your last statement. While this talk about cyber observable objects as SDOs is a good discussion of the technical aspects, I think that, given where we are with STIX 2.0, Options 1 and 2 are much easier lifts, but I haven’t seen much discussion of how they would work. Given that, and since you preferred Option 2…what properties do you think we’d need to add to Observed Data? A couple of thoughts: Is this a place where Assertion could help? Would you make an assertion that the category of the IP is “infrastructure”, and then use a relationship to an intrusion set to attribute whose infrastructure it is? You mentioned something to indicate the detection mechanism, or how it was detected. On Sighting we have “where_sighted_refs”…maybe that should exist on observed-data too? Right now the implication is that the observed-data was observed by the creator of the STIX object, but maybe that shouldn’t be assumed, so that we can capture things like this? We could also consider a “detection_mechanism” property. Anything else? As we finish off work on Assertion and Grouping, these will be important questions to tackle. Hopefully by then we’ll also have the options modeled in JSON for people to evaluate. John From: <
cti-stix@lists.oasis-open.org> on behalf of "Katz, Gary CTR DC3DCCI" <
Gary.Katz.ctr@dc3.mil> Date: Thursday, November 16, 2017 at 10:28 AM To: Jason Keirstead <
Jason.Keirstead@ca.ibm.com> Cc: "Bret Jordan (CS)" <
Bret_Jordan@symantec.com>, "cti-stix@lists.oasis-open.org" <
cti-stix@lists.oasis-open.org>, JG on CTI-TC <
jg@ctin.us>, Sean Barnum <
sean.barnum@FireEye.com> Subject: RE: [cti-stix] Re: [EXT] [cti-stix] RE: [Non-DoD Source] [cti-stix] Eight Arguments for an Infrastructure SDO for STIX 2.1 Jason, I apologize for skipping over your first question. Terry wrote a decent size response that I have not had the time to grok yet, so I’m deliberately omitting comments relating to it. You obviously have much greater faith in companies not just moving towards a more STIX compliant model and I sincerely hope you are correct. As a counter example though, didn’t Soltra not only directly use the STIX 1.0 model but it was also their main advertising point? For companies that do maintain their own internal models, which may not have the same level of fidelity as STIX, we could potentially see systems ingesting a STIX document with Observed Data and then exporting it with Infrastructure or vice-versa. We are no longer talking about a system losing fidelity in the types of relationships between objects but instead in the types of objects being created. At the end of the day, I would like to reiterate what got this discussion started. No matter which of the 3 or 4 implementations are chosen I strongly believe that we do need to have a way to capture infrastructure within STIX. I still believe making cyber observables TLOs would be the best choice but I also do not want to distract from the overall point that we need to have a way to capture this information. Thanks, -Gary From: Jason Keirstead [mailto:
Jason.Keirstead@ca.ibm.com] Sent: Thursday, November 16, 2017 9:31 AM To: Katz, Gary CTR DC3DCCI <
Gary.Katz.ctr@dc3.mil> Cc: Bret Jordan <
Bret_Jordan@symantec.com>;
cti-stix@lists.oasis-open.org; JG on CTI-TC <
jg@ctin.us>; Sean Barnum <
sean.barnum@FireEye.com> Subject: RE: [cti-stix] Re: [EXT] [cti-stix] RE: [Non-DoD Source] [cti-stix] Eight Arguments for an Infrastructure SDO for STIX 2.1 For #1 - Can you go into more detail on the processing issues? We have not seen this. I am not sure how making an SCO a TLO would reduce processing. #2 / #3 / #4 - These describe problems with a stored model and with building capabilities on top of it. STIX is not a standard for a storage and query model, or for an analyst UI - it is not built for that, at all. STIX is a standard for data interchange. We need to be **extremely careful** here not to take problems that would result from using STIX as a storage model and project them onto a data interchange model. If STIX were meant to be a storage and query model, we should have done a whole host of other things differently... SCOs as TLOs is the least of our problems. If that is the goal, we need to totally reboot the entire standard. RE "most implementers will probably just follow the STIX structure", I *totally* disagree with that. First, most people already have data models, and are looking to use STIX to interchange between these already existing things. Second, even if you were building net-new, no one worth his salt would follow this model as-is... there are a whole host of things we have done that do not make for an efficient design for storage and querying... - Jason Keirstead STSM, Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security "Things may come to those who wait, but only the things left by those who hustle." - Unknown From: "Katz, Gary CTR DC3\DCCI" <
Gary.Katz.ctr@dc3.mil> To: Jason Keirstead <
Jason.Keirstead@ca.ibm.com>, Sean Barnum <
sean.barnum@FireEye.com> Cc: Bret Jordan <
Bret_Jordan@symantec.com>, "cti-stix@lists.oasis-open.org" <
cti-stix@lists.oasis-open.org>, JG on CTI-TC <
jg@ctin.us> Date: 11/15/2017 06:36 PM Subject: RE: [cti-stix] Re: [EXT] [cti-stix] RE: [Non-DoD Source] [cti-stix] Eight Arguments for an Infrastructure SDO for STIX 2.1 Sent by: <
cti-stix@lists.oasis-open.org> Jason, I’ll try to provide a couple of reasons for making cyber observables TLOs, but I’d be interested in others providing their thoughts.
1. Removes the graphs-within-graphs problem: Currently the Cyber Observables can link to each other within an Observed Data object but cannot be directly linked to from outside it. This creates a subgraph within the larger STIX graph and creates unnecessary processing issues.
2. Reduces inefficiencies for analysts: While a system could hide the Observed Data and Infrastructure objects from the user, most implementers will probably just follow the STIX structure. Currently, if an analyst wants to manually state that they observed an IP, they need to create an Observed Data object and then link that object to an IP address. This requires creating two objects rather than one, and when viewing the data they would see two objects instead of one, creating unnecessary clutter.
3. Follows the rest of the model: Infrastructure and Observed Data provide context about the Cyber Observables. Elsewhere within STIX we use relationship information (links) to capture context. By making Cyber Observables TLOs we would follow the same model as the rest of STIX.
4. Analysts are used to it: If you review current cyber intelligence platforms (Maltego, TruSTAR, ThreatConnect, Anomali, etc.), Cyber Observables are shown directly on the graphs. (Note: in general their models are less detailed than STIX, so I am not suggesting we just adopt their level of detail. Second note: my knowledge may be dated, as I have not reviewed these platforms recently.) Analysts are used to working in that paradigm, so why confuse them?
5. Lastly, I do not believe we would have chosen this design if we had not started out with Cybox being separate from STIX and with organizations (me!) pushing for Cybox to remain separate so it could easily be used in other schemas. 
Cybox has been fully integrated as a part of STIX, so this original reason for calling it out separately no longer applies. My 2 cents. -Gary From: Jason Keirstead [ mailto:
Jason.Keirstead@ca.ibm.com ] Sent: Tuesday, November 14, 2017 11:49 AM To: Sean Barnum <
sean.barnum@FireEye.com> Cc: Bret Jordan <
Bret_Jordan@symantec.com>;
cti-stix@lists.oasis-open.org; Katz, Gary CTR DC3DCCI <
Gary.Katz.ctr@dc3.mil>; JG on CTI-TC <
jg@ctin.us> Subject: Re: [cti-stix] Re: [EXT] [cti-stix] RE: [Non-DoD Source] [cti-stix] Eight Arguments for an Infrastructure SDO for STIX 2.1 My opinion is Option #1. I think it is fairly obvious to all that Infrastructure is a needed capability; the only question was "when" - but I do not think that making Cyber Observables TLOs is the way to do that. An observable is something that is static, that is a fact at a point in time, that never changes, and that can be observed - that thing would *point* at Infrastructure, it is not the infrastructure itself. Making Cyber Observables TLOs would not solve the use case properly - most of the requirements for Infrastructure that Jane has below would not be solved by this at all. They would require an actual Infrastructure object. I have yet to see someone present a concrete use case for why we should make cyber observables TLOs. It has been proposed multiple times in the past and debated to death - it is perhaps the second-most debated subject in the TC (after timestamps). I am all for the idea of "fixing what is broken" in STIX, as Bret says, but to me, if we are going to re-open topics that have been extensively debated and then voted on in another direction, there is a significant burden on the proposer to show why they should be re-opened. I don't see that burden being met here. What are the specific modeling problem(s) that would be solved by making cyber observables top level objects, and that cannot be solved in any other fashion? - Jason Keirstead STSM, Product Architect, Security Intelligence, IBM Security Systems
www.ibm.com/security "Things may come to those who wait, but only the things left by those who hustle." - Unknown From: Sean Barnum <
sean.barnum@FireEye.com> To: Bret Jordan <
Bret_Jordan@symantec.com>, "Katz, Gary CTR DC3DCCI" <
Gary.Katz.ctr@dc3.mil> Cc: JG on CTI-TC <
jg@ctin.us>, "cti-stix@lists.oasis-open.org" <
cti-stix@lists.oasis-open.org> Date: 11/14/2017 12:11 PM Subject: Re: [cti-stix] Re: [EXT] [cti-stix] RE: [Non-DoD Source] [cti-stix] Eight Arguments for an Infrastructure SDO for STIX 2.1 Sent by: <
cti-stix@lists.oasis-open.org> Speaking for myself specifically and for FireEye, we very strongly agree with Jane’s overall perspective and with each and every one of the motivating arguments she has taken the time to outline for us. We also very strongly agree with Gary’s perspective that addressing the issue of infrastructure should begin with making Cyber Observables top level objects. Forcing observables to exist only within an Observed Data wrapper artificially constrains them to a single temporal context of observation. As Gary points out, these observables in reality exist and are useful in numerous and varied contexts beyond a single temporal context of observation. By breaking them out into top level objects, they may be expressed in a consistent form without unnecessary duplication (across temporal observations) or semantic gymnastics (for situations like infrastructure that are other than a simple observation), and the specific context for any use of the observable can, as Gary describes, be expressed within the relationships of that observable to other content. We agree with Gary and Bret that this change should be made, and that now is the time to do it rather than later, when the impact will likely be far greater, not only as a breaking change to existing implementations but also due to other ongoing language design decisions that are made based on the current structure. We do diverge from the rest of Gary’s opinions on the options he presented. We would strongly disagree with Option 2, which we view as perpetuating and exacerbating a flawed design; it would require some very confusing and invalid semantic gymnastics to use a temporally constrained observation construct to represent a non-temporally constrained concept like Infrastructure. We also believe that while Option 3 is urgently necessary, it is by itself still not a fully adequate solution for infrastructure. 
It should be combined with Option 1. I think this is what Bret is suggesting in his second paragraph below. Bret, feel free to correct me if I am mistaken. We believe the ideal solution is an independent Infrastructure SDO that can represent an aggregate infrastructure whether simple or complex that can then be associated with appropriate ThreatActors, Campaigns, Malware, IntrusionSets, etc in a clean, clear and unambiguous way to support many of the use cases Jane outlines. This Infrastructure SDO can also be associated with any number of top level Cyber Observable objects to characterize any simple or complex infrastructure. This could include clean, clear and unambiguous relationships between Cyber Observables within this infrastructure. It also enables the characterization of cyber observables that exist within multiple infrastructures (a common occurrence) in a clean, efficient and pivotable manner. We would assert that this approach offers the cleanest model where each concept is distinct and clearly understood, granular and inherently linkable to support effective graph-based analysis and visualization, and straightforward to implement in code without unnecessary convolutions, conflations or complex interdependencies. Sean Barnum Principal Architect FireEye M: 703.473.8262 E:
sean.barnum@fireeye.com From: <
cti-stix@lists.oasis-open.org> on behalf of Bret Jordan <
Bret_Jordan@symantec.com> Date: Monday, November 13, 2017 at 7:26 PM To: "Katz, Gary CTR DC3DCCI" <
Gary.Katz.ctr@dc3.mil> Cc: JG on CTI-TC <
jg@ctin.us>, "cti-stix@lists.oasis-open.org" <
cti-stix@lists.oasis-open.org> Subject: [cti-stix] Re: [EXT] [cti-stix] RE: [Non-DoD Source] [cti-stix] Eight Arguments for an Infrastructure SDO for STIX 2.1 I agree with Gary and Jane on the need for a solution to infrastructure for 2.1. In my opinion I do not think we should ship 2.1 without it. Infrastructure and Malware are the two biggest missing parts that are preventing adoption. I agree with Gary’s other points as well, though I would add a 4th option of make cyber observable TLOs and add a meta data object called infrastructure so you can record higher level information about how the threat actor is using the infrastructure. All of the IPs and such would link to it via real relationships and we could then use the relationship dates to establish validity. Bret Sent from my Commodore 128D PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050 On Nov 14, 2017, at 4:10 AM, Katz, Gary CTR DC3DCCI <
Gary.Katz.ctr@dc3.mil > wrote: Jane, Sorry for taking a long time to respond. Since it seemed you put a considerable amount of time into your email, I wanted to make sure I had time to put some thought into the problem before providing a response. I agree that we need a way to represent malicious infrastructure within 2.1, although I believe there are multiple ways we can achieve this goal. There are three options that I believe are possible (four if you include not doing anything about infrastructure in 2.1, which I do not believe is an option):
Option 1: Create a New Infrastructure Object
Option 2: Modify the Observed Data TLO to also model Infrastructure
Option 3: Make Cyber Observables Top Level Objects
There are pros and cons to each of these, and I’d be interested in hearing everyone’s input. My personal order of preference is Option 3, then Option 2, then Option 1. My personal opinion is that we should not release 2.1 without a way to handle Infrastructure. I think some of you will find it ironic that I would push for Option 3, since I was one of those who fought against making cyber observables top level objects, but the reasons for keeping them separate no longer exist. Cybox has been integrated into STIX, and there is no reason in my mind to keep them non-TLOs any longer. The main disadvantage of making Cyber Observables TLOs is that it is a major breaking change from STIX 2.0. We can debate how big a deal this is, but I’d rather rip the bandaid off now than have this issue long term. From a model standpoint, embedding cyber observables within Observed Data results in graphs within graphs, which are in themselves difficult to deal with for systems receiving STIX data. Making Cyber Observables TLOs would fix this issue and simplify the model. From an analyst standpoint, I spent (parts of) the last several days manually ingesting text documents into our own cyber analytic system. The model we use is a graph and is based closely on the STIX model. 
I modeled a custom malware analysis report, an internal incident/event report, a short external government finished intelligence report on an event, and a longer external government intelligence report on multiple events. I manually ingested a variety of reports to form my own opinions about how I think about the data and would like to model it, what was useful, and what information cluttered the graph. In the end, it did not matter to me whether an object represented an IP address that was seen in a malware object, one that was observed on my network, or infrastructure found through other methods. That context was contained in the linkages. An IP address is an IP address is an IP address, and storing it within an Observed Data or an Infrastructure object did not make much sense to me. The fact that the IP was infrastructure used by the adversary was shown in my link types, not in the object itself. For example, an IP address may relate to adversary infrastructure because one adversary had taken over the system for a time and then, at a later time, another adversary started using it. It was infrastructure for both adversaries at different times, which is shown through the links and the times attached to those links, not through the IP address itself. If we decide not to make Cyber Observables TLOs, my second option would be to extend the Observed Data TLO to meet the Infrastructure needs. This would mean vastly increasing the number of relationship types assigned to Observed Data and adding some extra properties. One could be a Boolean stating whether the Cyber Observables were observed on your network or in some other fashion. 
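To make the Option 2 idea concrete, here is a minimal Python sketch of an Observed Data object carrying such a Boolean. It follows the STIX 2.0 shape, in which cyber observables are embedded under `objects`. The property name `x_observed_on_own_network` and the helper `make_observed_data` are assumptions made up for illustration; nothing in this thread or in the STIX 2.0 specification defines them.

```python
import json
import uuid


def make_observed_data(ip_value, observed_at, on_own_network):
    """Build a STIX 2.0-style Observed Data dict wrapping one IPv4 address."""
    return {
        "type": "observed-data",
        "id": "observed-data--" + str(uuid.uuid4()),
        "created": observed_at,
        "modified": observed_at,
        "first_observed": observed_at,
        "last_observed": observed_at,
        "number_observed": 1,
        # Hypothetical custom property, not part of any STIX release.
        "x_observed_on_own_network": on_own_network,
        # The observable is embedded and keyed locally ("0"), so objects
        # outside this wrapper cannot reference the IP address directly.
        "objects": {"0": {"type": "ipv4-addr", "value": ip_value}},
    }


# The same IP modeled twice: once as seen locally, once as reported infrastructure.
seen_locally = make_observed_data("198.51.100.7", "2017-11-16T10:28:00.000Z", True)
reported_infra = make_observed_data("198.51.100.7", "2017-11-16T10:28:00.000Z", False)
print(json.dumps(reported_infra, indent=2))
```

Note that the two objects carry the same IP address value but have different identifiers, which is the duplication the graphs-within-graphs concern refers to.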
By not creating an Infrastructure object, we avoid having one organization model the data using the Observed Data object and another organization use the Infrastructure object or the worst case scenario where an analyst feels they need to model it as Observed Data and Infrastructure, leading to them being extremely frustrated with us and receiving analysts seeing 4 objects in their system for every IP (one Observed Data object, one embedded IP address object in the OD, one Infrastructure Object and one IP address object in the Infrastructure object (yes some systems could optimize how this data is viewed, but some will not)). I understand that we will be bastardizing the Observed Data object by doing this, but that’s another reason why making Cyber Observables TLOs is a better option in my view. Lastly, I would suggest an Infrastructure object. It would be similar to an Observed Data object but with more relationships associated with it. This would still allow us to complete the initial STIX model and provide the functionality that many users would expect to see within the model. I agree with Bret, Jane and many others on the crucial need to model infrastructure, but I wanted to provide some additional options for how we can meet that need. Apologies for the long email, -Gary From:
cti-stix@lists.oasis-open.org [ mailto:
cti-stix@lists.oasis-open.org ] On Behalf Of JG on CTI-TC Sent: Tuesday, November 7, 2017 7:19 PM To:
cti-stix@lists.oasis-open.org Subject: [Non-DoD Source] [cti-stix] Eight Arguments for an Infrastructure SDO for STIX 2.1 At present the CTI TC does not appear to be of one mind on the need for an Infrastructure SDO for 2.1. After months of debate in Slack and on the email list, after two intense working sessions at both the Bethesda, MD and Austin, TX face-to-face meetings, and after numerous discussions during working meetings and review of the working draft developed by Richard Struse and Bret Jordan, a Straw Man poll at Austin led to an almost even tie on whether or not to include an Infrastructure SDO in 2.1. I'm writing today to outline eight reasons why I believe we should seriously consider including an Infrastructure SDO for the STIX 2.1 release. Note that my view on the topic is as a threat hunter, educator, and analyst; therefore, I'll be relying on insights from the programmers, data architects, and MRTI aficionados to actually make it work. It will make the human-to-machine interface more effective during this period of rapid ecosystem expansion, ISAO/ISAC build-out, market/product definition and trust-building between private sector entities and law enforcement for critical infrastructure protection. 1. An argument has been made that the Indicator SDO could serve as a series of interconnected buckets for a malicious infrastructure, and that specific Cyber Observables could be linked to such Indicator to define a malicious infrastructure with a Boolean property indicating the goodness or badness of a set of interconnected Indicators. I believe this would not be a suitable approach for the following reasons: 1) it would overload the Indicator SDO which already suffers from overuse and misunderstanding; 2) relationships would have to be drawn through the Observed Data SDO to the specific Cyber Observables. 
Given how timestamps are handled, this would add a layer of complexity that we could avoid with carefully designed properties on the Infrastructure SDO. 2. A wide range of SDOs and Cyber Observables will need to be strung together in an interrelated complex of potentially rapidly changing data elements by producers seeking to convey rich detail about observations, sightings, TTPs, malware, network effects, and cyber observables operating as a single unified entity with a single purpose. Once issued by a Producer, sightings of one or more SDOs or cyber observables associated with this multi-headed Hydra will enable other members of a sharing community to quickly assess kill chain phases or other clues on their own networks that may help expedite discovery. And when operating within a truly effective and skilled sharing community, this could also lead to more rapid crowdsourced threat analysis with accompanying remediation recommendations. 3. Foundational literature on tradecraft in cyber threat analysis includes an Infrastructure vertex as part of the analytical toolset. I'm referring here to the Diamond Model (Caltagirone, 2013), which directly juxtaposes a threat actor's capabilities with the infrastructure the actor uses. The origins and utility of the Diamond Model within the analyst community stand on their own merits, regardless of the fact that the STIX2 data model has moved on from this foundational concept. 4. Advanced NoSQL graph database techniques are well suited to visualizing the interconnectedness of a malicious infrastructure, expediting pattern recognition by human analysts seeking to perform higher-level analysis and synthesis of STIX2 data. The power of this type of tooling should not be underestimated as we look towards the future of CTI and sharing communities. Indeed, such notable companies as OpenDNS (acquired by OASIS member Cisco) have used such visualizations to great success. 
Further, the use of data visualization techniques for enabling higher-order pattern recognition as a tool for analysis has been well documented by Tufte (2001), among others. Importantly, we need to build the Infrastructure SDO with sufficient metadata properties to enable these higher-order analytics. For example, it will be important to link back to Threat Actor SDOs within a boxed time-frame to move closer to attribution. 5. The larger global community of network defenders and cyber threat analysts are developing siloed versions of classification and enumeration systems for infrastructure as they are seeing it. However, we do not have a generally agreed upon system as we do have for malware (MAEC), exposures and vulnerabilities (CVE), and attack patterns (ATT&CK). By creating an Infrastructure SDO in STIX 2.1 we might be able to kick-start such a development. 6. One of the key insights gleaned during the Austin, TX face-to-face meeting was the need for more effective outreach and marketing to the broader CTI community, beyond those actively participating in OASIS. The addition of Infrastructure SDO will send a positive market signal to this broader community which may speed adoption. This is because the inclusion of such an object, in conjunction with the fully vetted Malware object, will convey a level of maturity of the STIX2 data model that heretofore has been lacking. The perception of a data model that is actually reflective of reality will greatly enhance the reputation build of this phase of the market innovation, adoption, diffusion and transformation cycle. 7. In research presented at the ENISA CTI Bonding event in Rome, Italy (ENISA, 2017) an analyst from CyberDefCon reported that the worst performing ASNs from its Shadowserver Foundation (2017) database over a multi-year period were AS29182 ISPSYSTEM (located in RU) and AS5577 ROOT (located in LU). 
This exemplifies how longitudinal data aggregated from proprietary and open sources can demonstrate that the Infrastructure of a large-scale operation can be used to identify bad actors at the Regional Registry level. Since one of the stated objectives of CTI is to facilitate public/private sharing, this example shows how the research community can provide evidence that the jurisdictional law enforcement authorities can use for enforcement action. With an explicit “Infrastructure SDO” the evidentiary quality of the data for law enforcement can be improved. 8. During a Sports-ISAO sponsored internship program run during the World Championship games in London in August 2017, a group of 60+ interns from over 30 universities across the U.S. identified the “digital exhaust” of multiple attack patterns targeting sports organizations and their sponsors. As a trainer for these novice threat hunters, I found it useful to provide visualizations of attack infrastructures to help them wrap their minds around the ideas of threat actors, campaigns, intrusion sets, indicators, cyber observables and other concepts we tried to capture in STIX2. I am able to generate such visualizations from several sources other than STIX. However, if I had had tangible evidence stemming from an Infrastructure SDO in STIX 2.X, the teaching would have been more streamlined. In summary, I needed the Infrastructure SDO in order to tie all of the pieces of the puzzle together. If any of these arguments make sense to you, please let your voice be heard so that we can expedite the build towards consensus before an official Ballot on STIX 2.1. Also note that I recognize that the STIX Subcommittee is seeking a more orderly scheduling of discussions around Version 2.1 SDOs. Therefore, I’m requesting that we reopen discussions on this object when it would fit into the existing schedule and SDO priorities. 
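Argument 4 above calls for linking an Infrastructure object back to Threat Actor SDOs within a boxed time-frame. The Python sketch below illustrates one way that could look, purely as an assumption-laden example: the `infrastructure` object type, the `x_valid_from` / `x_valid_until` relationship properties, and the helper functions are all hypothetical and are not taken from any STIX working draft.

```python
import uuid
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def new_id(object_type):
    """Return an identifier in the STIX 'type--UUID' form."""
    return object_type + "--" + str(uuid.uuid4())


def relate(source, relationship_type, target, valid_from, valid_until):
    """Build a relationship dict with a hypothetical validity window."""
    return {
        "type": "relationship",
        "id": new_id("relationship"),
        "relationship_type": relationship_type,
        "source_ref": source["id"],
        "target_ref": target["id"],
        # Hypothetical custom properties bounding when the link held.
        "x_valid_from": valid_from,
        "x_valid_until": valid_until,
    }


def actors_using(infrastructure, relationships, at):
    """Return source_refs whose link to the infrastructure covers time `at`."""
    moment = datetime.strptime(at, TS_FORMAT)
    return [
        rel["source_ref"]
        for rel in relationships
        if rel["target_ref"] == infrastructure["id"]
        and datetime.strptime(rel["x_valid_from"], TS_FORMAT) <= moment
        and moment < datetime.strptime(rel["x_valid_until"], TS_FORMAT)
    ]


# Hypothetical Infrastructure object used by two actors at different times.
infra = {"type": "infrastructure", "id": new_id("infrastructure"), "name": "example hosting cluster"}
actor_a = {"type": "threat-actor", "id": new_id("threat-actor"), "name": "Actor A"}
actor_b = {"type": "threat-actor", "id": new_id("threat-actor"), "name": "Actor B"}

links = [
    relate(actor_a, "uses", infra, "2017-01-01T00:00:00.000Z", "2017-06-01T00:00:00.000Z"),
    relate(actor_b, "uses", infra, "2017-06-01T00:00:00.000Z", "2017-12-01T00:00:00.000Z"),
]

in_march = actors_using(infra, links, "2017-03-01T00:00:00.000Z")
in_august = actors_using(infra, links, "2017-08-15T00:00:00.000Z")
print(in_march == [actor_a["id"]], in_august == [actor_b["id"]])
```

Keeping the time window on the relationship rather than on the Infrastructure object lets the same infrastructure be tied to different actors over time without duplicating the object itself.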
_______________________________________________________________________ References: Caltagirone, S., Pendergast, A., Betz, C. (2013, July 5). The Diamond Model of Intrusion Analysis.
http://www.dtic.mil/get-tr-doc/pdf?AD=ADA586960 ENISA (2017).
https://www.enisa.europa.eu/events/cti-eu-event/enisa-cti-eu-event Hutchins, E., Cloppert, M., Amin, R. (2011). Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains. Lockheed Martin. Shadowserver (2017).
https://www.shadowserver.org/wiki/pmwiki.php/Main/HomePage Tufte, E.R. (2001). The Visual Display of Quantitative Information (2nd Ed.). Graphics Press: Cheshire, CT. -- Jane Ginn, MSIA, MRP CTI TC Secretary, OASIS Co-Founder of Cyber Threat Intelligence Network, Inc.
jg@ctin.us