Hi All,
I am new to this, so I do not know the protocol. Just laying it out.
I find using observed data a rather lengthy option, especially when the number of objects is high. That is usually the case with FireEye output for dynamic analysis.
I also think we are referencing objects in the true sense and not observed data, although we can argue either way.
I had initially tried the “June 26th proposal” with something like the one below. The issue was that we had to reference objects by their individual key and not by observed_data id, as there were many objects bundled in one observed data. If there is a single observed data object in the report, we can ignore the observed data id for all practical purposes. The point is: if we leave date and time out, drop the id (or ignore it for all practical purposes), and drop number_observed, what is left in observed-data?
I do think all options solve my problem and give me a cleaner way to reference objects (except the current model), but I am leaning in favor of the top-level observables model.
{
  "type": "observed-data",
  "spec_version": "2.1",
  "id": "observed-data--b67d30ff-02ac-498a-92f9-32f845f448cf",
  "created_by_ref": "identity--f431f809-377b-45e0-aa1c-6a4751cae5ff",
  "created": "2016-04-06T19:58:16.000Z",
  "modified": "2016-04-06T19:58:16.000Z",
  "first_observed": "2015-12-21T19:00:00Z",
  "last_observed": "2015-12-21T19:00:00Z",
  "number_observed": 50,
  "objects": {
    "0": {
      "type": "file",
      "name": "1234.jse"
    }
  }
}
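To make the referencing issue concrete, here is a minimal Python sketch. The `resolve` helper and the idea of a reference being an (observed-data id, object key) pair are assumptions for illustration only; neither comes from the specification or from the proposals in this thread.

```python
# Observed Data as in the example above, trimmed to the relevant properties.
observed_data = {
    "type": "observed-data",
    "id": "observed-data--b67d30ff-02ac-498a-92f9-32f845f448cf",
    "number_observed": 50,
    "objects": {
        "0": {"type": "file", "name": "1234.jse"},
    },
}

# Hypothetical lookup table from SDO id to SDO.
index = {observed_data["id"]: observed_data}


def resolve(index, observed_data_id, key):
    """Return the object stored under `key` inside the given Observed Data.

    The id alone is not enough when many objects are bundled together:
    the key is what actually selects the object.
    """
    return index[observed_data_id]["objects"][key]


file_obj = resolve(index, observed_data["id"], "0")
print(file_obj["name"])  # 1234.jse
```

With a single Observed Data in the report, the first half of the reference is the same everywhere, which is why the id can be ignored for practical purposes.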
Thanks
Subodh Kumar
From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Kirillov, Ivan A.
Sent: Friday, June 29, 2018 11:26 AM
To: cti@lists.oasis-open.org
Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object
To continue the discussion on the capture of Cyber Observables as part of the Malware SDO, I’ve attached 3 example JSON instances outlining the various approaches that we’ve looked at:
current: the approach currently defined in the STIX 2.1 specification, using observable-objects dictionaries nested as dictionary values.
Positives: simplifies parsing, as objects are always embedded, so no need for dereferencing. Negatives: complicated data model/specification (some values are object dictionaries, some are not), different design pattern than used elsewhere.
top_level_observables: the approach I had proposed on the June 19 working call of embedding all observables in a property at the top level of the Malware SDO, and then referencing them elsewhere via their ID.
Positives: allows object re-use. Negatives: makes parsing more difficult, different design pattern than used elsewhere.
observed_data: the approach that Gary Katz and Jeff Mates presented on the June 26th working call of capturing all observables in Observed Data SDOs that are referenced accordingly.
Positives: re-uses existing object (Observed Data), which results in less effort on the part of consumers and producers to use since they already support it. Negatives: Observed Data contains other required properties which may not be suitable for this context (number_observed, etc.), results in significantly larger JSON representations.
As you’ll see, one of the downsides of the Observed Data-based approach is that each individual Cyber Observable object (file, software, etc.) has to be captured in its own Observed Data SDO per the current language in the specification (i.e., a single Observed Data cannot capture multiple unrelated objects). This means that this approach will significantly increase the size of the JSON that we’ll need to generate for Malware SDOs that make use of many cyber observables. The other issue with using Observed Data here is that “first_observed/last_observed” and “number_observed” are rather meaningless, since these are non-traditional observations; in my example, I set “first_observed/last_observed” to the same timestamp as “created/modified” and always set “number_observed” to 1.
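The size concern can be checked with a rough Python sketch. The 50 file objects, their names, and the `wrap_in_observed_data` helper are made up for illustration; the wrapper only fills in the properties discussed above (placeholder timestamps and number_observed set to 1) and omits created_by_ref.

```python
import json
import uuid

TIMESTAMP = "2016-04-06T19:58:16.000Z"  # reused for every timestamp property


def wrap_in_observed_data(observable):
    """Wrap a single Cyber Observable object in its own Observed Data SDO."""
    return {
        "type": "observed-data",
        "spec_version": "2.1",
        "id": f"observed-data--{uuid.uuid4()}",
        "created": TIMESTAMP,
        "modified": TIMESTAMP,
        "first_observed": TIMESTAMP,  # no real observation window exists
        "last_observed": TIMESTAMP,
        "number_observed": 1,         # always 1 in this usage
        "objects": {"0": observable},
    }


# Hypothetical sandbox output: 50 file objects.
observables = [{"type": "file", "name": f"sample_{i}.jse"} for i in range(50)]

# Embedded form: one dictionary holding all objects.
embedded = {str(i): obj for i, obj in enumerate(observables)}
# Observed Data form: one SDO per object.
wrapped = [wrap_in_observed_data(obj) for obj in observables]

embedded_size = len(json.dumps(embedded))
wrapped_size = len(json.dumps(wrapped))
print(f"embedded: {embedded_size} characters, wrapped: {wrapped_size} characters")
```

The exact ratio depends on how large the observable objects themselves are; the fixed per-object overhead of the wrapper matters most when the objects are small.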
Another option we discussed briefly at the June 26th call was to create a new “Observed-data like” SDO that could capture multiple objects and be better suited for use cases such as these. It seems like this would essentially be identical to the Observed Data SDO but without the first_observed/last_observed/number_observed properties.
Let me know your thoughts and preferences as far as these approaches – personally I’m rather torn, as I don’t see a clear winner here. Also, since this issue is currently holding up the release of STIX 2.1 CSD01 with no immediate resolution in sight, I think we need to seriously consider whether we should include these Malware SDO updates in CSD01 or instead push them out to CSD02.
Regards,
Ivan
From: <cti@lists.oasis-open.org> on behalf of Ivan Kirillov <ikirillov@mitre.org>
Date: Tuesday, June 19, 2018 at 1:02 PM
To: Sean Barnum <sean.barnum@FireEye.com>, Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object
That seems reasonable to me – I’ll bring it up on the working call. Thanks!
-Ivan
From: Sean Barnum <sean.barnum@FireEye.com>
Date: Tuesday, June 19, 2018 at 12:54 PM
To: Ivan Kirillov <ikirillov@mitre.org>, Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object
Yes, that is basically what I am proposing.
Something along the lines of:
result (required)
string
The classification result or name assigned to the malware instance by the AV scanner tool. If no resulting context-specific classification value or name is provided by the AV scanner tool, then the result SHOULD come from the av-result-general-ov open vocabulary.
where av-result-general-ov is something like “malicious”, “suspicious”, “benign”, “unknown”, “error”
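A minimal Python sketch of how a producer might fill the required property under this definition. The function names and the default fallback of "unknown" are assumptions; the vocabulary values are the ones listed above, and "Trojan.Zeus" is the example label used earlier in the thread.

```python
# Candidate values for the proposed av-result-general-ov open vocabulary.
AV_RESULT_GENERAL_OV = {"malicious", "suspicious", "benign", "unknown", "error"}


def choose_result(scanner_label, general_verdict="unknown"):
    """Pick the value for the required `result` property.

    Prefer the classification name reported by the AV tool; otherwise
    fall back to a general verdict.
    """
    if scanner_label:
        return scanner_label
    return general_verdict


def is_general_value(value):
    """True if `value` comes from the general vocabulary.

    The vocabulary is open, so other values are not an error; this only
    tells a consumer whether the value is one of the suggested terms.
    """
    return value in AV_RESULT_GENERAL_OV


print(choose_result("Trojan.Zeus"))   # Trojan.Zeus
print(choose_result(None, "benign"))  # benign
print(choose_result(""))              # unknown
```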
Sean Barnum
Principal Architect
FireEye
M: 703.473.8262
E: sean.barnum@fireeye.com
From: "Kirillov, Ivan A." <ikirillov@mitre.org>
Date: Tuesday, June 19, 2018 at 2:36 PM
To: Sean Barnum <sean.barnum@FireEye.com>, Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object
Thanks Sean - no worries about the delayed reply. So as far as 2), are you suggesting that we make “results” required and that it can capture either the actual result or something more generic (e.g., malicious/benign/etc.) that could come from a vocabulary? I do agree with you that the current language around “results” being not required if there is no result is rather confusing and I would also rather make it required in all cases.
Regards,
Ivan
From: Sean Barnum <sean.barnum@FireEye.com>
Date: Tuesday, June 19, 2018 at 8:35 AM
To: Bret Jordan <Bret_Jordan@symantec.com>, Ivan Kirillov <ikirillov@mitre.org>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object
Sorry for the delayed response, Ivan.
This week I am actually in the midst of working through some significant evolution on our Malware object and its use.
I plan to attend today’s working call but am not sure what level of definitive opinions I will be ready to offer by then on very specific details. If not on today’s call, we still should hopefully be able to offer some constructive input this week.
On your two items that started this thread I can offer the following though:
1. FireEye would definitely support consolidating the expression of the observables into a single location and referencing those from the various other places as appropriate. This is MUCH cleaner, simpler and more resilient.
2. I would agree that it makes sense to make “product” required, as it does not really make sense to capture/convey an av_results entry where you don’t convey which product was used. I would disagree with making “scanned” required. There are not uncommon use cases where you may wish to convey that samples were scanned with a particular AV but specifically do not want to expose when those scans occurred, as it exposes details of when you knew about them. Lastly, I would suggest that we modify the current definition of “result” slightly and make it required. Currently, the definition allows the lack of the “results” property to imply that the scan was successfully completed but did not classify the sample as malicious. This sort of implication seems to present significant risk of confusion. Rather, I propose that the “result” property be defined to explicitly convey the result of the scan (whether malicious or otherwise) and that it be required. We could also define a simple vocab for general results that could apply across any scanners (e.g. “malicious”, “suspicious”, “benign”, “unknown”, “error”, etc.).
Looking across the full set of properties currently in av-results-type, the two properties that seem to be necessary (any av-results instance would not really make any sense or be of value without them) are product and results. Telling people that a scan occurred (even if all the other details are included) but not saying which product was used is not very useful. The “scan” could have been my 5-year-old niece looking at the file. Similarly, telling people that the sample was scanned (even if all the other details are included) but not saying the result of the scan is not very useful.
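A small Python sketch of the check this would imply for an av-results entry, with “product” and “result” required and “scanned” left optional. The helper name, the product name "ExampleAV", and the "engine_version" key are assumptions for illustration.

```python
# Required under this proposal; "scanned" is deliberately not listed.
REQUIRED_PROPERTIES = ("product", "result")


def missing_required(av_result):
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED_PROPERTIES if not av_result.get(name)]


# Conveys the verdict without exposing when the scan happened.
complete = {"product": "ExampleAV", "result": "benign"}
# Only an engine version: says nothing about who scanned or what was found.
engine_only = {"engine_version": "1.2.3"}

print(missing_required(complete))     # []
print(missing_required(engine_only))  # ['product', 'result']
```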
Sean Barnum
Principal Architect
FireEye
M: 703.473.8262
E: sean.barnum@fireeye.com
From: <cti@lists.oasis-open.org> on behalf of Bret Jordan <Bret_Jordan@symantec.com>
Date: Monday, June 18, 2018 at 5:04 PM
To: "Kirillov, Ivan A." <ikirillov@mitre.org>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object
I will try and review this change this week.
Bret
From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Kirillov, Ivan A. <ikirillov@mitre.org>
Sent: Monday, June 18, 2018 10:26:15 AM
To: cti@lists.oasis-open.org
Subject: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object
Are there any other thoughts on these topics? It would be great to close them out so we can finish up CSD01 of STIX 2.1.
Regards,
Ivan
From: <cti@lists.oasis-open.org> on behalf of Ivan Kirillov <ikirillov@mitre.org>
Date: Wednesday, June 13, 2018 at 2:47 PM
To: Allan Thomson <athomson@lookingglasscyber.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Cc: "Kumar, Subodh" <subodh.kumar@jpmorgan.com>
Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object
Sorry, that should read “Conversely, parsing the SDO may become more difficult because…”
Regards,
Ivan
From: Ivan Kirillov <ikirillov@mitre.org>
Date: Wednesday, June 13, 2018 at 2:44 PM
To: Allan Thomson <athomson@lookingglasscyber.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Cc: "Kumar, Subodh" <subodh.kumar@jpmorgan.com>
Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object
Hi Allan,
This approach doesn’t fundamentally change how we capture static/dynamic analysis data, but rather where and how the Cyber Observable Objects that correspond to that data are stored. If you have multiple observables from different analyses, you’ll just reference their corresponding objects that are stored in the “observable_objects” dictionary (which may or may not be the same objects across different analyses).
As far as being easier, it’s kind of a wash – it may simplify the generation of content because any Cyber Observable Objects would have to be stored in this top-level dictionary. Conversely, parsing the SDO because you’ll have to dereference the objects as you come across their usage. However, I do think that the simplification to the data model and the ability to re-use objects are worthwhile changes.
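The dereferencing step on the consumer side can be sketched in Python as follows. Only “observable_objects”, “static_analysis_results”, and “dynamic_analysis_results” come from this thread; the "file_refs" and "software_refs" property names, the `dereference` helper, and the example objects are hypothetical.

```python
malware = {
    "type": "malware",
    # Single top-level dictionary holding every Cyber Observable Object.
    "observable_objects": {
        "0": {"type": "file", "name": "1234.jse"},
        "1": {"type": "software", "name": "ExampleApp"},
    },
    # Hypothetical reference properties pointing into that dictionary.
    "static_analysis_results": {"file_refs": ["0"]},
    "dynamic_analysis_results": {"file_refs": ["0"], "software_refs": ["1"]},
}


def dereference(sdo, refs):
    """Resolve reference keys against the SDO's top-level dictionary."""
    table = sdo["observable_objects"]
    return [table[ref] for ref in refs]


static_files = dereference(malware, malware["static_analysis_results"]["file_refs"])
dynamic_files = dereference(malware, malware["dynamic_analysis_results"]["file_refs"])

# Both analyses point at the same stored object, so it is captured only once.
print(static_files[0] is dynamic_files[0])  # True
```

The extra lookup is the parsing cost mentioned above; the shared object is the re-use benefit.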
Regards,
Ivan
From: Allan Thomson <athomson@lookingglasscyber.com>
Date: Wednesday, June 13, 2018 at 2:22 PM
To: Ivan Kirillov <ikirillov@mitre.org>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Cc: "Kumar, Subodh" <subodh.kumar@jpmorgan.com>
Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object
Ivan – regarding 1.
What if I have multiple observables for the same malware from different analyses (i.e. static + dynamic results)?
Would consolidating them into a single place really make it easier? You would still want to indicate that you have a list of observables and indicate where those were ‘observed’ from, whether static, dynamic, or other.
So I’m not sure consolidating makes it easier, but so long as the same things are possible with the consolidated design, I don’t have a strong preference either way.
Allan Thomson
CTO (+1-408-331-6646)
LookingGlass Cyber Solutions
From: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org> on behalf of "Kirillov, Ivan" <ikirillov@mitre.org>
Date: Wednesday, June 13, 2018 at 12:57 PM
To: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Cc: "Kumar, Subodh" <subodh.kumar@jpmorgan.com>
Subject: [cti] Finalizing the STIX 2.1 Malware Object
All,
As we’re wrapping up work on STIX 2.1 CSD01, we need to finalize what we have for the updated Malware SDO. Accordingly, I have two topics I’d like to bring up in this regard:
1. Thanks to the work done by Subodh Kumar and his team, I’ve been wondering if there’s a better way to capture the Observable Objects associated with the Malware SDO. Right now, there are three places where you may encounter a Cyber Observable Object: samples (a dict of observable objects), static_analysis_results/results (certain keys have a corresponding dict of observable objects), dynamic_analysis_results/results (each key has a corresponding dict of observable objects).
Instead of having these observable object dictionaries all over the place, I believe it would make more sense to have a single property at the top level of the object (let’s call it “observable_objects”), where any Cyber Observable Objects associated with the SDO (samples, analysis results, etc.) could be captured and then referenced from wherever they are used. There are a number of advantages to this: a simpler data model (fewer embedded observable object dicts everywhere), the ability to re-use objects (e.g., if static and dynamic analysis find the same objects, you can create one object and just reference it accordingly), and a more compact serialization. See the attached JSON example for what this looks like in practice – this is a modified version of the “Malware Instance with Analysis Data” example currently in the 2.1 spec.
2. Currently, the “av-results-type”, used to capture AV classification results, has only optional properties and the text specifies that at least one must be included. This allows you to construct some odd, but spec-valid instances, such as an AV classification with only the engine version. In order to make this type more useful, I’d suggest that we make “product” (the name of the tool performing the scan) and “scanned” (the date/time the scan occurred) required, so that way you’ll at least have this minimum set of useful data for each instance. In addition, we should probably add some text stating that the “result” property (the actual AV classification result, e.g., “Trojan.Zeus”) must be included if the tool reports some classification during the scan.
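For the first topic, the producer side of the consolidation might look like the Python sketch below: embedded objects from several places are moved into one dictionary, and identical objects are stored once and referenced by key. The `consolidate` function, the group names, and the use of canonical JSON to detect identical objects are all assumptions for illustration, not part of the proposal.

```python
import json


def consolidate(embedded_groups):
    """Move embedded observable objects into one top-level dictionary.

    `embedded_groups` maps a group name (e.g. "static", "dynamic") to a
    list of observable objects. Returns the consolidated dictionary and,
    per group, the list of keys that reference into it.
    """
    observable_objects = {}
    key_by_content = {}
    refs = {}
    for group, objects in embedded_groups.items():
        refs[group] = []
        for obj in objects:
            # Canonical JSON serves as a simple fingerprint for equality.
            fingerprint = json.dumps(obj, sort_keys=True)
            if fingerprint not in key_by_content:
                key = str(len(observable_objects))
                key_by_content[fingerprint] = key
                observable_objects[key] = obj
            refs[group].append(key_by_content[fingerprint])
    return observable_objects, refs


groups = {
    "static": [{"type": "file", "name": "1234.jse"}],
    "dynamic": [
        {"name": "1234.jse", "type": "file"},  # same file, found again
        {"type": "file", "name": "dropped.tmp"},
    ],
}

observable_objects, refs = consolidate(groups)
print(refs)  # {'static': ['0'], 'dynamic': ['0', '1']}
```

The file found by both analyses is stored once, which is where the more compact serialization comes from.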
Let me know what you think – if we can get these final things wrapped up, we’re that much closer to getting STIX 2.1 out the door.
Regards,
Ivan
This email and any attachments thereto may contain private, confidential, and/or privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is
strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.
This message is confidential and subject to terms at: http://www.jpmorgan.com/emaildisclaimer including on confidentiality, legal privilege, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.