OASIS Cyber Threat Intelligence (CTI) TC

  • 1.  Re: [cti] Finalizing the STIX 2.1 Malware Object

    Posted 06-29-2018 15:26




    To continue the discussion on the capture of Cyber Observables as part of the Malware SDO, I’ve attached 3 example JSON instances outlining the various approaches that we’ve looked at:

    current: the approach currently defined in the STIX 2.1 specification, using observable-objects dictionaries nested as dictionary values.
    Positives: simplifies parsing, as objects are always embedded, so no need for dereferencing.
    Negatives: complicated data model/specification (some values are object dictionaries, some are not), different design pattern than used elsewhere.

    top_level_observables: the approach I had proposed on the June 19 working call of embedding all observables in a property at the top level of the Malware SDO, and then referencing them elsewhere via their ID.
    Positives: allows object re-use.
    Negatives: makes parsing more difficult, different design pattern than used elsewhere.

    observed_data: the approach that Gary Katz and Jeff Mates presented on the June 26th working call of capturing all observables in Observed Data SDOs that are referenced accordingly.
    Positives: re-uses existing object (Observed Data), which results in less effort on the part of consumers and producers to use since they already support it.
    Negatives: Observed Data contains other required properties which may not be suitable for this context (number_observed, etc.), results in significantly larger JSON representations.

     
    As you’ll see, one of the downsides of the Observed Data-based approach is that each individual Cyber Observable object (file, software, etc.) has to be captured in its own Observed Data SDO per the current language in the specification (i.e., a single Observed Data cannot capture multiple unrelated objects). This means that this approach will significantly increase the size of the JSON that we’ll need to generate for Malware SDOs that make use of many cyber observables. The other issue with using Observed Data here is that “first_observed/last_observed” and “number_observed” are rather meaningless, since these are non-traditional observations; in my example, I set “first_observed/last_observed” to the same timestamp as “created/modified” and always set “number_observed” to 1.
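
    To make the size concern concrete, here is a rough sketch of wrapping each Cyber Observable Object in its own Observed Data SDO, as described above. The helper name wrap_in_observed_data, the sample objects, and the timestamp are made up for illustration; only the property names come from this thread, and the attached JSON examples may differ in detail.

```python
import json
import uuid


def wrap_in_observed_data(obj, timestamp):
    """Wrap one Cyber Observable Object in its own Observed Data SDO (hypothetical helper)."""
    return {
        "type": "observed-data",
        "id": "observed-data--" + str(uuid.uuid4()),
        "created": timestamp,
        "modified": timestamp,
        # Not a traditional observation, so these simply mirror created/modified.
        "first_observed": timestamp,
        "last_observed": timestamp,
        "number_observed": 1,
        "objects": {"0": obj},
    }


# Illustrative observables only; not taken from the attached examples.
observables = [
    {"type": "file", "name": "1234.jse"},
    {"type": "software", "name": "ExampleOS"},
]
timestamp = "2018-06-29T15:26:00.000Z"

wrapped = [wrap_in_observed_data(o, timestamp) for o in observables]

# Compare the serialized size of the bare objects with the wrapped form.
embedded_size = len(json.dumps(observables))
wrapped_size = len(json.dumps(wrapped))
print(embedded_size, wrapped_size)
```

    Every wrapper repeats the same seven SDO-level properties, so the overhead grows linearly with the number of observables.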
     
    Another option we discussed briefly at the June 26th call was to create a new “Observed-data like” SDO that could capture multiple objects and be better suited for use cases such as these. It seems like this would essentially be identical to the Observed Data SDO but without the first_observed/last_observed/number_observed properties.

     
    Let me know your thoughts and preferences as far as these approaches – personally I’m rather torn, as I don’t see a clear winner here. Also, since this issue is currently holding up the release of STIX 2.1 CSD01 with no immediate resolution
    in sight, I think we need to seriously consider whether we should include these Malware SDO updates in CSD01 or instead push them out to CSD02.
     
    Regards,
    Ivan
     

    From: <cti@lists.oasis-open.org> on behalf of Ivan Kirillov <ikirillov@mitre.org>
    Date: Tuesday, June 19, 2018 at 1:02 PM
    To: Sean Barnum <sean.barnum@FireEye.com>, Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    That seems reasonable to me – I’ll bring it up on the working call. Thanks!
     
    -Ivan
     

    From: Sean Barnum <sean.barnum@FireEye.com>
    Date: Tuesday, June 19, 2018 at 12:54 PM
    To: Ivan Kirillov <ikirillov@mitre.org>, Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Yes, that is basically what I am proposing.
     
    Something along the lines of:
     
     




    Property Name: result (required)
    Type: string
    Description: The classification result or name assigned to the malware instance by the AV scanner tool. If no resulting context-specific classification value or name is provided by the AV scanner tool, then the result SHOULD come from the av-result-general-ov open vocabulary.




     
    where av-result-general-ov is something like “malicious”, “suspicious”, “benign”, “unknown”, “error”
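
    As a rough illustration of the proposed definition, the sketch below checks that “result” is always present and notes whether it is a tool-specific name or one of the generic values. The function names are hypothetical, and the vocabulary is only the candidate list floated in this message, not a finalized one. Because the vocabulary is open, tool-specific values remain acceptable.

```python
# Candidate vocabulary values as floated in this thread; not a finalized list.
AV_RESULT_GENERAL_OV = {"malicious", "suspicious", "benign", "unknown", "error"}


def check_av_result(entry):
    """Return a list of problems for one AV result entry (hypothetical check)."""
    problems = []
    result = entry.get("result")
    if not isinstance(result, str) or not result:
        problems.append("'result' is required and must be a non-empty string")
    return problems


def is_general_result(entry):
    """True when the result is one of the generic vocabulary values."""
    return entry.get("result") in AV_RESULT_GENERAL_OV


specific = {"product": "ExampleAV", "result": "Trojan.Zeus"}
generic = {"product": "ExampleAV", "result": "benign"}
missing = {"product": "ExampleAV"}

for entry in (specific, generic, missing):
    print(check_av_result(entry), is_general_result(entry))
```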
     
     

    Sean Barnum
    Principal Architect
    FireEye
    M: 703.473.8262

    E: sean.barnum@fireeye.com
     

    From: "Kirillov, Ivan A." <ikirillov@mitre.org>
    Date: Tuesday, June 19, 2018 at 2:36 PM
    To: Sean Barnum <sean.barnum@FireEye.com>, Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Thanks Sean - no worries about the delayed reply. So as far as 2), are you suggesting that we make “results” required and that it can capture either the actual result or something more generic (e.g., malicious/benign/etc.) that could come
    from a vocabulary? I do agree with you that the current language around “results” being not required if there is no result is rather confusing and I would also rather make it required in all cases.
     
    Regards,
    Ivan
     

    From: Sean Barnum <sean.barnum@FireEye.com>
    Date: Tuesday, June 19, 2018 at 8:35 AM
    To: Bret Jordan <Bret_Jordan@symantec.com>, Ivan Kirillov <ikirillov@mitre.org>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Sorry for the delayed response, Ivan.
    This week I am actually in the midst of working through some significant evolution on our Malware object and its use.
    I plan to attend today’s working call but am not sure what level of definitive opinions I will be ready to offer by then on very specific details. If not on today’s call we still should hopefully be able to offer some constructive input
    this week.
     
    On your two items that started this thread I can offer the following though:

    1. FireEye would definitely support consolidating the expression of the observables into a single location and referencing those from the various other places as appropriate. This is MUCH cleaner, simpler and more resilient.

    2. I would agree that it makes sense to make “product” required as it does not really make sense to capture/convey an av_results entry where you don’t convey which product was used. I would disagree with making “scanned” required. There are not uncommon use cases where you may wish to convey that samples were scanned with particular AV but specifically do not want to expose when those scans occurred as it exposes details of when you knew about them. Lastly, I would suggest that we modify the current definition of “result” slightly and make it required. Currently, the definition allows the lack of the “results” property to imply that the scan was successfully completed but did not classify the sample as malicious. This sort of implication seems to present significant risk of confusion. Rather, I propose that the “result” property be defined to explicitly convey the result of the scan (whether malicious or otherwise) and that it be required. We could also define a simple vocab for general results that could apply across any scanners (e.g. “malicious”, “suspicious”, “benign”, “unknown”, “error”, etc). Looking across the full set of properties currently in av-results-type, the two properties that seem to be necessary (any av-results instance would not really make any sense or be of value without them) are product and results. Telling people that a scan occurred (even if all the other details are included) but not saying which product was used is not very useful. The “scan” could have been my 5-year-old niece looking at the file. Similarly, telling people that the sample was scanned (even if all the other details are included) but not saying the result of the scan is not very useful.
     

    Sean Barnum
    Principal Architect
    FireEye
    M: 703.473.8262

    E: sean.barnum@fireeye.com
     

    From: <cti@lists.oasis-open.org> on behalf of Bret Jordan <Bret_Jordan@symantec.com>
    Date: Monday, June 18, 2018 at 5:04 PM
    To: "Kirillov, Ivan A." <ikirillov@mitre.org>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Subject: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     


    I will try and review this change this week.
     
    Bret





    From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Kirillov, Ivan A. <ikirillov@mitre.org>
    Sent: Monday, June 18, 2018 10:26:15 AM
    To: cti@lists.oasis-open.org
    Subject: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     




    Are there any other thoughts on these topics? It would be great to close them out so we can finish up CSD01 of STIX 2.1.
     
    Regards,
    Ivan
     

    From: <cti@lists.oasis-open.org> on behalf of Ivan Kirillov <ikirillov@mitre.org>
    Date: Wednesday, June 13, 2018 at 2:47 PM
    To: Allan Thomson <athomson@lookingglasscyber.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Cc: "Kumar, Subodh" <subodh.kumar@jpmorgan.com>
    Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Sorry, that should read “Conversely, parsing the SDO may become more difficult because…”
     
    Regards,
    Ivan
     

    From: Ivan Kirillov <ikirillov@mitre.org>
    Date: Wednesday, June 13, 2018 at 2:44 PM
    To: Allan Thomson <athomson@lookingglasscyber.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Cc: "Kumar, Subodh" <subodh.kumar@jpmorgan.com>
    Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Hi Allan,
     
    This approach doesn’t fundamentally change how we capture static/dynamic analysis data, but rather where and how the Cyber Observable Objects that correspond to that data are stored. If you have multiple
    observables from different analyses, you’ll just reference their corresponding objects that are stored in the “observable_objects” dictionary (which may or may not be the same objects across different analyses).


    As far as being easier, it’s kind of a wash – it may simplify the generation of content because any Cyber Observable Objects would have to be stored in this top-level dictionary. Conversely, parsing the SDO because you’ll have to dereference the objects as
    you come across their usage. However, I do think that the simplification to the data model and the ability to re-use objects are worthwhile changes.
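
    A minimal sketch of both sides of that trade-off, assuming a top-level “observable_objects” dictionary keyed by strings: the producer stores each distinct object once and hands out keys, and the consumer has to resolve those keys. The function names, the input shape, and the sample objects are hypothetical and not taken from the attached example.

```python
import json


def consolidate(analyses):
    """Collect objects from several analyses into one top-level dictionary.

    `analyses` maps an analysis name to a list of observable objects.
    Returns (observable_objects, refs), where refs maps each analysis name
    to the list of keys it references. Identical objects share one key.
    """
    observable_objects = {}
    key_by_content = {}
    refs = {}
    for name, objects in analyses.items():
        refs[name] = []
        for obj in objects:
            # Canonical JSON text serves as a simple identity for the object.
            fingerprint = json.dumps(obj, sort_keys=True)
            if fingerprint not in key_by_content:
                key = str(len(observable_objects))
                key_by_content[fingerprint] = key
                observable_objects[key] = obj
            refs[name].append(key_by_content[fingerprint])
    return observable_objects, refs


def dereference(observable_objects, keys):
    """Resolve a list of keys back to the objects they point at."""
    return [observable_objects[k] for k in keys]


sample_file = {"type": "file", "name": "1234.jse"}
sample_process = {"type": "process", "name": "example.exe"}

observable_objects, refs = consolidate({
    "static": [sample_file],
    "dynamic": [sample_file, sample_process],
})
print(refs)
print(dereference(observable_objects, refs["dynamic"]))
```

    The file found by both analyses is stored once and referenced twice, which is the re-use benefit; the extra dereference step is the parsing cost.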
     
    Regards,
    Ivan
     

    From: Allan Thomson <athomson@lookingglasscyber.com>
    Date: Wednesday, June 13, 2018 at 2:22 PM
    To: Ivan Kirillov <ikirillov@mitre.org>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Cc: "Kumar, Subodh" <subodh.kumar@jpmorgan.com>
    Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Ivan – regarding 1.
     
    What if I have multiple observables for the same malware from different analysis (i.e. static + dynamic results).
     
    Would consolidating them into a single place really make it easier? You would still want to indicate that you have a list of observables and indicate where those were ‘observed’ from either static or dynamic
    or other.
     
    So I’m not sure consolidating it makes it easier but so long as the same things are possible with the consolidated design then I don’t have a strong preference either way.
     

    Allan Thomson
    CTO ( +1-408-331-6646)

    LookingGlass
    Cyber Solutions
     

    From: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org> on behalf of "Kirillov, Ivan" <ikirillov@mitre.org>
    Date: Wednesday, June 13, 2018 at 12:57 PM
    To: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Cc: "Kumar, Subodh" <subodh.kumar@jpmorgan.com>
    Subject: [cti] Finalizing the STIX 2.1 Malware Object


     

    All,
     
    As we’re wrapping up work on STIX 2.1 CSD01, we need to finalize what we have for the updated Malware SDO. Accordingly, I have two topics I’d like to bring up in this regard:
     

    1. Thanks to the work done by Subodh Kumar and his team, I’ve been wondering if there’s a better way to capture the Observable Objects associated with the Malware SDO. Right now, there are three places where you may encounter a Cyber Observable Object: samples (a dict of observable objects), static_analysis_results/results (certain keys have a corresponding dict of observable objects), dynamic_analysis_results/results (each key has a corresponding dict of observable objects).

    Instead of having these observable object dictionaries all over the place, I believe it would make more sense to have a single property at the top level of the object (let’s call it “observable_objects”), where any Cyber Observable Objects associated with the SDO (samples, analysis results, etc.) could be captured, via references. There are a number of advantages to this: a simpler data model (less embedded observable object dicts everywhere), the ability to re-use objects (e.g., if static and dynamic analysis find the same objects, you can create one object and just reference it accordingly), and a more compact serialization. See the attached JSON example for what this looks like in practice – this is a modified version of the “Malware Instance with Analysis Data” example currently in the 2.1 spec.

    2. Currently, the “av-results-type”, used to capture AV classification results, has only optional properties and the text specifies that at least one must be included. This allows you to construct some odd, but spec-valid instances, such as an AV classification with only the engine version. In order to make this type more useful, I’d suggest that we make “product” (the name of the tool performing the scan) and “scanned” (the date/time the scan occurred) required, so that way you’ll at least have this minimum set of useful data for each instance. In addition, we should probably add some text stating that the “result” property (the actual AV classification result, e.g., “Trojan.Zeus”) must be included if the tool reports some classification during the scan.
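
    The difference between the current rule and the suggestion above can be sketched as two small checks. This is only an illustration: the function names are hypothetical, and “engine_version” is an assumed property name for the engine version mentioned above; “product”, “scanned”, and “result” are the names used in this message.

```python
# Property names other than product, scanned and result are assumptions.
KNOWN_PROPERTIES = {"product", "engine_version", "scanned", "result"}


def valid_under_current_rule(entry):
    """Current text: every property is optional, but at least one must be present."""
    return any(name in entry for name in KNOWN_PROPERTIES)


def missing_under_proposal(entry, tool_reported_classification):
    """Suggested rule: product and scanned are required; result is required
    only when the tool reported a classification during the scan."""
    required = ["product", "scanned"]
    if tool_reported_classification:
        required.append("result")
    return [name for name in required if name not in entry]


# The odd but currently valid instance: only the engine version.
odd_entry = {"engine_version": "1.2.3"}
print(valid_under_current_rule(odd_entry))
print(missing_under_proposal(odd_entry, tool_reported_classification=False))

fuller_entry = {
    "product": "ExampleAV",
    "scanned": "2018-06-13T12:57:00Z",
    "result": "Trojan.Zeus",
}
print(missing_under_proposal(fuller_entry, tool_reported_classification=True))
```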
     
    Let me know what you think – if we can get these final things wrapped up, we’re that much closer to getting STIX 2.1 out the door.
     
    Regards,
    Ivan


    This email and any attachments thereto may contain private, confidential, and/or privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is
    strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.





    Attachment: malware_example_current.json Description: malware_example_current.json
    Attachment: malware_example_observed_data.json Description: malware_example_observed_data.json
    Attachment: malware_example_top_level_observables.json Description: malware_example_top_level_observables.json


  • 2.  RE: [cti] Finalizing the STIX 2.1 Malware Object

    Posted 06-29-2018 19:42
    Thanks for putting this together. I don’t think it quite captured what Gary and I had proposed, however, as we actually used single “ref” relationships to observed data elements rather than “refs”. This means that we can capture the entire static or dynamic analysis result as a single graph.

    While it is still a bit larger than the other files (8,175 bytes vs. the current 5,006 bytes), it is a fair bit smaller than the original version at 11,936 bytes, while still following existing rules and allowing for a potentially greater degree of fidelity.

    I’ve attached an example of what this looks like, although I did sort of cheat by throwing in an “action_extension” for the file object, as I couldn’t find any way to say that one file created another. I also included an image showing how this functions as a graph.

    If you received this already, I apologize. I’ve been having email issues today.

    Jeffrey Mates, Civ DC3/DCCI
    Computer Scientist
    Defense Cyber Crime Institute
    jeffrey.mates@dc3.mil
    410-694-4335

    From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> On Behalf Of Kirillov, Ivan A.
    Sent: Friday, June 29, 2018 11:26 AM
    To: cti@lists.oasis-open.org
    Subject: [Non-DoD Source] Re: [cti] Finalizing the STIX 2.1 Malware Object

    Attachment: malware_example_observed_data.json Description: Binary data
    Attachment: STIX Malware Example.png Description: PNG image
    Attachment: smime.p7s Description: S/MIME cryptographic signature


  • 3.  RE: [cti] Finalizing the STIX 2.1 Malware Object

    Posted 06-29-2018 21:34




    Hi All,
    New to this, so I do not know the protocol. Just laying it out.
     
    I find using Observed Data a rather lengthy option, especially when the number of objects is high. That is usually the case with FireEye output for dynamic analysis.
    I also think we are, in the true sense, referencing objects and not observed data, although we can argue either way.
    I had initially tried the “June 26th proposal” with something like the one below. The issue was that we had to reference objects by their individual keys and not by the observed_data id, as there were many objects bundled in one Observed Data. If there is a single Observed Data object in the report, we can ignore the observed data id for all practical purposes. The point is: if we leave date and time out, drop the id (or ignore it for all practical purposes), and drop number_observed, what is left in observed-data?
     
    I do think all options solve my problem and give me a cleaner way to reference objects (except the current model), but I am leaning in favor of the top-level observables model.
     
    {
      "type": "observed-data",
      "spec_version": "2.1",
      "id": "observed-data--b67d30ff-02ac-498a-92f9-32f845f448cf",
      "created_by_ref": "identity--f431f809-377b-45e0-aa1c-6a4751cae5ff",
      "created": "2016-04-06T19:58:16.000Z",
      "modified": "2016-04-06T19:58:16.000Z",
      "first_observed": "2015-12-21T19:00:00Z",
      "last_observed": "2015-12-21T19:00:00Z",
      "number_observed": 50,
      "objects": {
        "0": {
          "type": "file",
          "name": "1234.jse"
        }
      }
    }
     
    Thanks

    Subodh Kumar

     


    From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Kirillov, Ivan A.
    Sent: Friday, June 29, 2018 11:26 AM
    To: cti@lists.oasis-open.org
    Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object


     
    To continue the discussion on the capture of Cyber Observables as part of the Malware SDO, I’ve attached 3 example JSON instances outlining the various approaches that we’ve looked at:
     

    current: the approach currently defined in the STIX 2.1 specification, using observable-objects dictionaries nested as dictionary values.
      Positives: simplifies parsing, as objects are always embedded, so no need for dereferencing.
      Negatives: complicated data model/specification (some values are object dictionaries, some are not), different design pattern than used elsewhere.

    top_level_observables: the approach I had proposed on the June 19 working call of embedding all observables in a property at the top level of the Malware SDO, and then referencing them elsewhere via their ID.
      Positives: allows object re-use.
      Negatives: makes parsing more difficult, different design pattern than used elsewhere.

    observed_data: the approach that Gary Katz and Jeff Mates presented on the June 26th working call of capturing all observables in Observed Data SDOs that are referenced accordingly.
      Positives: re-uses existing object (Observed Data), which results in less effort on the part of consumers and producers to use since they already support it.
      Negatives: Observed Data contains other required properties which may not be suitable for this context (number_observed, etc.), results in significantly larger JSON representations.

     
    As you’ll see, one of the downsides of the Observed Data-based approach is that each individual Cyber Observable object (file, software, etc.) has to be captured in its own Observed Data SDO per the current language in the specification (i.e., a single Observed Data cannot capture multiple unrelated objects). This means that this approach will significantly increase the size of the JSON that we’ll need to generate for Malware SDOs that make use of many cyber observables. The other issue with using Observed Data here is that “first_observed/last_observed” and “number_observed” are rather meaningless here, since these are non-traditional observations; in my example, I set “first_observed/last_observed” to the same timestamp as “created/modified” and “number_observed” I always set to 1.
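
    To make the size concern concrete, here is a minimal Python sketch of the wrapping described above. It is an illustration only, not taken from the attached examples: the helper name wrap_in_observed_data, the timestamp, and the software name are assumptions, and the file name is borrowed from elsewhere in this thread.

```python
import json
import uuid

# Hypothetical timestamp, used for created/modified and for the observation window.
CREATED = "2018-06-29T15:26:00.000Z"


def wrap_in_observed_data(observable, created=CREATED):
    """Wrap a single Cyber Observable object in its own Observed Data SDO."""
    return {
        "type": "observed-data",
        "spec_version": "2.1",
        "id": "observed-data--" + str(uuid.uuid4()),
        "created": created,
        "modified": created,
        # These are not traditional observations, so mirror created/modified.
        "first_observed": created,
        "last_observed": created,
        "number_observed": 1,
        "objects": {"0": observable},
    }


# Illustrative observables; the software name is made up.
observables = [
    {"type": "file", "name": "1234.jse"},
    {"type": "software", "name": "ExampleTool"},
]

wrapped = [wrap_in_observed_data(o) for o in observables]

# Compare the serialized size of the bare objects with the wrapped form.
embedded_size = len(json.dumps(observables))
wrapped_size = len(json.dumps(wrapped))
print(embedded_size, wrapped_size)
```

    Every wrapped object repeats the same required properties, which is where the extra bytes come from.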
     
    Another option we discussed briefly at the June 26th call was to create a new “Observed-data like” SDO that could capture multiple objects and be better suited for use cases such as these. It seems like this would essentially be identical to the Observed Data SDO but without the first_observed/last_observed/number_observed properties.

     
    Let me know your thoughts and preferences as far as these approaches – personally I’m rather torn, as I don’t see a clear winner here. Also, since this issue is currently holding up the release of STIX 2.1 CSD01 with no immediate resolution
    in sight, I think we need to seriously consider whether we should include these Malware SDO updates in CSD01 or instead push them out to CSD02.
     
    Regards,
    Ivan
     

    From: < cti@lists.oasis-open.org > on behalf of Ivan Kirillov < ikirillov@mitre.org >
    Date: Tuesday, June 19, 2018 at 1:02 PM
    To: Sean Barnum < sean.barnum@FireEye.com >, Bret Jordan < Bret_Jordan@symantec.com >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    That seems reasonable to me – I’ll bring it up on the working call. Thanks!
     
    -Ivan
     

    From: Sean Barnum < sean.barnum@FireEye.com >
    Date: Tuesday, June 19, 2018 at 12:54 PM
    To: Ivan Kirillov < ikirillov@mitre.org >, Bret Jordan < Bret_Jordan@symantec.com >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Yes, that is basically what I am proposing.
     
    Something along the lines of:
     
     




    Property Name: result (required)
    Type: string
    Description: The classification result or name assigned to the malware instance by the AV scanner tool. If no resulting context-specific classification value or name is provided by the AV scanner tool then the result SHOULD come from the av-result-general-ov open vocabulary.
     
    where av-result-general-ov is something like “malicious”, “suspicious”, “benign”, “unknown”, “error”
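
    As a rough illustration of how a consumer might check an entry under the rules discussed in this thread (“product” and “result” required, with generic results drawn from an open vocabulary), here is a small Python sketch. The function names and the “engine_version” key are assumptions made for the example; none of this is spec text.

```python
# Candidate values for the proposed av-result-general-ov open vocabulary.
AV_RESULT_GENERAL_OV = {"malicious", "suspicious", "benign", "unknown", "error"}

# Properties proposed as required in this thread.
REQUIRED_PROPERTIES = ("product", "result")


def check_av_result(entry):
    """Return a list of problems found in one av-results entry."""
    problems = []
    for prop in REQUIRED_PROPERTIES:
        value = entry.get(prop)
        if not isinstance(value, str) or not value:
            problems.append("missing required property: " + prop)
    return problems


def is_general_result(entry):
    """True if the result is one of the generic vocabulary values.

    The vocabulary is open, so a tool-specific name such as
    "Trojan.Zeus" is still acceptable; it is simply not generic.
    """
    return entry.get("result") in AV_RESULT_GENERAL_OV


ok = {"product": "ExampleAV", "result": "Trojan.Zeus"}
odd = {"engine_version": "1.0"}  # hypothetical key name

print(check_av_result(ok))
print(check_av_result(odd))
```

    Because the vocabulary is open, the check only enforces that a result is present; it does not reject scanner-specific names.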
     
     

    Sean Barnum
    Principal Architect
    FireEye
    M: 703.473.8262

    E: sean.barnum@fireeye.com
     

    From: "Kirillov, Ivan A." < ikirillov@mitre.org >
    Date: Tuesday, June 19, 2018 at 2:36 PM
    To: Sean Barnum < sean.barnum@FireEye.com >, Bret Jordan < Bret_Jordan@symantec.com >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Thanks Sean - no worries about the delayed reply. So as far as 2), are you suggesting that we make “result” required and that it can capture either the actual result or something more generic (e.g., malicious/benign/etc.) that could come from a vocabulary? I do agree with you that the current language around “result” not being required if there is no result is rather confusing, and I would also rather make it required in all cases.
     
    Regards,
    Ivan
     

    From: Sean Barnum < sean.barnum@FireEye.com >
    Date: Tuesday, June 19, 2018 at 8:35 AM
    To: Bret Jordan < Bret_Jordan@symantec.com >, Ivan Kirillov < ikirillov@mitre.org >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Subject: Re: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Sorry for the delayed response, Ivan.
    This week I am actually in the midst of working through some significant evolution on our Malware object and its use.
    I plan to attend today’s working call but am not sure what level of definitive opinions I will be ready to offer by then on very specific details. If not on today’s call we still should hopefully be able to offer some constructive input
    this week.
     
    On your two items that started this thread I can offer the following though:

    1. FireEye would definitely support consolidating the expression of the observables into a single location and referencing those from the various other places as appropriate. This is MUCH cleaner, simpler and more resilient.

    2. I would agree that it makes sense to make “product” required as it does not really make sense to capture/convey an av_results entry where you don’t convey which product was used. I would disagree with making “scanned” required. There are not uncommon use cases where you may wish to convey that samples were scanned with particular AV but specifically do not want to expose when those scans occurred as it exposes details of when you knew about them.

       Lastly, I would suggest that we modify the current definition of “result” slightly and make it required. Currently, the definition allows the lack of the “result” property to imply that the scan was successfully completed but did not classify the sample as malicious. This sort of implication seems to present significant risk of confusion. Rather, I propose that the “result” property be defined to explicitly convey the result of the scan (whether malicious or otherwise) and that it be required. We could also define a simple vocab for general results that could apply across any scanners (e.g. “malicious”, “suspicious”, “benign”, “unknown”, “error”, etc).

       Looking across the full set of properties currently in av-results-type, the two properties that seem to be necessary (any av-results instance would not really make any sense or be of value without them) are product and result. Telling people that a scan occurred (even if all the other details are included) but not saying which product was used is not very useful. The “scan” could have been my 5-year-old niece looking at the file. Similarly, telling people that the sample was scanned (even if all the other details are included) but not saying the result of the scan is not very useful.
     

    Sean Barnum
    Principal Architect
    FireEye
    M: 703.473.8262

    E: sean.barnum@fireeye.com
     

    From: < cti@lists.oasis-open.org > on behalf of Bret Jordan < Bret_Jordan@symantec.com >
    Date: Monday, June 18, 2018 at 5:04 PM
    To: "Kirillov, Ivan A." < ikirillov@mitre.org >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Subject: [cti] Re: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     


    I will try and review this change this week.
     
    Bret





    From: cti@lists.oasis-open.org < cti@lists.oasis-open.org > on behalf of Kirillov, Ivan A. < ikirillov@mitre.org >
    Sent: Monday, June 18, 2018 10:26:15 AM
    To: cti@lists.oasis-open.org
    Subject: [EXT] Re: [cti] Finalizing the STIX 2.1 Malware Object


     




    Are there any other thoughts on these topics? It would be great to close them out so we can finish up CSD01 of STIX 2.1.
     
    Regards,
    Ivan
     

    From: < cti@lists.oasis-open.org > on behalf of Ivan Kirillov < ikirillov@mitre.org >
    Date: Wednesday, June 13, 2018 at 2:47 PM
    To: Allan Thomson < athomson@lookingglasscyber.com >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Cc: "Kumar, Subodh" < subodh.kumar@jpmorgan.com >
    Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Sorry, that should read “Conversely, parsing the SDO
    may become more difficult because…”
     
    Regards,
    Ivan
     

    From: Ivan Kirillov < ikirillov@mitre.org >
    Date: Wednesday, June 13, 2018 at 2:44 PM
    To: Allan Thomson < athomson@lookingglasscyber.com >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Cc: "Kumar, Subodh" < subodh.kumar@jpmorgan.com >
    Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Hi Allan,
     
    This approach doesn’t fundamentally change how we capture static/dynamic analysis data, but rather where and how the Cyber Observable Objects that correspond to that data are stored. If you have multiple
    observables from different analyses, you’ll just reference their corresponding objects that are stored in the “observable_objects” dictionary (which may or may not be the same objects across different analyses).


    As far as being easier, it’s kind of a wash – it may simplify the generation of content because any Cyber Observable Objects would have to be stored in this top-level dictionary. Conversely, parsing the SDO because you’ll have to dereference the objects as
    you come across their usage. However, I do think that the simplification to the data model and the ability to re-use objects are worthwhile changes.
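
    The dereferencing step mentioned above might look roughly like the following Python sketch. The property names “sample_refs”, “static_analysis_refs” and “dynamic_analysis_refs” are placeholders invented for this illustration (only “observable_objects” comes from the proposal), and the file names are made up.

```python
def dereference(malware, ref):
    """Look up one observable by its key in the top-level dictionary."""
    try:
        return malware["observable_objects"][ref]
    except KeyError:
        raise ValueError("dangling observable reference: " + repr(ref))


malware = {
    "type": "malware",
    # Single top-level home for every Cyber Observable Object.
    "observable_objects": {
        "0": {"type": "file", "name": "1234.jse"},
        "1": {"type": "file", "name": "dropped.tmp"},
    },
    # Hypothetical reference properties; not taken from the spec.
    "sample_refs": ["0"],
    "static_analysis_refs": ["0"],
    "dynamic_analysis_refs": ["0", "1"],
}

static_objs = [dereference(malware, r) for r in malware["static_analysis_refs"]]
dynamic_objs = [dereference(malware, r) for r in malware["dynamic_analysis_refs"]]

# Object "0" is stored once and shared by both analyses.
print(static_objs[0] is dynamic_objs[0])
```

    The producer writes each object once; the consumer pays for that with one lookup per reference and has to handle references that point at nothing.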
     
    Regards,
    Ivan
     

    From: Allan Thomson < athomson@lookingglasscyber.com >
    Date: Wednesday, June 13, 2018 at 2:22 PM
    To: Ivan Kirillov < ikirillov@mitre.org >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Cc: "Kumar, Subodh" < subodh.kumar@jpmorgan.com >
    Subject: Re: [cti] Finalizing the STIX 2.1 Malware Object


     

    Ivan – regarding 1.
     
    What if I have multiple observables for the same malware from different analyses (i.e., static + dynamic results)?
     
    Would consolidating them into a single place really make it easier? You would still want to indicate that you have a list of observables and indicate where those were ‘observed’ from either static or dynamic
    or other.
     
    So I’m not sure consolidating it makes it easier but so long as the same things are possible with the consolidated design then I don’t have a strong preference either way.
     

    Allan Thomson
    CTO ( +1-408-331-6646)

    LookingGlass
    Cyber Solutions
     

    From: " cti@lists.oasis-open.org " < cti@lists.oasis-open.org > on behalf of "Kirillov,
    Ivan" < ikirillov@mitre.org >
    Date: Wednesday, June 13, 2018 at 12:57 PM
    To: " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Cc: "Kumar, Subodh" < subodh.kumar@jpmorgan.com >
    Subject: [cti] Finalizing the STIX 2.1 Malware Object


     

    All,
     
    As we’re wrapping up work on STIX 2.1 CSD01, we need to finalize what we have for the updated Malware SDO. Accordingly, I have two topics I’d like to bring up in this regard:
     

    1. Thanks to the work done by Subodh Kumar and his team, I’ve been wondering if there’s a better way to capture the Observable Objects associated with the Malware SDO. Right now, there are three places where you may encounter a Cyber Observable Object: samples (a dict of observable objects), static_analysis_results/results (certain keys have a corresponding dict of observable objects), dynamic_analysis_results/results (each key has a corresponding dict of observable objects).
     
       Instead of having these observable object dictionaries all over the place, I believe it would make more sense to have a single property at the top level of the object (let’s call it “observable_objects”), where any Cyber Observable Objects associated with the SDO (samples, analysis results, etc.) could be captured, via references. There are a number of advantages to this: a simpler data model (less embedded observable object dicts everywhere), the ability to re-use objects (e.g., if static and dynamic analysis find the same objects, you can create one object and just reference it accordingly), and a more compact serialization. See the attached JSON example for what this looks like in practice – this is a modified version of the “Malware Instance with Analysis Data” example currently in the 2.1 spec.
     
    2. Currently, the “av-results-type”, used to capture AV classification results, has only optional properties and the text specifies that at least one must be included. This allows you to construct some odd, but spec-valid instances, such as an AV classification with only the engine version. In order to make this type more useful, I’d suggest that we make “product” (the name of the tool performing the scan) and “scanned” (the date/time the scan occurred) required, so that way you’ll at least have this minimum set of useful data for each instance. In addition, we should probably add some text stating that the “result” property (the actual AV classification result, e.g., “Trojan.Zeus”) must be included if the tool reports some classification during the scan.
     
    Let me know what you think – if we can get these final things wrapped up, we’re that much closer to getting STIX 2.1 out the door.
     
    Regards,
    Ivan


    This email and any attachments thereto may contain private, confidential, and/or privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is
    strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.



    This message is confidential and subject to terms at: http://www.jpmorgan.com/emaildisclaimer including on confidentiality, legal privilege, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.





  • 4.  RE: [cti] Finalizing the STIX 2.1 Malware Object

    Posted 06-29-2018 22:56
    Thanks for putting this together.  I don’t think it quite captured what Gary and I had proposed, however, as we actually used single “ref” relationships to observed data elements rather than “refs”.  This means that we can capture the entire static or dynamic analysis result as a single graph.
     
    While it is still a bit larger than the other files (8,175 bytes vs. the current 5,006 bytes), it is a fair bit smaller than the original version at 11,936 bytes while still following existing rules and allowing for a potentially greater degree of fidelity.
     
    I’ve attached an example of what this looks like, although I did sort of cheat by throwing in an “action_extension” for the file object as I couldn’t find any way to say that one file created another.
     
    Jeffrey Mates, Civ DC3/DCCI
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Computer Scientist
    Defense Cyber Crime Institute
    jeffrey.mates@dc3.mil
    410-694-4335

    Attachment: malware_example_observed_data_revised.json
    Description: Binary data

    Attachment: smime.p7s
    Description: S/MIME cryptographic signature