OASIS Cyber Threat Intelligence (CTI) TC

  • 1.  Re: [cti] RE: Detailed sub-issues for the topic of refactoring "sources"

    Posted 02-10-2016 13:55




    So I’ll just throw out that I’m OK with this approach, though I do believe we’re making things more complex than I (and maybe others) would hope. My general thoughts are:

    created_by_ref needs to be optional. Its absence should mean that the STIX producer is not providing their identity. “Unknown” identities are meaningless in this scenario since the producer always knows their own identity and so could populate it if they want. Unknown identities will apply to derived sources (e.g., I pulled this e-mail dump off pastebin but I don’t know who posted it) and we’ll need to define a canonical way to describe this (it’s tough with relationships because you need an explicit endpoint).

    I have no idea what to do about external IDs. I will say that I kind of hate that we now have to use a relationship to indicate the CVE ID for a vulnerability, which I believe is a result of this proposal:

    Vulnerability (empty) ===> HAS SOURCE [id] ===> Reference [CVE-2016-1234]

    Maybe this means “Vulnerability”, “Configuration”, “Weakness” should all just be types of reference?

    This would also apply to representing your incident ID in an external system, e.g. Incident ===> HAS SOURCE [external id] ===> Reference [Internal Ticket 1234], which is also not awesome. Personally, I think I still like the static list of references we had in 1.2 (it was super easy!), and it even supports Sarah’s use case, but I do understand the desire of others to have confidence on these so I’m relatively OK with this. It does make simple things pretty difficult though (you now have to track 3 objects rather than 1).

    I share Terry’s concerns about multi-source (and multi-dest) relationships. IMO we don’t understand them enough to add them, and nobody has really seemed concerned about message size yet, so let’s not. I also agree with Terry about the additional “Role” field, now that I think about it. Just use separate types of relationships and, if a specific Identity/Tool/Reference fills multiple roles, have multiple relationships. That’s really what you’re saying anyway.
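
    To make the "3 objects rather than 1" point concrete, here is a rough Python sketch. The dict shapes, the field names ("references", "from", "to", "nature") and the object IDs are assumptions made up for illustration, not anything agreed in the proposal; only the object kinds, the Has Source relationship and the CVE ID come from the examples above.

```python
# Hypothetical dict shapes for illustration; only the object kinds, the
# "Has Source" relationship, and the CVE ID come from the discussion above.

def inline_form(cve_id):
    """One object: the vulnerability carries a static list of references."""
    return [{"type": "vulnerability", "id": "vulnerability-1", "references": [cve_id]}]


def relationship_form(cve_id):
    """Three objects: the vulnerability, the reference, and the relationship joining them."""
    vulnerability = {"type": "vulnerability", "id": "vulnerability-1"}
    reference = {"type": "reference", "id": "reference-1", "external_identifier": cve_id}
    has_source = {
        "type": "relationship",
        "id": "relationship-1",
        "nature": "Has Source",
        "from": vulnerability["id"],
        "to": reference["id"],
    }
    return [vulnerability, has_source, reference]


def external_ids(objects, object_id):
    """Collect external IDs for an object, following Has Source relationships if needed."""
    by_id = {obj["id"]: obj for obj in objects}
    found = list(by_id[object_id].get("references", []))
    for obj in objects:
        is_source_link = obj["type"] == "relationship" and obj["nature"] == "Has Source"
        if is_source_link and obj["from"] == object_id:
            found.append(by_id[obj["to"]]["external_identifier"])
    return found
```

    Both forms answer the same question ("what is the CVE ID for this vulnerability?"), but the relationship form needs an index over every object and a walk across relationships to get there.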






    Overall though I feel like we’re all trying to design this in a vacuum and as we try to represent our own *real* content in it we’ll quickly discover things that we can’t guess at now. So IMO we should try to quickly arrive at some consensus for the outstanding issues and then move on.


    John




    From: < cti@lists.oasis-open.org > on behalf of Terry MacDonald < terry@soltra.com >
    Date: Tuesday, February 9, 2016 at 11:45 PM
    To: Sean Barnum < sbarnum@mitre.org >, " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Subject: [cti] RE: Detailed sub-issues for the topic of refactoring "sources"








    Hi Sean,
    My take on things is inline…


    What should the structure of the Reference TLO look like?


    Current proposal is three properties (this addresses the “external_ids” issue as well):


    “reference_URL” (optional) - specifies a URL to the external reference
    “external_identifier” (optional) - specifies an identifier for the external reference content
    “defining_context” (optional) - specifies the context within which the external_identifier is defined (system, registry, organization, etc.)
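
    As a rough illustration of the three-property structure listed above, a minimal Python sketch follows. The class, the to_dict helper and the "cve" / "internal-ticketing" context values are assumptions for illustration only; the property names and the CVE and ticket identifiers are taken from this thread.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Reference:
    """Sketch of the proposed Reference TLO; all three properties are optional."""
    reference_URL: Optional[str] = None
    external_identifier: Optional[str] = None
    defining_context: Optional[str] = None

    def to_dict(self):
        # Leave out properties that were not supplied.
        return {key: value for key, value in asdict(self).items() if value is not None}


# The defining_context values here are made-up placeholders, not a proposed vocabulary.
cve = Reference(external_identifier="CVE-2016-1234", defining_context="cve")
ticket = Reference(external_identifier="Internal Ticket 1234",
                   defining_context="internal-ticketing")
```

    Note that the same structure holds both a public registry ID and an internal ticket number, which is the conflation discussed next.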


    I’m not a fan of conflating disparate things. In my mind there are two distinct things that we are trying to capture:

    1. A reference to documentation that contains information that was referred to during creation of this object.

    2. Any link to internal tooling that provides more information about this, such as an internal IDS alerting interface, or an internal helpdesk ticketing solution.

    Item 1 is ‘where I got the information from’. Information of this type could be thought of as ‘evidence that led me to a conclusion’. The referenced document can be thought of as evidence that proves the conclusions reached by the object creator when they constructed the object.

    This idea in turn made me wonder if information of the type described in item 1 may be better stored as ‘Observation’ objects that contain CybOX Web Content observables, and then we use an evidenced-by relationship to tie them to the object that references their information.

    Item 2 is ‘what we call this in our non-STIX tools’. It’s simply an external identifier. This information I am fine with storing in a list of references/external_ids – although I would prefer a field name of something like external-tool-references. E.g.

    "references": {
                    "reference-url" (optional) - specifies a URL to the external reference
                    "external-identifier" (optional) - specifies an identifier for the external reference content
    }
    Item 1 and item 2 are two very different things in my opinion, and it is suboptimal to conflate them.
     


    Use “created_by_ref” on each TLO as shorthand for “Producer” source relationships?


    Should “created_by_ref” shorthand and “Producer” Has Source relationship both be explicitly supported?
    What if there is conflicting information? Should “created_by_ref” be optional or required?

     
    I firmly believe that the created-by-ref should be optional to enable anonymity.

    Created-by-ref is the smallest way to document who created the object, and has the smallest amount of indirection before you reach the Identity of the organization or individual who did. This may become important when we begin to validate the object creator cryptographically and calculate HMACs for objects in the future.

    I am fine with there also being a producer relationship. I still feel that the word Producer is a little ambiguous as it doesn’t clarify if it’s referring to the object creator, or the creator of the information contained within the object (which are often the same but may be different).

    I do note that this goes against our design philosophy of there only being one way to do it. I would prefer that only one way was chosen (and of course I think it’s the created-by-ref way). This may be something we need an official vote for.


    Does leaving out both a created_by_ref and has_source relationship for an object imply anonymity?
    It must. We must allow producers to provide information anonymously. It is critical for government producers. That said, we also need to ensure that consumers who receive only part of the available dataset have a way to find out if there is more data available. For this reason I’ve previously proposed a standard process for looking up Objects by ID. It leverages the TAXII Query lookup mechanism and the STIX Request/Response mechanism:
    https://docs.google.com/drawings/d/1dXGJTWhInAshktrCiOjF4zcVZb8WJIDaL22WeUnt8zE/pub?w=626&h=573


    What values should be in the default vocabulary for the Roles property on Has Source relationships?
    I need to go through this in more detail before I comment deeply on this.

    Looking at other recent example posts I am not a fan of the multi-tiered structure for relationships where there is a ‘Nature’ category-level field and then a ‘Role’ associated with it. It seems overly complex. Why not just have a single field that describes succinctly what we’re trying to describe?


    Should “Has Source” relationship support many-to-one capability to assert lots of STIX content has the same source in an efficient manner? Is the “efficiency” issue necessary to address at this time?


    Basically this:


    Relationship


    ID = id-5
    From = id-2, id-3, id-4
    To = id-1
    Nature = Has Source


    Versus this:


    Relationship


    ID = id-5
    From = id-2
    To = id-1
    Nature = Has Source

    Relationship


    ID = id-6
    From = id-3
    To = id-1
    Nature = Has Source

    Relationship


    ID = id-7
    From = id-4
    To = id-1
    Nature = Has Source



    At this stage I would say no. One-to-many relationships are something we need to investigate in more detail as they would apply in all places across STIX. We need to work out what caveats and problems will be caused by this. If we are changing from a 1:1 style relationship model to a 1:M model then we start to have difficulties such as:

    · Org A releases a relationship relating a single Campaign to 3 indicators.
    · Org B realizes that the 3rd indicator is actually related to another Campaign.

        o With the new opinion object detailed in the TWIGS proposal, organizations have the ability to comment on another organization’s published objects. How will this work to tell Org A that they were wrong but only on the 3rd indicator?

        o What if Indicator 2 is TLP Red and the rest are TLP Green? Does Org A need to create a different relationship object that doesn’t contain the 2nd Indicator for that group?

    Summarization needs to be looked at across the whole model, not just at the Indicator level. For this reason I would like to see it discussed specifically at a later date, in a different tranche.
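
    The trade-off between the two forms shown earlier can be sketched in a few lines of Python. This is only an illustration under assumptions: the dict layout reuses the ID/From/To/Nature labels from the example, the expand and shareable helpers are hypothetical, and the TLP markings are invented (the second source object, id-3, is marked red to mirror the "Indicator 2" case). Markings on the To endpoint are ignored for simplicity.

```python
def expand(relationship):
    """Split one many-to-one relationship into one relationship per source object."""
    first = int(relationship["ID"].split("-")[1])
    return [
        {
            "ID": f"id-{first + offset}",
            "From": [source],
            "To": relationship["To"],
            "Nature": relationship["Nature"],
        }
        for offset, source in enumerate(relationship["From"])
    ]


def shareable(relationships, tlp, allowed):
    """Keep only relationships whose source objects all carry an allowed marking."""
    return [rel for rel in relationships if all(tlp[obj] in allowed for obj in rel["From"])]


many_to_one = {"ID": "id-5", "From": ["id-2", "id-3", "id-4"], "To": "id-1",
               "Nature": "Has Source"}
one_to_one = expand(many_to_one)

# Invented markings: the second source object is TLP Red, the others TLP Green.
tlp = {"id-2": "green", "id-3": "red", "id-4": "green"}
```

    With these markings, shareable([many_to_one], tlp, {"green"}) keeps nothing, because the single red member blocks the whole relationship, while shareable(one_to_one, tlp, {"green"}) keeps the relationships for id-2 and id-4. The 1:1 form costs more objects but lets each assertion be marked, shared, or disputed on its own.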
     
    FWIW at this stage I believe that there is a distinction between the STIX data model, and information gleaned by analysis of the data in the STIX model.
     
    Cheers
     

    Terry MacDonald
    Senior STIX Subject Matter Expert
    SOLTRA   An FS-ISAC and DTCC Company
    +61 (407) 203 206
    terry@soltra.com
     

     


    From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org] On Behalf Of Barnum, Sean D.
    Sent: Wednesday, 10 February 2016 7:37 AM
    To: cti@lists.oasis-open.org
    Subject: [cti] Detailed sub-issues for the topic of refactoring "sources"


     


    Here is the promised follow-up to the email with the concise high-level proposal statement for refactoring sources.


     


    This is a stab at identifying several of the key sub-issues that need to be discussed/decided to make the proposed approach a practical reality.


     


     


     


    More detailed sub-issues to be discussed/decided:




    What should the structure of the Reference TLO look like?



    Current proposal is three properties (this addresses the “external_ids” issue as well):





    “reference_URL” (optional) - specifies a URL to the external reference
    “external_identifier” (optional) - specifies an identifier for the external reference content
    “defining_context” (optional) - specifies the context within which the external_identifier is defined (system, registry, organization, etc.)




    Use “created_by_ref” on each TLO as shorthand for “Producer” source relationships?


    Should “created_by_ref” shorthand and “Producer” Has Source relationship both be explicitly supported?
    What if there is conflicting information? Should “created_by_ref” be optional or required?



    Does leaving out both a created_by_ref and has_source relationship for an object imply anonymity?
    What values should be in the default vocabulary for the Roles property on Has Source relationships?
    Should “Has Source” relationship support many-to-one capability to assert lots of STIX content has the same source in an efficient manner? Is the “efficiency” issue necessary to address at this time?



    Basically this:





    Relationship







    ID = id-5
    From = id-2, id-3, id-4
    To = id-1
    Nature = Has Source






    Versus this:





    Relationship







    ID = id-5
    From = id-2
    To = id-1
    Nature = Has Source







    Relationship







    ID = id-6
    From = id-3
    To = id-1
    Nature = Has Source







    Relationship







    ID = id-7
    From = id-4
    To = id-1
    Nature = Has Source














  • 2.  Re: [cti] RE: Detailed sub-issues for the topic of refactoring "sources"

    Posted 02-11-2016 12:32




    > Overall though I feel like we’re all trying to design this in a vacuum … we should try to quickly arrive at some consensus for the outstanding issues and then move on.


    I agree with this 100%. No matter how much time we spend on this topic, we will find new things when we consider it within CTI as a whole. Personally I’d be happy with getting the current proposal into the pre-draft; then I can write code and really evaluate the structure/concepts.


    Thank you.
    -Mark








    From: < cti@lists.oasis-open.org > on behalf of "Wunder, John A." < jwunder@mitre.org >
    Date: Wednesday, February 10, 2016 at 8:55 AM
    To: " cti@lists.oasis-open.org " < cti@lists.oasis-open.org >
    Subject: Re: [cti] RE: Detailed sub-issues for the topic of refactoring "sources"





