OASIS Cyber Threat Intelligence (CTI) TC


Patterning Operators - CONTAINS

  • 1.  Patterning Operators - CONTAINS

    Posted 10-26-2016 19:19




    As we briefly mentioned on last week’s TC call, one of the open issues in the Patterning specification is around the CONTAINS Comparison Operator:
     




    Set Operator: a CONTAINS b
    Description:  a and b MUST be either an Object Path that uses either the ipv4-addr or ipv6-addr Object to specify an IPv4 or IPv6 address value OR a string that represents a particular IPv4 or IPv6 address. b MUST be wholly contained within a.
    Example:      '192.168.10.0/24' CONTAINS ipv4-addr:value
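
    As a rough illustration of "b MUST be wholly contained within a", here is a minimal Python sketch built on the standard-library ipaddress module (Python 3.7+ for subnet_of). The function name contains() and the decision to treat a bare address as a /32 (or /128) network are assumptions made for this example; the Patterning specification does not define them.

```python
import ipaddress


def contains(a: str, b: str) -> bool:
    """Return True if b is wholly contained within a.

    Each argument may be a single address ('192.168.10.7') or a CIDR
    block ('192.168.10.0/24'). A bare address is read as a one-address
    network (/32 for IPv4, /128 for IPv6). This is an assumption of the sketch.
    """
    net_a = ipaddress.ip_network(a, strict=False)
    net_b = ipaddress.ip_network(b, strict=False)
    if net_a.version != net_b.version:
        # An IPv4 range never contains an IPv6 value, and vice versa.
        return False
    return net_b.subnet_of(net_a)


print(contains("192.168.10.0/24", "192.168.10.7"))   # True
print(contains("192.168.10.0/24", "192.168.11.7"))   # False
print(contains("192.168.10.7", "192.168.10.0/24"))   # False
```

    Note that the result depends on argument order, which is why it matters which side of the operator carries the Object Path.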




     
    The issues around this operator are two-fold:
     
    1. Unlike all other Comparison Operators, CONTAINS supports Object Paths in both arguments (i.e., not just on the left-hand side). This was done intentionally, to permit the expression of patterns around blacklisting (e.g., that a particular IP address falls into a particular CIDR range), as well as testing whether a particular subnet contains a specific IP address:

       a. '192.168.10.0/24' CONTAINS ipv4-addr:value
       b. ipv4-addr:value CONTAINS '192.168.10.0'

    2. There’s been some consternation around the name “CONTAINS”: some find it confusing because they think it’s a substring operator (as in many programming languages), while others just don’t think it’s clear enough. On the second point, the issue with changing the name to something more specific (e.g., INSUBNET) is that doing so also changes the abstract nature of the operator, meaning that it can’t be used with additional Cyber Observable Objects in the future.
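
    To make the naming concern concrete, the following sketch contrasts a substring test with a subnet test on the same kind of input. Both helper names are hypothetical and exist only for this comparison; the subnet test uses Python's standard ipaddress module.

```python
import ipaddress


def substring_contains(haystack: str, needle: str) -> bool:
    # What many programming languages mean by "contains".
    return needle in haystack


def subnet_contains(cidr: str, address: str) -> bool:
    # What the CONTAINS operator in this thread is meant to express.
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


# Two different hosts, yet one string is a substring of the other.
print(substring_contains("192.168.10.25", "192.168.10.2"))      # True

# The address is inside the subnet, but is not a substring of the CIDR text.
print(subnet_contains("192.168.10.0/24", "192.168.10.25"))      # True
print(substring_contains("192.168.10.0/24", "192.168.10.25"))   # False
```

    The two readings disagree in both directions, which is presumably why a reader who assumes substring semantics would be misled by the current name.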
     
    An idea that Trey, John-Mark, and I have kicked around is to replace this operator with two generic set operators, ISSUBSET and ISSUPERSET:
     




    Set Operator: a ISSUBSET b
    Description:  a MUST be a subset of b.
    Example:      ipv4-addr:value ISSUBSET '192.168.10.0/8'

    Set Operator: a ISSUPERSET b
    Description:  b MUST be wholly contained within a.
    Example:      ipv4-addr:value ISSUPERSET '192.168.10.0'





    That way we can still support the blacklisting use case, consistently have Object Paths on the left-hand side of every operator, and also retain a level of abstraction that permits use with future Objects.
    Any thoughts on this? Does this seem like a preferable alternative to CONTAINS?
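
    One possible reading of the two proposed operators, sketched in Python with the observed value always on the left-hand side. It reads a ISSUBSET b as "a is a subset of b", which is what the example suggests. The function names and the use of strict=False are assumptions for illustration: the example literal '192.168.10.0/8' has host bits set, and Python's ipaddress module rejects it unless told to mask them off.

```python
import ipaddress


def _as_network(value: str):
    # strict=False masks off host bits, so '192.168.10.0/8' is read as
    # 192.0.0.0/8. A bare address becomes a /32 (or /128) network.
    return ipaddress.ip_network(value, strict=False)


def is_subset(observed: str, literal: str) -> bool:
    """observed ISSUBSET literal: observed lies wholly within literal."""
    a, b = _as_network(observed), _as_network(literal)
    return a.version == b.version and a.subnet_of(b)


def is_superset(observed: str, literal: str) -> bool:
    """observed ISSUPERSET literal: literal lies wholly within observed."""
    a, b = _as_network(observed), _as_network(literal)
    return a.version == b.version and a.supernet_of(b)


print(is_subset("192.168.10.5", "192.168.10.0/8"))      # True
print(is_superset("192.168.10.0/24", "192.168.10.0"))   # True
print(is_superset("192.168.10.5", "192.168.10.0"))      # False
```

    Under this reading, ISSUPERSET only matches when the observed value is itself a range (or the same single address), so how often observed data carries a CIDR block would decide how useful it is.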
     
    Regards,
    Ivan






  • 2.  RE: Patterning Operators - CONTAINS

    Posted 10-26-2016 19:50




    I strongly support any plan that makes the LHS always be an object path and RHS always be a value, so I would be OK with this.
     
    I’m a little concerned for three reasons:
     
    - There’s already the IN operator, where the RHS is a set (in this case, a literal set of values vs. the more abstract notion of a CIDR block). I agree there’s some risk/complexity in allowing a “special” case of IN where the LHS is an ipv4-addr:value or ipv6-addr:value and the RHS is a CIDR block, and for this reason separate operators are OK. But I expect a lot of people to want (ipv4-addr:value IN “192.168.0.0/16”) to work if both IN and ISSUBSET support arbitrary sets.
    - The “set” semantics are not entirely clear. A single IP address is not a subset of anything. *The set consisting of* a single IP address may be a subset of a larger set of IPs. Also, what if I want to represent a more arbitrary set of IPs where the last octet is 255 (X.X.X.255, or netmask 00000000000000000000000011111111), or more complex bit patterns? It’s still technically a “subset”, but not really representable using CIDR/IP-literal notation.
    - The examples currently in the document (ipv4-addr:value ISSUBSET '192.168.10.0/8' and '192.168.10.0/8' ISSUPERSET ipv4-addr:value) actually mean the same thing, so I would be concerned about duplicate ways of expressing the same thing. I can’t think of a good use case for the ISSUPERSET version in the message below when it comes to STIX indicators matching observed data. You won’t often observe a CIDR block, except maybe in the case of “this AS is assigned this CIDR block”, which could be expressed as a pseudo-pattern “find me an AS whose assigned IPs contain 1.2.3.4”. But I don’t think the current autonomous-system object supports this data.
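
    The last two concerns can be sketched in a few lines of Python. This is an illustration under assumptions, not proposed spec behaviour: singleton() and matches_mask() are hypothetical helpers, and the mask-based match is only one way to express the "last octet is 255" case, which no single CIDR block can describe.

```python
import ipaddress


def singleton(address: str):
    """The set consisting of one address, as a /32 (or /128) network."""
    return ipaddress.ip_network(address)


def matches_mask(address: str, mask: str, value: str) -> bool:
    """True if the bits of `address` selected by `mask` equal those of `value`."""
    addr = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    return (addr & m) == (int(ipaddress.IPv4Address(value)) & m)


block = ipaddress.ip_network("192.168.0.0/16")
one = singleton("192.168.10.7")

# Subset and superset are the same statement with the operands swapped.
print(one.subnet_of(block))     # True
print(block.supernet_of(one))   # True

# "Last octet is 255": a non-contiguous mask, not expressible as a CIDR prefix.
print(matches_mask("10.20.30.255", "0.0.0.255", "0.0.0.255"))   # True
print(matches_mask("10.20.30.254", "0.0.0.255", "0.0.0.255"))   # False
```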
     
    For these reasons, I would prefer a single-purpose operator “INSUBNET” with an ipv4-addr:value or ipv6-addr:value on the LHS and a CIDR block literal on the right. It’s a common enough use case that it’s critical
    (IMO) for MVP, without us needing to come up with a more general solution.
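
    For illustration only, here is a small sketch of how a single-purpose INSUBNET comparison might be evaluated, with the Object Path fixed on the LHS and a CIDR literal on the RHS. INSUBNET is the operator name suggested above, not an existing one; the regular expression, the matches() function, and the dictionary shape of the observed object are all assumptions of this sketch.

```python
import ipaddress
import re

# Hypothetical grammar: <object-type>:value INSUBNET '<cidr>'
_PATTERN = re.compile(
    r"^\s*(ipv4-addr|ipv6-addr):value\s+INSUBNET\s+'([^']+)'\s*$"
)


def matches(pattern: str, observed: dict) -> bool:
    """Evaluate a one-comparison INSUBNET pattern against one observed object."""
    m = _PATTERN.match(pattern)
    if m is None:
        # Anything other than "path INSUBNET literal" is rejected outright.
        raise ValueError(f"unsupported pattern: {pattern!r}")
    object_type, cidr = m.groups()
    if observed.get("type") != object_type:
        return False
    network = ipaddress.ip_network(cidr)  # strict: host bits must be zero
    address = ipaddress.ip_address(observed["value"])
    return address.version == network.version and address in network


obs = {"type": "ipv4-addr", "value": "192.168.10.77"}
print(matches("ipv4-addr:value INSUBNET '192.168.10.0/24'", obs))   # True
print(matches("ipv4-addr:value INSUBNET '10.0.0.0/8'", obs))        # False
```

    Because the grammar only admits an Object Path on the left, the reversed form ('192.168.10.0/24' INSUBNET ipv4-addr:value) raises ValueError instead of silently meaning something else.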

     
    Greg
     



    From: cti@lists.oasis-open.org [mailto:cti@lists.oasis-open.org]
    On Behalf Of Kirillov, Ivan A.
    Sent: Wednesday, October 26, 2016 2:18 PM
    To: cti@lists.oasis-open.org
    Subject: [cti] Patterning Operators - CONTAINS


     







  • 3.  Re: Patterning Operators - CONTAINS

    Posted 10-26-2016 21:00
    Given that most Indicators today, and probably for the next 12-18 months, are going to be simple matches, it really feels like we are trying to solve all of future patterning at once. As an analogy, it feels like you are building an amazing new McLaren while the rest of STIX is still figuring out how to build a horse and buggy for the dirt road. Yes, eventually we will get to super-high-performance freeways, and when we do, your McLaren will be there waiting. But maybe we just need something a lot simpler for this first release.

    Bret

    From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Back, Greg <gback@mitre.org>
    Sent: Wednesday, October 26, 2016 1:49:39 PM
    To: Kirillov, Ivan A.; cti@lists.oasis-open.org
    Subject: [cti] RE: Patterning Operators - CONTAINS


  • 4.  RE: [cti] Re: Patterning Operators - CONTAINS

    Posted 10-26-2016 21:24
    Completely disagree. Simple indicators are the root of the problem, not something we should encourage. Patterning is the only thing that can make CTI data actually useful to operators. I get tons of complaints about ‘false positives’ because the simple indicator misses critical context that was known and available.

    From: Bret Jordan (CS)
    Sent: Wednesday, October 26, 2016 5:00 PM
    To: Back, Greg; Kirillov, Ivan A.; cti@lists.oasis-open.org
    Subject: [cti] Re: Patterning Operators - CONTAINS


  • 5.  Re: [cti] Re: Patterning Operators - CONTAINS

    Posted 10-26-2016 22:32
    You disagree that the majority of indicators today are simple matches? Or do you disagree that for the next 12-18 months we will still have simple matches?

    Bret

    From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Mark Clancy <mclancy@soltra.com>
    Sent: Wednesday, October 26, 2016 3:23 PM
    To: Bret Jordan (CS); Back, Greg; Kirillov, Ivan A.; cti@lists.oasis-open.org
    Subject: RE: [cti] Re: Patterning Operators - CONTAINS


  • 6.  RE: [cti] Re: Patterning Operators - CONTAINS

    Posted 10-26-2016 23:09
    We need to build for the world we want, not the one we have. Today the majority of indicators are simple and basically useless as a result. If we minimize the spec as ‘MVP’ and remove patterning, in 18 months we will be nowhere better and will still have simple and basically useless indicators with too many avoidable false positives, due to no ability to add context from the patterning. We don't need STIX 2.0 for that world.

    From: Bret Jordan (CS)
    Sent: Wednesday, October 26, 2016 6:32 PM
    To: Mark Clancy; Back, Greg; Kirillov, Ivan A.; cti@lists.oasis-open.org
    Subject: Re: [cti] Re: Patterning Operators - CONTAINS


  • 7.  Re: [cti] Re: Patterning Operators - CONTAINS

    Posted 10-26-2016 23:18
    +1 to Mark's comments

    On Wed, Oct 26, 2016 at 7:09 PM -0400, "Mark Clancy" <mclancy@soltra.com> wrote:

    We need to build for the world we want, not the one we have. Today the majority of the indicators are simple and basically useless as a result. If we minimize the spec as 'MVP' and remove patterning, in 18 months we will be nowhere better and still have simple and basically useless indicators with too many avoidable false positives due to no ability to add context from the patterning. We don't need STIX 2.0 for that world.

    From: Bret Jordan (CS)
    Sent: Wednesday, October 26, 2016 6:32 PM
    To: Mark Clancy; Back, Greg; Kirillov, Ivan A.; cti@lists.oasis-open.org
    Subject: Re: [cti] Re: Patterning Operators - CONTAINS

    You disagree that the majority of indicators today are simple matches? Or do you disagree that for the next 12-18 months we will still have simple matches?

    Bret

    From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Mark Clancy <mclancy@soltra.com>
    Sent: Wednesday, October 26, 2016 3:23 PM
    To: Bret Jordan (CS); Back, Greg; Kirillov, Ivan A.; cti@lists.oasis-open.org
    Subject: RE: [cti] Re: Patterning Operators - CONTAINS

    Completely disagree. Simple indicators are the root of the problem, not something we should encourage. Patterning is the only thing that can make CTI data actually useful to operators. I get tons of complaints about 'false positives' because the simple indicator misses critical context that was known and available.

    From: Bret Jordan (CS)
    Sent: Wednesday, October 26, 2016 5:00 PM
    To: Back, Greg; Kirillov, Ivan A.; cti@lists.oasis-open.org
    Subject: [cti] Re: Patterning Operators - CONTAINS

    Given that most Indicators today and probably for the next 12-18 months are going to be a simple match, it really feels like we are trying to solve all of future patterning all at once.

    In an analogy, it feels like you are building an amazing new McLaren and the rest of STIX is still figuring out how to build horse and buggies for the dirt road. Yes, eventually we will get to super high performance freeways, and when we do, your McLaren will be there waiting. But maybe we just need something a lot simpler for this first release.

    Bret

    From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Back, Greg <gback@mitre.org>
    Sent: Wednesday, October 26, 2016 1:49:39 PM
    To: Kirillov, Ivan A.; cti@lists.oasis-open.org
    Subject: [cti] RE: Patterning Operators - CONTAINS

    I strongly support any plan that makes the LHS always be an object path and RHS always be a value, so I would be OK with this.

    I'm a little concerned for three reasons:

    - There's already the IN operator, where the RHS is a set (in this case, a literal set of values vs the more-abstract notion of a CIDR block). I agree there's some risk/complexity in allowing a "special" case of IN to be where the LHS is an ipv4-addr:value or ipv6-addr:value, and the RHS is a CIDR block, and for this reason separate operators are OK. But I expect a lot of people to want (ipv4-addr:value IN "192.168.0.0/16") to work if both IN and INSUBSET support arbitrary sets.

    - The "set" semantics are not entirely clear. A single IP address is not a subset of anything. *The set consisting of* a single IP address may be a subset of a larger set of IPs. Also, what if I want to represent a more arbitrary set of IPs where the last octet is 255 (X.X.X.255 or netmask 00000000000000000000000011111111) or more complex bit patterns? It's still technically a "subset", but not really representable using CIDR/IP-literal notation.

    - The examples currently in the document (ipv4-addr:value ISSUBSET '192.168.10.0/8' and '192.168.10.0/8' ISSUPERSET ipv4-addr:value) actually mean the same thing, so I would be concerned about duplicate ways of expressing the same thing. I can't think of a good use case for the ISSUPERSET version in the message below when it comes to STIX indicators matching observed data. You won't often observe a CIDR block, except maybe in the case of "this AS is assigned this CIDR block", which could be expressed as a pseudo pattern "find me an AS whose assigned IPs contains 1.2.3.4". But I don't think the current autonomous-system object supports this data.

    For these reasons, I would prefer a single-purpose operator "INSUBNET" with an ipv4-addr:value or ipv6-addr:value on the LHS and a CIDR block literal on the right. It's a common enough use case that it's critical (IMO) for MVP, without us needing to come up with a more general solution.

    Greg
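    For readers following the INSUBNET proposal and the "last octet is 255" concern quoted above, here is a minimal Python sketch of the two kinds of match being contrasted. It is an illustration only, not part of the thread or of any STIX specification: the function names in_subnet and matches_mask are hypothetical, and the sketch assumes Python's standard ipaddress module as the matching engine.

```python
import ipaddress


def in_subnet(address: str, cidr: str) -> bool:
    """Hypothetical INSUBNET: True if `address` falls inside the CIDR block `cidr`."""
    # strict=False accepts literals with host bits set, such as '192.168.10.0/8'
    network = ipaddress.ip_network(cidr, strict=False)
    ip = ipaddress.ip_address(address)
    # An IPv4 address is never inside an IPv6 block, and vice versa
    return ip.version == network.version and ip in network


def matches_mask(address: str, value: str, mask: str) -> bool:
    """Hypothetical arbitrary bit-pattern match: (address & mask) == (value & mask)."""
    a = int(ipaddress.IPv4Address(address))
    v = int(ipaddress.IPv4Address(value))
    m = int(ipaddress.IPv4Address(mask))
    return (a & m) == (v & m)


# CIDR containment: address on the left, CIDR literal on the right
print(in_subnet("192.168.10.7", "192.168.10.0/24"))
# "Last octet is 255" uses the mask 0.0.0.255, which no CIDR prefix can express
print(matches_mask("10.1.2.255", "0.0.0.255", "0.0.0.255"))
```

    The second function shows why the "subset" wording is broader than CIDR notation: a CIDR prefix only covers masks whose set bits are contiguous from the left, while the X.X.X.255 case needs a mask with set bits on the right.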


  • 8.  Re: [cti] Re: Patterning Operators - CONTAINS

    Posted 10-27-2016 17:06
    +1 to Mark's comments as well.

    From: cti@lists.oasis-open.org <cti@lists.oasis-open.org> on behalf of Paul Patrick <Paul.Patrick@FireEye.com>
    Sent: Wednesday, October 26, 2016 7:17:40 PM
    To: cti@lists.oasis-open.org; Kirillov, Ivan A.; Mark Clancy; Bret Jordan (CS); Back, Greg
    Subject: Re: [cti] Re: Patterning Operators - CONTAINS

    +1 to Mark's comments


  • 9.  RE: [cti] Re: Patterning Operators - CONTAINS

    Posted 10-27-2016 21:33
    I don't think anyone has suggested removing patterning entirely. I don't think that basic indicators are (basically or otherwise) "useless". I sincerely hope that removing some parts of patterning now doesn't mean we have to wait 18 months for another chance to add them.

    I certainly agree that there are things that would be valuable to represent as indicators, that can't be represented with simple patterns. For example, malware that creates an executable file with a random name and adds a registry key that runs that file on startup. But even that scenario can't be represented using the current patterning language, which can't express relationships between two different property values.

    The biggest issue with patterning right now, in my opinion, is the lack of clear definitions on what the "context" is that any pattern should be evaluated in. Network sensors, host-based agents, and SIEM tools all have different operating models. Determining what counts as the "same observation" vs. "different observation" has profound implications on the way patterns are currently constructed. For a lot of patterns, the context seems underspecified to me, in the "do what I mean, not what I say" sense.

    Having an "all or nothing" attitude towards patterning violates the spirit of MVP and iteration based on the collective experience we get from seeing prior versions used in practice.

    Greg


  • 10.  Re: [cti] Re: Patterning Operators - CONTAINS

    Posted 10-27-2016 21:41
    Yes, my hope is that we could do the parts of patterning we really understand and know well in this initial MVP release. Then, over the next few months, add in more and more pieces as we fully understand them and flesh them out.

    Right now we are waiting for the development efforts on Patterning to be finished so the TC can review it. So if we can release a smaller patterning grammar for this first MVP release, and add the rest in the coming months, then that might get this initial release out the door sooner. The more complex Patterning is, the longer it will take during the review phase.

    Let's just get something out that solves basic needs now, and make it a solid foundation from which to build.

    Bret


  • 11.  Re: [cti] RE: Patterning Operators - CONTAINS

    Posted 10-27-2016 23:41
    Back, Greg wrote this message on Wed, Oct 26, 2016 at 19:49 +0000:
    > I strongly support any plan that makes the LHS always be an object path and RHS always be a value, so I would be OK with this.
    >
    > I’m a little concerned for three reasons:
    >
    > - There’s already the IN operator, where the RHS is a set (in this case, a literal set of values vs the more-abstract notion of a CIDR block). I agree there’s some risk/complexity in allowing a “special” case of IN to be where the LHS is an ipv4-addr:value or ipv6:value, and the RHS is a CIDR block, and for this reason separate operators are OK. But I expect a lot of people to want (ipv4-addr:value IN “192.168.0.0/16”) to work if both IN and INSUBSET support arbitrary sets.

    Forgot to address this issue in my email. IN is not the same as CONTAINS. a IN (x, y, ...) is shorthand for a == x OR a == y ..., and so not a proper set operation. This means that we cannot repurpose IN w/o a confusing change in the behavior of the IN operator. As Jason pointed out, Oracle uses a new operator to handle IP subset identification.

    > - The “set” semantics are not entirely clear. A single IP address is not a subset of anything. *The set consisting of* a single IP address may be a subset of a larger set of IPs. Also, what if I want to represent a more arbitrary set of IPs where the last octet is 255 (X.X.X.255 or netmask 00000000000000000000000011111111) or more complex bit patterns? It’s still technically a “subset”, but not really representable using CIDR/IP-literal notation.

    It's easy to make simple cases fail. I worked at nCircle when we were doing IP/CIDR matching, and it isn't an easy task.

    > - The examples currently in the document (ipv4-addr:value ISSUBSET '192.168.10.0/8' and '192.168.10.0/8' ISSUPERSET ipv4-addr:value) actually mean the same thing, so I would be concerned about duplicate ways of expressing the same thing.
    >
    > I can’t think of a good use case for the ISSUPERSET version in the message below when it comes to STIX indicators matching observed data. You won’t often observe a CIDR block, except maybe in the case of “this AS is assigned this CIDR block”, which could be expressed as a pseudo pattern “find me a AS whose assigned IPs contains 1.2.3.4”. But I don’t think the current autonomous-system object supports this data.

    The first standard use for this is black lists: is this IP part of a C&C network?

    The second is things like: does the interface of this machine have this IP on its local subnet?

    network-interface:network ISSUPERSET '192.168.1.5'

    --
    John-Mark
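The distinction drawn in this message can be illustrated with a minimal Python sketch. It is not taken from the Patterning specification: the function names `in_list`, `in_subnet`, and `last_octet_is_255` are hypothetical, and the standard-library `ipaddress` module stands in for whatever a real pattern matcher would use. It shows IN read as a chain of equality tests, subnet containment as a separate check, and the X.X.X.255 case that a single CIDR prefix cannot describe.

```python
import ipaddress


def in_list(value, candidates):
    # IN as described in the thread: shorthand for
    # value == x OR value == y ... (plain equality, no CIDR awareness).
    return any(value == candidate for candidate in candidates)


def in_subnet(value, cidr):
    # Subnet containment: is the address inside the CIDR block?
    return ipaddress.ip_address(value) in ipaddress.ip_network(cidr)


def last_octet_is_255(value):
    # A bit-pattern match (X.X.X.255) that one CIDR prefix cannot express,
    # because the fixed bits sit at the low end of the address.
    return (int(ipaddress.ip_address(value)) & 0xFF) == 0xFF


if __name__ == "__main__":
    print(in_list("192.168.10.7", ["192.168.10.0/24"]))
    print(in_subnet("192.168.10.7", "192.168.10.0/24"))
    print(last_octet_is_255("10.1.2.255"))
```

Under the equality reading, `in_list` never matches an address against a CIDR string, which is why reusing IN for subnets would change its behavior.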


  • 12.  RE: [cti] RE: Patterning Operators - CONTAINS

    Posted 10-28-2016 19:00
    Comments inline.

    > -----Original Message-----
    > From: John-Mark Gurney [ mailto:jmg@newcontext.com ]
    > Sent: Thursday, October 27, 2016 6:41 PM
    > To: Back, Greg <gback@mitre.org>
    > Cc: Kirillov, Ivan A. <ikirillov@mitre.org>; cti@lists.oasis-open.org
    > Subject: Re: [cti] RE: Patterning Operators - CONTAINS
    > ...
    > Forgot to address this issue in my email. IN is not the same as CONTAINS. a
    > IN (x, y, ...) is short hande for a == x OR a == Y ..., and so not a proper set
    > operation.

    How is IN not (semantically) equivalent to a set operation, assuming we are talking about finite sets (even large finite sets like a /8 netblock)?

    > This means that we cannot repurpose IN w/o having confusing change of
    > behavior of the IN operator. As Jason pointed out, Oracle uses a new
    > operator to handle IP subset identification.

    I'm fine with having a special operator for IP subnet containment; I just don't think it should be a generic "subset" operator, at least until we identify more use cases for it. Re-using IN has its pros and cons, and subnet containment is a special case either way.

    > The first standard use for this is black lists. Is this IP part of a C&C network.
    >
    > The second are things like, does the interface of this machine have this IP on
    > it's local subnet:
    > network-interface:network ISSUPERSET '192.168.1.5'

    We don't currently define a network-interface object (that I can find), so at this point I'm still not convinced it's needed. That said, I'm not opposed to a (better-named) ISSUPERSET operator, since implementing it will likely be easy once the converse ISSUBSET (again, better-named) operator is.

    Greg
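The question raised here, whether IN over a finite set is semantically the same as subnet containment, can be checked on a small block. The following Python sketch is only an illustration under stated assumptions: `NETWORK`, `ENUMERATED`, `in_enumerated`, `in_cidr`, and `agree` are hypothetical names, addresses are compared in their canonical dotted-quad string form, and a /24 is used so the enumeration stays small (a /8 would need 16,777,216 entries).

```python
import ipaddress

# The CIDR block and the same block written out as an explicit finite set.
NETWORK = ipaddress.ip_network("192.168.10.0/24")
ENUMERATED = frozenset(str(addr) for addr in NETWORK)


def in_enumerated(value):
    # IN against a literal set: true if the value equals any listed member.
    return value in ENUMERATED


def in_cidr(value):
    # Dedicated subnet containment check against the CIDR block.
    return ipaddress.ip_address(value) in NETWORK


def agree(probes):
    # True if both readings give the same answer for every probe address.
    return all(in_enumerated(p) == in_cidr(p) for p in probes)


if __name__ == "__main__":
    print(len(ENUMERATED))
    print(agree(["192.168.10.0", "192.168.10.255", "192.168.11.1"]))
```

For canonical address strings the two checks agree, which supports the semantic point; the practical difference is that the CIDR form avoids writing out every member.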


  • 13.  Re: [cti] Patterning Operators - CONTAINS

    Posted 10-27-2016 23:32
    After a call w/ people this morning, there are two options (technically three) to move this forward.

    After discussion, we would like to remove the requirement that the LHS always be an Object Path and the RHS always be a constant, but we feel that it is too late to make this change, and we can change this in the future w/o breaking backward compatibility. This change could possibly happen in 2.1 when the operators are expanded.

    1) Keep the current convention of having the Object Path on the LHS, and constant on RHS, and replace the CONTAINS operator w/ ISSUBSET and ISSUPERSET. Both operators are required to support current known use cases.

    2a) We use CONTAINS, but make it support Object Path or Constants on either side. This is how things are currently specified.

    2b) We rename CONTAINS to ISSUBSET, but keep the current definition of Object Path and Constants on either side.

    Sorry if things are a bit short, I've been busy today and wanted to get this email out before I leave for the day. Feel free to ask any questions about the above proposals.

    Thanks for your consideration.

    --
    John-Mark
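Option 1 can be sketched in Python as two operators that always take the observed value (the Object Path side) first and the constant second. This is an assumption-laden illustration, not the specified semantics: `issubset`, `issuperset`, `evaluate`, and the `observed` dictionary are hypothetical, `a ISSUBSET b` is read as "a lies within b" to match the example `ipv4-addr:value ISSUBSET '192.168.10.0/8'`, and `network-interface:network` is the object path used in the thread, which the thread itself notes is not currently defined.

```python
import ipaddress


def _net(value):
    # strict=False tolerates host bits, e.g. '192.168.10.0/8' from the thread.
    # A bare address becomes a single-address network (/32 for IPv4).
    return ipaddress.ip_network(value, strict=False)


def issubset(observed, constant):
    # observed ISSUBSET constant: the observed value lies within the constant.
    return _net(observed).subnet_of(_net(constant))


def issuperset(observed, constant):
    # observed ISSUPERSET constant: the constant lies within the observed value.
    return _net(observed).supernet_of(_net(constant))


OPERATORS = {"ISSUBSET": issubset, "ISSUPERSET": issuperset}


def evaluate(observed, path, operator, constant):
    # Object Path on the left-hand side, constant on the right-hand side.
    return OPERATORS[operator](observed[path], constant)


if __name__ == "__main__":
    observed = {
        "ipv4-addr:value": "192.168.10.7",
        "network-interface:network": "192.168.1.0/24",  # hypothetical object
    }
    print(evaluate(observed, "ipv4-addr:value", "ISSUBSET", "192.168.10.0/8"))
    print(evaluate(observed, "network-interface:network", "ISSUPERSET", "192.168.1.5"))
```

With the path fixed on the left, the blacklist case needs ISSUBSET and the local-subnet case needs ISSUPERSET, which is why option 1 asks for both operators.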


  • 14.  RE: [cti] Patterning Operators - CONTAINS

    Posted 10-28-2016 18:52
    I support option 1.

    I don't like the names ISSUBSET and ISSUPERSET, due to the difference between set containment (∈ / U+2208) and subset relationships (⊂ / U+2282, ⊃ / U+2283, ⊆ / U+2286, ⊇ / U+2287). While I don't think we need strict mathematical formality, I would prefer CONTAINS/CONTAINEDIN, HASMEMBER/MEMBEROF, HASITEM/ELEMENTOF or any number of other terms.

    ======

    We don't (I don't think) have a way to express that a pattern is equal to some list (there is no list constant defined in the spec), but we do have the [*] notation for matching "any item" from a list. So you can't say:

    [network-traffic:protocols = ('IP', 'TCP', 'HTTP')]

    But you can say:

    [network-traffic:protocols[*] = 'IP' AND network-traffic:protocols[*] = 'TCP' AND network-traffic:protocols[*] = 'HTTP']

    Which is slightly less rigorous (it would also match {'protocols': ['HTTP', 'IP', 'TCP']}); in practice I don't see this as a problem.

    We *could* use ISSUPERSET to do something like:

    [network-traffic:protocols ISSUPERSET ('IP', 'TCP', 'HTTP')]

    Or, if you want to ensure the sets are equal (irrespective of order, and if you don't require *proper* subsets and supersets, which we shouldn't):

    [network-traffic:protocols ISSUPERSET ('IP', 'TCP', 'HTTP') AND network-traffic:protocols ISSUBSET ('IP', 'TCP', 'HTTP')]

    ======

    To be clear, I'm not saying we *should* do that, but rather we *could*. I don't think we should, at least for this release. My point is that "superset" and "subset" should not be used to match a field that has a single value.

    Greg
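The list-matching idea in this message maps directly onto ordinary set operations. The Python sketch below is a hedged illustration only: `any_item_equals`, `matches_with_star`, `is_superset`, `is_subset`, and `sets_equal` are hypothetical helpers, and the non-proper (⊆ / ⊇) reading of subset and superset is assumed, as the message suggests.

```python
WANTED = ("IP", "TCP", "HTTP")


def any_item_equals(items, value):
    # [*] semantics: true if any element of the list equals the value.
    return any(item == value for item in items)


def matches_with_star(protocols):
    # protocols[*] = 'IP' AND protocols[*] = 'TCP' AND protocols[*] = 'HTTP'
    return all(any_item_equals(protocols, wanted) for wanted in WANTED)


def is_superset(protocols, wanted=WANTED):
    # Non-proper superset: every wanted value appears in the observed list.
    return set(protocols) >= set(wanted)


def is_subset(protocols, wanted=WANTED):
    # Non-proper subset: the observed list has no values beyond the wanted ones.
    return set(protocols) <= set(wanted)


def sets_equal(protocols, wanted=WANTED):
    # ISSUPERSET AND ISSUBSET together: equal as sets, irrespective of order.
    return is_superset(protocols, wanted) and is_subset(protocols, wanted)


if __name__ == "__main__":
    print(matches_with_star(["HTTP", "IP", "TCP"]))
    print(matches_with_star(["IP", "TCP", "HTTP", "TLS"]))
    print(sets_equal(["IP", "TCP", "HTTP", "TLS"]))
```

The [*] form and ISSUPERSET accept the same lists, including ones with extra entries; only the combined ISSUPERSET and ISSUBSET check rejects a list with an additional protocol.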