OASIS Cyber Threat Intelligence (CTI) TC

  • 1.  Re: [cti] TAXII Pagination

    Posted 10-04-2019 19:50




    Thanks Bret,
     
    I like this approach and believe it's a small, flexible change that lets a client consume data in whatever page sizes they want, regardless of what backend the target server uses.

     
    Looking forward to feedback from others.
     
    Thanks,
     

    From: <cti@lists.oasis-open.org> on behalf of Bret Jordan <Bret_Jordan@symantec.com>
    Date: Friday, October 4, 2019 at 15:16
    To: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Subject: [cti] TAXII Pagination


     


    All,


     


    In TAXII 2.1 we have a pretty good pagination solution, but it suffers from a known issue when multiple records have the same date_added value. We originally
    tried to address this by saying that the date_added value MUST have microsecond-level precision, but that is not sufficient for some.


     


    As such, I have been working with LookingGlass on a potential solution that requires the fewest changes to make this work.  After many back-and-forth versions, I think we have something
    that might work.  Please review.


     


     


    TAXII Pagination Proposal


     


    To keep things simple, for mental visualization, we will be defining the scenarios in terms of small numbers.  But one must realize that in production, these numbers will be many orders of magnitude
    larger.


     


    1 Fundamental Design Goals


    Completely stateless for the server in the true RESTful sense



    Simple way for clients to start synchronization after some point in time, without having to sync the entire collection.


    Example: A collection may have billions of records in it going back 10 years. But a client really only cares about syncing or getting data from the past 6 months.



    Need ability to paginate records where every record has its own date_added value


    Need ability to paginate records where many records may have the same date_added value


     


    2 Proposed Solution Summary




    Add a single optional property called "next" (type: string) to the TAXII Envelope
    Add a URL parameter called "next"


     


    3 Scenario



     


    The collection has 200 indicator records; however, the first 100 records all have the same date_added timestamp.


     


    3.1 Problem


    Our current method breaks if and only if the client has a limit of less than 100 or the server artificially limits the records to fewer than 100. Under this condition the client will not get all
    of the records or will have an inconsistent experience.
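
To make the failure concrete, here is a minimal Python sketch of the scenario. It is only an illustration under assumptions that are not part of the proposal: an in-memory list stands in for the collection, `fetch_page` and `sync_by_timestamp` are hypothetical helpers, added_after is treated as a strict "greater than" filter, and the client resumes each request from the last date_added it received.

```python
from datetime import datetime, timedelta, timezone

BASE = datetime(2019, 1, 1, tzinfo=timezone.utc)

# 200 records: the first 100 share one date_added, the rest are distinct.
records = [
    {"id": i, "date_added": BASE if i < 100 else BASE + timedelta(seconds=i)}
    for i in range(200)
]


def fetch_page(added_after, limit):
    """Hypothetical server call: up to `limit` records newer than `added_after`."""
    matches = [r for r in records if r["date_added"] > added_after]
    return matches[:limit]


def sync_by_timestamp(start, limit):
    """Hypothetical client loop that pages using only added_after."""
    seen = []
    added_after = start
    while True:
        page = fetch_page(added_after, limit)
        if not page:
            return seen
        seen.extend(page)
        # Resume from the last timestamp seen. Every other record that shares
        # this timestamp is now filtered out by the next request.
        added_after = page[-1]["date_added"]


received = sync_by_timestamp(BASE - timedelta(days=1), limit=20)
missing = sorted(set(range(200)) - {r["id"] for r in received})
print(len(received), len(missing))  # 120 80
```

With a limit of 20, the client receives the first 20 records of the shared-timestamp group and then skips the other 80, because none of them is strictly newer than the timestamp it resumes from.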


     


    3.2 Example Initial Request From Client


    ? added_after=2010-01-01T01:01:01.123456Z & limit=20


     


    3.3 Server Processes Query Request


    The server queries the datastore for records that match the rest of the request, with a record limit of 21 (the client-provided or server-imposed limit value + 1).


     




    The server checks results to see if there are 21 records returned.




    If NO then there are no more records that match the query and the TAXII server can send the results in a TAXII envelope to the client





    TAXII Envelope "more" property set to "false"
    TAXII Envelope "next" property is left empty





    If YES then there are more records and the server would respond with the following





    TAXII Envelope "more" property set to "true"
    TAXII Envelope "next" property set to a string value. For a relational database this could be the index autoID, for Elasticsearch it could be the scroll ID, for other systems it could be a cursor ID, or it could be any string
    (or an int represented as a string), depending on the requirements of the server and the black magic it is doing in the background. The key is that it is something the server knows how to process, and the client only needs to send it back to
    the server in the next request to get more data.
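
The server-side steps above could be sketched roughly as follows. This is a sketch under stated assumptions, not a reference implementation: the datastore is an in-memory list, `build_envelope` and the `auto_id` field are hypothetical names, the "next" value is simply the auto_id of the last record sent (the relational-database option mentioned above), and auto_id order is assumed to agree with date_added order.

```python
from datetime import datetime, timedelta, timezone


def build_envelope(records, added_after, limit, next_token=None):
    """Return one page as a dict shaped like a TAXII Envelope.

    `records` must be sorted by the hypothetical `auto_id` field.
    """
    # The token is the auto_id of the last record already sent; -1 means start.
    last_sent = int(next_token) if next_token is not None else -1
    matches = [
        r for r in records
        if r["date_added"] > added_after and r["auto_id"] > last_sent
    ]
    # Ask for limit + 1 records: the extra one only signals that more exist.
    window = matches[: limit + 1]
    page = window[:limit]
    envelope = {"more": len(window) > limit, "objects": page}
    if envelope["more"]:
        envelope["next"] = str(page[-1]["auto_id"])
    return envelope


# The scenario from section 3: 200 records, the first 100 share a timestamp.
SHARED = datetime(2019, 1, 1, tzinfo=timezone.utc)
records = [
    {"auto_id": i,
     "date_added": SHARED if i < 100 else SHARED + timedelta(seconds=i)}
    for i in range(200)
]
start = datetime(2010, 1, 1, 1, 1, 1, 123456, tzinfo=timezone.utc)

received, token = [], None
while True:
    env = build_envelope(records, start, 20, token)
    received.extend(env["objects"])
    if not env["more"]:
        break
    token = env["next"]

print(len(received))  # 200
```

Because the token carries everything the server needs to resume, no per-client state is kept between requests, which is in line with the stateless design goal.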




     


    3.4 Example Follow On Request From Client


    ? added_after=2010-01-01T01:01:01.123456Z & limit=20 & next=123456789
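
From the client's point of view, the whole exchange is a loop that echoes back whatever "next" value the server returned. The sketch below assumes a caller-supplied `fetch` function that takes a query string and returns the parsed envelope as a dict; `build_query`, `sync` and `fetch` are hypothetical names, not part of TAXII.

```python
from urllib.parse import urlencode


def build_query(added_after, limit, next_token=None):
    """Build the URL query string, adding "next" only when we have a token."""
    params = {"added_after": added_after, "limit": limit}
    if next_token is not None:
        params["next"] = next_token
    return "?" + urlencode(params)


def sync(fetch, added_after, limit):
    """Collect all objects by following "next" until "more" is false.

    `fetch` is a hypothetical callable: query string in, envelope dict out.
    """
    objects = []
    next_token = None
    while True:
        envelope = fetch(build_query(added_after, limit, next_token))
        objects.extend(envelope.get("objects", []))
        if not envelope.get("more"):
            return objects
        # The token is opaque to the client; it is sent back unchanged.
        next_token = envelope["next"]
```

The client never interprets the token, so a server is free to switch between an autoID, a scroll ID or a cursor ID without any client change.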


     


     


    If we can verify that this does solve the issue and is still easy to implement (I believe so), this is something that we could do for TAXII 2.1, if the TC agrees.  Yes, it would require another
    CSD and Public Review, but it would allow us to address this last known issue.


     


     


    Thoughts?


     


    Bret



     







  • 2.  Re: [cti] TAXII Pagination

    Posted 10-04-2019 19:57




    +1
     

    Allan Thomson
    CTO ( +1-408-331-6646)

    LookingGlass Cyber Solutions
     

    From: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org> on behalf of Matt Pladna <mpladna@lookingglasscyber.com>
    Date: Friday, October 4, 2019 at 12:50 PM
    To: Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
    Subject: Re: [cti] TAXII Pagination


     
