The "next" method is precisely what I was originally asking for a few months ago, so I agree with this approach. One question, however: with this solution, can one simply opt to use the "next" method and skip the time-based method entirely? That would be ideal.

- Jason Keirstead
Chief Architect - IBM Security Threat Management
www.ibm.com/security

"Would you like me to give you a formula for success? It's quite simple, really. Double your rate of failure." - Thomas J. Watson

From: "Vargas-Gonzalez, Emmanuelle" <emmanuelle@mitre.org>
To:
Allan Thomson <athomson@lookingglasscyber.com>, Matt Pladna <mpladna@lookingglasscyber.com>, Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Date: 10/09/2019 12:45 PM
Subject: [EXTERNAL] [cti] RE: [EXT] Re: [cti] TAXII Pagination
Sent by: <cti@lists.oasis-open.org>

All,

As I was reading this proposed solution for TAXII pagination, it occurred to me that a TAXII server currently has no way of advertising its self-imposed limit for pagination requests. If it did, a client could learn the limit ahead of time from the server's api_root resource. This is a different problem from the one originally raised in this thread, but it is related.

What I propose is adding a new property called max_limit; the details are below.

Property Name | Type | Description
title (required) | string | A human readable plain text name used to identify this API instance.
description (optional) | string | A human readable plain text description for this API Root.
versions (required) | list of type string | The list of TAXII versions that this API Root is compatible with. The values listed in this property MUST match the media types defined in Section 1.6.8.1 and MUST include the optional version parameter. A value of "application/taxii+json;version=2.1" MUST be included in this list to indicate conformance with this specification.
max_content_length (required) | integer | The maximum size of the request body in octets (8-bit bytes) that the server can support. The value of max_content_length MUST be a positive integer greater than zero. This applies to requests only and is determined by the server. Requests with total body length values smaller than this value MUST NOT result in an HTTP 413 (Request Entity Too Large) response. If, for example, the server supported 100 MB of data, the value for this property would be determined by 100*1024*1024, which equals 104,857,600. This property contains useful information for the client when it POSTs requests to the Add Objects endpoint.
max_limit (required) | integer | The maximum server-imposed limit for pagination requests. The value of max_limit MUST be a positive integer greater than zero. This applies only to pagination requests made to this API Root. Any request with a limit greater than max_limit will be overridden by the server's self-imposed limit.

Any thoughts?

- Emmanuelle

From:
cti@lists.oasis-open.org <cti@lists.oasis-open.org> On Behalf Of Allan Thomson
Sent: Friday, October 4, 2019 3:57 PM
To: Matt Pladna <mpladna@lookingglasscyber.com>; Bret Jordan <Bret_Jordan@symantec.com>; cti@lists.oasis-open.org
Subject: [EXT] Re: [cti] TAXII Pagination

+1

Allan Thomson
CTO (+1-408-331-6646)
LookingGlass Cyber Solutions

From: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
on behalf of Matt Pladna <mpladna@lookingglasscyber.com>
Date: Friday, October 4, 2019 at 12:50 PM
To: Bret Jordan <Bret_Jordan@symantec.com>, "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: Re: [cti] TAXII Pagination

Thanks Bret,

I like this approach and believe it's a small, flexible change that lets a client consume data in the page sizes it wants, regardless of what backend the target server uses.

Looking forward to feedback from others.

Thanks,

From: <cti@lists.oasis-open.org>
on behalf of Bret Jordan <Bret_Jordan@symantec.com>
Date: Friday, October 4, 2019 at 15:16
To: "cti@lists.oasis-open.org" <cti@lists.oasis-open.org>
Subject: [cti] TAXII Pagination

All,

In TAXII 2.1 we have a pretty good pagination solution, but it suffers from a known issue when multiple records have the same date added value. We originally tried to address this by saying that the date added value MUST have microsecond-level precision, but that is not sufficient for some. As such, I have been working with LookingGlass on a potential solution that requires the fewest changes to make this work. After many back-and-forth versions, I think we have something that might work. Please review.

TAXII Pagination Proposal

To keep things simple and easy to visualize, we define the scenarios in terms of small numbers. In production, these numbers will be many orders of magnitude larger.

1 Fundamental Design Goals

- Completely stateless for the server, in the true RESTful sense
- A simple way for clients to start synchronization after some point in time, without having to sync the entire collection.
  Example: A collection may have billions of records in it going back 10 years, but a client really only cares about syncing or getting data from the past 6 months.
- Need the ability to paginate records where every record has its own date_added value
- Need the ability to paginate records where many records may have the same date_added value

2 Proposed Solution Summary

- Add a single optional property called "next" (type: string) to the TAXII Envelope
- Add a URL parameter called "next"

3 Scenario

The collection has 200 indicator records; however, the first 100 records all have the same date_added timestamp.

3.1 Problem

Our current method breaks if and only if the client has a limit of less than 100 or the server artificially limits the records to less than 100. Under this condition the client will not get all of the records or will have an inconsistent experience.

3.2 Example Initial Request From Client

?added_after=2010-01-01T01:01:01.123456Z&limit=20

3.3 Server Processes Query Request

- The server queries the datastore for records that match the rest of the request, with a record limit of 21 (the client-provided or server-limited limit value + 1)
- The server checks the results to see whether 21 records were returned.
  - If NO, then there are no more records that match the query and the TAXII server can send the results in a TAXII envelope to the client
    - TAXII Envelope "more" property set to "false"
    - TAXII Envelope "next" property is left empty
  - If YES, then there are more records and the server would respond with the following
    - TAXII Envelope "more" property set to "true"
    - TAXII Envelope "next" property set to a string value. For a relational database this could be the index autoID, for Elasticsearch it could be the scroll ID, for other systems it could be a cursor ID, or it could be any string (or int represented as a string) depending on the requirements of the server and the black magic it is doing in the background. The key is that it is something that the server knows how to deal with and process, and the client only needs to send it back to the server in the next request to get more data.

3.4 Example Follow On Request From Client

?added_after=2010-01-01T01:01:01.123456Z&limit=20&next=123456789

If we can verify that this does solve the issue and is still easy to implement (I believe so), this is something that we could do for TAXII 2.1, if the TC agrees. Yes, it would require another CSD and Public Review, but it would allow us to address this last known issue.

Thoughts ????

Bret
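
For anyone who wants to try the idea out, the server steps in section 3.3 and the max_limit override suggested earlier in the thread could be sketched roughly as below. This is only an illustration under stated assumptions and is not taken from the TAXII specification or any implementation: the function names (get_page, fetch_all, build_scenario), the MAX_LIMIT value of 50, the in-memory list standing in for a datastore, and the choice of an auto-increment record id as the "next" value are all hypothetical. It also assumes date_added never decreases as the id grows and that all timestamps use the same fixed-width UTC format.

```python
from typing import Any, Dict, List, Optional

# Hypothetical cap standing in for the proposed max_limit property.
MAX_LIMIT = 50


def get_page(
    records: List[Dict[str, Any]],
    added_after: str,
    limit: Optional[int] = None,
    next_token: Optional[str] = None,
    max_limit: int = MAX_LIMIT,
) -> Dict[str, Any]:
    """Return one envelope-like dict for an in-memory record list.

    Assumes `records` is ordered by an auto-increment integer "id" and that
    "date_added" never decreases as "id" grows.
    """
    if limit is not None and limit < 1:
        raise ValueError("limit must be a positive integer")

    # A missing or oversized limit is overridden by the server's own cap.
    effective = max_limit if limit is None or limit > max_limit else limit

    # Timestamps share one fixed-width UTC format, so plain string
    # comparison orders them chronologically.
    matching = [r for r in records if r["date_added"] > added_after]

    # In this sketch the opaque token is the id of the last record sent.
    if next_token is not None:
        last_id = int(next_token)
        matching = [r for r in matching if r["id"] > last_id]

    # Ask for one record more than the page size to learn whether more exist.
    window = matching[: effective + 1]
    more = len(window) > effective
    page = window[:effective]

    envelope: Dict[str, Any] = {"more": more, "objects": page}
    if more:
        envelope["next"] = str(page[-1]["id"])
    return envelope


def fetch_all(
    records: List[Dict[str, Any]], added_after: str, limit: int
) -> List[Dict[str, Any]]:
    """Client loop: echo the "next" value back until "more" is false."""
    collected: List[Dict[str, Any]] = []
    next_token: Optional[str] = None
    while True:
        envelope = get_page(records, added_after, limit, next_token)
        collected.extend(envelope["objects"])
        if not envelope["more"]:
            return collected
        next_token = envelope["next"]


def build_scenario() -> List[Dict[str, Any]]:
    """200 records; the first 100 share one date_added timestamp."""
    records = []
    for i in range(1, 201):
        if i <= 100:
            ts = "2019-10-04T15:16:00.000000Z"
        else:
            ts = f"2019-10-04T15:16:00.{i:06d}Z"
        records.append({"id": i, "date_added": ts})
    return records
```

Calling fetch_all(build_scenario(), "2010-01-01T01:01:01.123456Z", 20) walks the section 3 scenario in pages of 20 and collects all 200 records, including the 100 that share a timestamp, because the resume point is the record id rather than date_added. A real server would push the filter and the limit + 1 into the datastore query instead of slicing a list.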