Hi all,
In order to isolate individual issues of the multiple and hierarchical resource profiles, I'd like to raise the following question (which I hope can be addressed separately):
Does it make sense to evaluate the individual resource-id attribute values with the string-regexp-match function?
The example below tries to motivate why the string-regexp-match function could be a reasonable alternative to the xpath-node-equal function.
Assume you have the following decision request, whose resource content element contains 1,000,000 book objects.
<Request>
  …
  <Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:resource">
    <Content>
      <objects>
        <book id="1">
          <title>xxx</title>
          <author>Bob</author>
          <id>100</id>
          <price>30</price>
          <book-content>...</book-content>
        </book>
        <book id="2">
          …
        </book>
        …
        <book id="1000000">
          …
        </book>
      </objects>
    </Content>
    <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:resource:resource-id">
      <AttributeValue XPathCategory="urn:oasis:names:tc:xacml:3.0:attribute-category:resource"
                      DataType="urn:oasis:names:tc:xacml:3.0:data-type:xpathExpression">/objects</AttributeValue>
    </Attribute>
    <Attribute AttributeId="urn:oasis:names:tc:xacml:2.0:resource:scope">
      <AttributeValue XPathCategory="urn:oasis:names:tc:xacml:3.0:attribute-category:resource"
                      DataType="http://www.w3.org/2001/XMLSchema#string">Descendants</AttributeValue>
    </Attribute>
  </Attributes>
  …
</Request>
Since each of these simple book elements consists of 12 nodes (the book element itself, its id attribute, five child elements and their five text nodes), you will get 12 individual decision requests for each book.
With 1,000,000 book elements this results in 12,000,000 individual decision requests.
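The node count can be checked with a small sketch. It assumes that element, attribute and text nodes each produce one individual request and that whitespace between elements is ignored (the sample book is therefore written without indentation). The helper name count_nodes is only illustrative.

```python
from xml.dom import minidom

# One book from the example, written without whitespace between elements
# so that no extra whitespace-only text nodes are counted (assumption).
BOOK = ('<book id="1"><title>xxx</title><author>Bob</author>'
        '<id>100</id><price>30</price><book-content>...</book-content></book>')


def count_nodes(node):
    """Count element, attribute and text nodes in the subtree rooted at node."""
    total = 1  # the node itself
    if node.attributes is not None:  # only element nodes carry attributes
        total += node.attributes.length
    for child in node.childNodes:
        total += count_nodes(child)
    return total


book = minidom.parseString(BOOK).documentElement
nodes_per_book = count_nodes(book)
total_requests = nodes_per_book * 1_000_000
print(nodes_per_book, total_requests)
```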
A rule such as "permit access to book nodes" then implies 12,000,000 evaluations of the function that checks the individual resource-id attributes,
e.g. if you use the xpath-node-equal function:
xpath-node-equal(/objects[1], /objects/book)
xpath-node-equal(/objects[1]/book[1], /objects/book)
xpath-node-equal(/objects[1]/book[1]/@id, /objects/book)
xpath-node-equal(/objects[1]/book[1]/title[1], /objects/book)
xpath-node-equal(/objects[1]/book[1]/title[1]/text(), /objects/book)
…
xpath-node-equal(/objects[1]/book[2], /objects/book)
…
xpath-node-equal(/objects[1]/book[3], /objects/book)
…
xpath-node-equal(/objects[1]/book[1000000], /objects/book)
The question now is whether it is faster to evaluate
a) the xpath-node-equal function or
b) the string-regexp-match function
against the resource-id values of the derived individual decision requests.
Evaluating the xpath-node-equal function implies evaluating 12,000,001 XPath expressions (one per derived request, including the one for the /objects root) against the large (e.g. 2 GB) DOM representing the content.
The string-regexp-match function, by contrast, does not need to be evaluated against the XML resource. Furthermore, the regular expression has a very specific structure, e.g. string-regexp-match(/objects\[\d+\]/book\[\d+\], resource-id).
I am no expert in XPath evaluation algorithms, but it seems to me that string-regexp-match is the cheaper operation.
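As a rough illustration of the regular-expression variant, here is a minimal sketch that applies the pattern to a few resource-id strings in the form used in the list above. One caveat it makes visible: if the matching function treats the pattern as unanchored, the descendants of a book (its @id, title, text nodes) also match, so whether the pattern has to be anchored depends on the matching semantics of the function. The variable names are only illustrative.

```python
import re

# A few resource-id values in the (assumed) normal form from the example.
resource_ids = [
    "/objects[1]",
    "/objects[1]/book[1]",
    "/objects[1]/book[1]/@id",
    "/objects[1]/book[1]/title[1]",
    "/objects[1]/book[1]/title[1]/text()",
    "/objects[1]/book[2]",
]

BOOK_PATTERN = re.compile(r"/objects\[\d+\]/book\[\d+\]")

# Anchored: the whole resource-id must match, so only book elements qualify.
anchored = [rid for rid in resource_ids if BOOK_PATTERN.fullmatch(rid)]

# Unanchored: a match anywhere in the string is enough, so every node
# below a book element qualifies as well.
unanchored = [rid for rid in resource_ids if BOOK_PATTERN.search(rid)]

print(anchored)
print(unanchored)
```

No access to the content DOM is needed here; the check works purely on the resource-id string.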
I think a theoretical performance comparison between the string-regexp-match function and an XPath evaluation algorithm is the core issue we have to figure out.
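Until such an analysis exists, a toy measurement might at least give a first impression. The sketch below is not an XACML implementation: it emulates xpath-node-equal with the limited XPath subset of Python's xml.etree.ElementTree, covers element nodes only, and uses a tiny document. All function names are hypothetical, and the timings depend entirely on the XPath engine used, so they say nothing definitive about a real PDP.

```python
import re
import timeit
import xml.etree.ElementTree as ET

N_BOOKS = 200  # deliberately tiny compared with the 1,000,000 books in the example

root = ET.fromstring(
    "<objects>"
    + "".join(f'<book id="{i}"><title>xxx</title></book>' for i in range(1, N_BOOKS + 1))
    + "</objects>"
)

# Element-only resource-ids in the assumed normal form.
resource_ids = []
for i in range(1, N_BOOKS + 1):
    resource_ids.append(f"/objects[1]/book[{i}]")
    resource_ids.append(f"/objects[1]/book[{i}]/title[1]")

ROOT_PREFIX = "/objects[1]"
BOOK_PATTERN = re.compile(r"/objects\[\d+\]/book\[\d+\]")

# Node set selected by the policy expression /objects/book, evaluated once.
policy_nodes = {id(e) for e in root.findall("./book")}


def decide_by_xpath(resource_id):
    """Stand-in for xpath-node-equal: resolve the resource-id against the tree."""
    node = root.find("." + resource_id[len(ROOT_PREFIX):])
    return node is not None and id(node) in policy_nodes


def decide_by_regexp(resource_id):
    """Stand-in for an anchored string-regexp-match on the resource-id string."""
    return BOOK_PATTERN.fullmatch(resource_id) is not None


def time_both(repeats=3):
    """Return (seconds for XPath variant, seconds for regexp variant)."""
    t_xpath = timeit.timeit(lambda: [decide_by_xpath(r) for r in resource_ids], number=repeats)
    t_regexp = timeit.timeit(lambda: [decide_by_regexp(r) for r in resource_ids], number=repeats)
    return t_xpath, t_regexp


if __name__ == "__main__":
    print(time_both())
```

Both variants reach the same decisions on this data; only the cost of reaching them differs, and that difference is what would need to be analysed for realistic document sizes and engines.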
If string-regexp-match performs significantly better, then we should add this alternative to the profile. In that case we would of course have to agree on a normal form for the individual resource-id values and on standardised guidelines for dealing with the namespace problem.
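To make the normal-form idea concrete, here is one possible sketch that derives such resource-id strings from a DOM. It is only an assumption of what a normal form could look like, following the paths used in the example: every element step carries a positional index, attributes appear as /@name, and text nodes as /text() (which presumes at most one text node per element). Namespaces are ignored entirely, which is exactly the open problem mentioned above. The function name normal_form_paths is hypothetical.

```python
from xml.dom import minidom


def normal_form_paths(node, path=""):
    """Yield one path string per element, attribute and text node below node."""
    counters = {}  # position of each element among same-named siblings
    for child in node.childNodes:
        if child.nodeType == minidom.Node.ELEMENT_NODE:
            counters[child.tagName] = counters.get(child.tagName, 0) + 1
            child_path = f"{path}/{child.tagName}[{counters[child.tagName]}]"
            yield child_path
            for name in sorted(child.attributes.keys()):
                yield f"{child_path}/@{name}"
            yield from normal_form_paths(child, child_path)
        elif child.nodeType == minidom.Node.TEXT_NODE:
            # Assumes a single text node per element, as in the example.
            yield f"{path}/text()"


doc = minidom.parseString(
    '<objects><book id="1"><title>xxx</title></book><book id="2"/></objects>'
)
paths = list(normal_form_paths(doc))
for p in paths:
    print(p)
```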
Regards
Jan
________________________________________
Jan Herrmann
Dipl.-Inform., Dipl.-Geogr.
Research Associate
Technische Universität München
Institut für Informatik
Lehrstuhl für Angewandte Informatik / Kooperative Systeme
Boltzmannstr. 3
85748 Garching
Tel: +49 (0)89 289-18692
Fax: +49 (0)89 289-18657
Room:
www11.informatik.tu-muenchen.de
________________________________________