OASIS eXtensible Access Control Markup Language (XACML) TC


string-regexp-match function on individual resource-id values as an alternative to the xpath-node-equal/match function?

  • 1.  string-regexp-match function on individual resource-id values as an alternative to the xpath-node-equal/match function?

    Posted 10-15-2009 16:39

    Hi all,

    To isolate individual issues of the mult&hier resource profile, I'd like to raise the following question, which I hope can be addressed separately.

    Does it make sense to evaluate the individual resource-id attribute values through the string-regexp-match function?

    The example below tries to motivate why the string-regexp-match function could be a reasonable alternative to using the xpath-node-equal function.

    Assume you have the following decision request with a resource content element containing 1,000,000 book objects.

    <Request>
      <Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:resource">
        <Content>
          <objects>
            <book id="1">
              <title>xxx</title>
              <author>Bob</author>
              <id>100</id>
              <price>30</price>
              <book-content>...</book-content>
            </book>
            <book id="2">
              …
            </book>
            …
            <book id="1000000">
              …
            </book>
          </objects>
        </Content>
        <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:resource:resource-id">
          <AttributeValue XPathCategory="urn:oasis:names:tc:xacml:3.0:attribute-category:resource" DataType="urn:oasis:names:tc:xacml:3.0:data-type:xpathExpression">/objects</AttributeValue>
        </Attribute>
        <Attribute AttributeId="urn:oasis:names:tc:xacml:2.0:resource:scope">
          <AttributeValue XPathCategory="urn:oasis:names:tc:xacml:3.0:attribute-category:resource" DataType="http://www.w3.org/2001/XMLSchema#string">Descendants</AttributeValue>
        </Attribute>
      </Attributes>
      …
    </Request>

    Since each of these simple book elements consists of 12 nodes, you will get 12 individual decision requests for each book.

    Having 1,000,000 book elements will result in 12,000,000 individual decision requests.
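
    The arithmetic above can be checked with a short Python sketch. It assumes that element, attribute, and text nodes are counted and that whitespace-only text (indentation) is ignored; the post does not spell out the counting rule, so `count_nodes` is only an illustration.

```python
import xml.etree.ElementTree as ET

# The sample book element from the request, with the id attribute quoted.
BOOK = """<book id="1">
  <title>xxx</title>
  <author>Bob</author>
  <id>100</id>
  <price>30</price>
  <book-content>...</book-content>
</book>"""

def count_nodes(elem):
    """Count element, attribute, and non-whitespace text nodes under elem."""
    total = 1 + len(elem.attrib)          # the element itself plus its attributes
    if elem.text and elem.text.strip():   # a text child (assumption: skip indentation)
        total += 1
    for child in elem:
        total += count_nodes(child)
    return total

nodes_per_book = count_nodes(ET.fromstring(BOOK))
books = 1_000_000
print(nodes_per_book)          # nodes in one book element
print(nodes_per_book * books)  # individual decision requests for all books
```

    Under these assumptions one book yields 12 nodes, which matches the figure used in this post.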

    A rule such as "permit access to book nodes" then implies 12,000,000 evaluations of the function that checks the individual resource-id attributes.

    E.g. if you use the xpath-node-equal function:

    xpath-node-equal(/objects[1], /objects/book)
    xpath-node-equal(/objects[1]/book[1], /objects/book)
    xpath-node-equal(/objects[1]/book[1]/@id, /objects/book)
    xpath-node-equal(/objects[1]/book[1]/title[1], /objects/book)
    xpath-node-equal(/objects[1]/book[1]/title[1]/text(), /objects/book)
    …
    xpath-node-equal(/objects[1]/book[2], /objects/book)
    …
    xpath-node-equal(/objects[1]/book[3], /objects/book)
    …
    xpath-node-equal(/objects[1]/book[1000000], /objects/book)

    The question now is whether it is faster to use the

    a) xpath-node-equal function or the

    b) string-regexp-match function

    against the resource-id values of the derived individual decision requests.

    Evaluating the xpath-node-equal function implies evaluating 12,000,001 XPath expressions against the large (e.g. 2GB) DOM representing the content.

    The string-regexp-match function, in contrast, does not need to be evaluated against the XML resource. Furthermore, the regular expression has a very specific structure, e.g. string-regexp-match(resource-id, /objects\[\d+\]/book\[\d+\]).
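
    As a rough illustration, the pattern quoted above can be applied to plain resource-id strings without touching the XML content at all. The sketch below uses Python's `re` module rather than an XACML engine, and it anchors the pattern with `fullmatch`, which is an assumption made here so that descendants of a book do not match; `is_book_node` is a hypothetical helper name.

```python
import re

# Pattern from the post: resource-ids that point at a book element.
BOOK_ID = re.compile(r"/objects\[\d+\]/book\[\d+\]")

def is_book_node(resource_id: str) -> bool:
    # fullmatch (assumption): .../book[1]/title[1] must not count as a book node
    return BOOK_ID.fullmatch(resource_id) is not None

resource_ids = [
    "/objects[1]",
    "/objects[1]/book[1]",
    "/objects[1]/book[1]/@id",
    "/objects[1]/book[1]/title[1]",
    "/objects[1]/book[1]/title[1]/text()",
    "/objects[1]/book[1000000]",
]
matches = [rid for rid in resource_ids if is_book_node(rid)]
print(matches)
```

    Only the two book paths are selected. The check is a pure string operation, so its cost depends on the length of the resource-id and not on the size of the document.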

    I am no expert in XPath evaluation algorithms, but it seems to me that string-regexp-match is the cheaper operation.

    I think a theoretical performance comparison between the string-regexp-match function and an XPath evaluation algorithm is the core issue we have to figure out.

    If string-regexp-match performs significantly better, then we should add this alternative to the profile. In that case we would of course have to agree on a normal form for the individual resource-id values and on standardised guidelines for dealing with the namespace problem.

    Regards

    Jan

    ________________________________________

    Jan Herrmann
    Dipl.-Inform., Dipl.-Geogr. 

    Research Associate

    Technische Universität München
    Institut für Informatik

    Lehrstuhl für Angewandte Informatik / Kooperative Systeme

    Boltzmannstr. 3
    85748 Garching

    Tel:      +49 (0)89 289-18692
    Fax:     +49 (0)89 289-18657

    Room:
    www11.informatik.tu-muenchen.de
    ________________________________________



  • 2.  RE: [xacml] string-regexp-match function on individual resource-id values as an alternative to the xpath-node-equal/match function?

    Posted 10-15-2009 17:56
    If you really want 12M Result elements, the PDP is going to have to do something 12M times.  If it's not very well optimized, it's likely to do things 12M x n times.
     
    Assume it's highly optimized so it just walks the DOM once.  At each node it will:
     
    1. Generate a resource-id for that node
    2. Apply rule(s) on that resource-id (or node).  Either
        a. apply regexp match; OR
        b. apply some xpath expression
    3. Put a Result element in the response context
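
    The three steps above could be sketched in Python roughly as follows. This is a simplified illustration and not a PDP: it visits element nodes only (attributes and text are skipped), uses the regexp variant 2a with an anchored match, and the names `walk` and `results` as well as the tiny two-book document are made up for the example.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

BOOK_ID = re.compile(r"/objects\[\d+\]/book\[\d+\]")

def walk(elem, path, results):
    """Visit elem and its descendants once, recording one decision per node."""
    # Steps 2a and 3: apply the rule to the resource-id, then store the result.
    results[path] = "Permit" if BOOK_ID.fullmatch(path) else "NotApplicable"
    seen = Counter()  # position counter per child tag under this parent
    for child in elem:
        seen[child.tag] += 1
        # Step 1: derive the child's resource-id from the parent's.
        walk(child, f"{path}/{child.tag}[{seen[child.tag]}]", results)

doc = ET.fromstring(
    "<objects><book><title>a</title></book><book><title>b</title></book></objects>"
)
results = {}
walk(doc, "/objects[1]", results)
for resource_id, decision in results.items():
    print(resource_id, decision)
```

    Each node is touched exactly once, and the rule sees only the generated resource-id string, so the work grows linearly with the number of nodes.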
     
    I'm not sure what you want to test at each node, but with regexp all you do is test the position and ancestors of each node.  This would be problematic (and probably expensive) to do with current XACML xpath capabilities. But if AttributeSelector allowed you to set xpath evaluation context at the decision node, it would be a simple matter to test position and ancestor list (or any other information from any node in the document). I do not believe a regexp test gives much, if any, advantage here.  And, you must admit, this is an extreme use case that would most likely occur in batch (non-interactive) mode, so performance is less critical.
     
    Since you are not testing any content, maybe you really just want to know something like "Can Jones see the titles of all books in the catalog?", "Can Jones see the ids?", "Can Jones see the prices?".  In that case you could just identify the resource type (element name), perhaps by using the fully qualified type name from the schema, like:
     
        resource-id=http://example.com/myBookSchema#book (or #title, #id, etc.)
     
    No XML content is required to answer this sort of request.
     
    Regards,
    --Paul





  • 3.  AW: [xacml] string-regexp-match function on individual resource-id values as an alternative to the xpath-node-equal/match function?

    Posted 10-16-2009 16:22


  • 4.  Re: AW: [xacml] string-regexp-match function on individual resource-id values as an alternative to the xpath-node-equal/match function?

    Posted 10-22-2009 12:24
    Hi Jan,
    
    See a small comment inline, on a point whose history I know.
    
    Jan Herrmann wrote:
    >
    > Hi Paul,
    >
    > To start, I have a few small questions concerning the processing you 
    > described.
    >
    > In case 2 you will always have to generate a resource-id value for 
    > each node first (e.g. /objects[1]/book[1]).
    >
    > Then in sub case 2a you use this string in the string-regexp-match 
    > function.
    >
    > What happens exactly in sub case 2b?
    >
    > xpath-node-equal(/objects[1]/book[1], /Objects/book)
    >
    > While you go through each of the 12M resource-id values (more 
    > precisely: nodes in this case), do you always have to check if the 
    > node is equal to one of the 1M book nodes selected by the second 
    > parameter?
    >
    > Further, I am wondering what the section in the spec (line 4908ff, see 
    > below) actually means:
    >
    > “The function [xpath-node-equal] SHALL return "True" if any of the XML 
    > nodes in the node-set matched by the first argument equals any of the 
    > XML nodes in the node-set matched by the second argument. Two nodes 
    > are considered equal if they have the same identity.”
    >
    > How is identity defined?
    >
    
    In general, the XACML TC are a bunch of lazy people who like to 
    reference other specifications so we can reuse the results of other 
    people. ;-) For defining the xpath matching functions XACML 2.0 referred 
    to a draft specification of XPath 2.0. XPath 2.0 finalized their spec 
    during the course of XACML 3.0 development, so we updated our references 
    to the final XPath 2.0 specification. However, the final XPath 2.0 
    specification did not contain the node equality function which XACML 2.0 
    referred to. In this case I copied the text from the old XPath 2.0 
    working draft into the XACML 3.0 specification. So "... if they have the 
    same identity" comes from a draft spec of XPath 2.0. I could not find 
    anything better so I left it unchanged by copying in the text.
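
    One possible reading of "same identity" is sketched below in Python. It assumes that identity means the very same node object in the parsed tree, which the copied text does not state explicitly, and it uses ElementTree path lookups as a stand-in for XPath evaluation; `node_equal` is a hypothetical helper, not an XACML function.

```python
import xml.etree.ElementTree as ET

def node_equal(first, second):
    """True if any node in `first` is the very same node as one in `second`."""
    second_ids = {id(node) for node in second}
    return any(id(node) in second_ids for node in first)

doc = ET.fromstring(
    "<objects><book><title>x</title></book><book><title>x</title></book></objects>"
)
books = doc.findall("book")          # stand-in for the node-set /objects/book
first_book = [doc.find("book[1]")]   # stand-in for /objects[1]/book[1]
titles = doc.findall("book/title")   # two title nodes with identical content

print(node_equal(first_book, books))         # same node selected twice
print(node_equal(titles, books))             # titles are not book nodes
print(node_equal([titles[0]], [titles[1]]))  # equal content, different nodes
```

    Under this reading, two nodes with identical content are still different nodes, so only the first comparison is true.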
    
    > Further you asked in your mail what I want to test at each node. Well 
    > that depends on the needed access control semantics. For example you 
    > could
    >
    > 1: permit access to a certain book node if 
    > Selector(resource-id+/author) = subject-id
    >
    > 2: deny access to a certain book node if 
    > (Selector(resource-id+/author) != subject-id) AND 
    > (Selector(resource-id+/price) > 100)
    >
    > or
    >
    > 3: deny access to the book-content element of a certain book if 
    > (Selector(resource-id+/author) != subject-id) AND 
    > (Selector(resource-id+/price) > 100)
    >
    > …
    >
    > I hope the examples show that combining individual resource-id 
    > values with their use as the base of the selector's 
    > RequestContextPath allows for very flexible access semantics.
    >
    > Furthermore, having a decision for each individual node allows the 
    > individual nodes to be filtered out.
    >
    > E.g. if only book No. 12345 is access-restricted because of rule 2, 
    > then your PEP can use the resource-id value in the decision response 
    > to filter out that one restricted book node with a simple XSLT.
    >
    > You also asked whether I could use schema-based restrictions. As 
    > the examples above show, this is not possible because we mainly have 
    > content-dependent restrictions.
    >
    > As we use XACML to protect web services, performance is critical. Of 
    > course the example was huge, but not unrealistic.
    >
    > Of course you could do the access control on the web-service-request. 
    > E.g. assume
    >
    > you had the following web-service-request that resulted in the set of 
    > 1M books:
    >
    > 
    >
    > Enforcing the rights above while doing pre-processing access control 
    > (i.e. access control on the web-service-request) implies that you have 
    > rules e.g.
    >
    >