OASIS Static Analysis Results Interchange Format (SARIF) TC

  • 1.  Question about specifying file locations

    Posted 11-26-2018 17:33
    All (but especially Larry):

    We're collaborating with a partner whose tool produces SARIF. Their tool analyzes .jar files, but reports results in terms of source code locations. The question is how to produce valid SARIF in that situation. Right now they specify "file:///c:/.../foo.jar#/org/.../bar.java" (this is pre-11/14 SARIF), which implies to me that the bar.java file is expected to be explicitly textually embedded within the foo.jar file.

    Although one can put source files in .jar files in this fashion, I don't believe that's normal practice, and that's actually not where the .java files are in this particular case. The tool does not actually care where the source files are; I believe it is retrieving their names from debug information embedded within the .class files.

    I don't believe that this was the intended use of the mechanism for specifying nested files, although I do see how the spec could have been interpreted in this way.

    In my opinion, it would be more correct to specify the file location in a relative fashion. For the above example, that would be "org/.../bar.java", combined with the uriBaseId mechanism, one entry per .jar file, so that the fileLocation object for the above example could look something like this:

       {
           "uri": "org/.../bar.java",
           "uriBaseId": "FOO.JAR"
       }

    That is, each .jar file that is analyzed would give rise to a uriBaseId specifically for that .jar file.

    Does my alternative make better sense?

    Thanks,
    -Paul

    --
    Paul Anderson, VP of Engineering, GrammaTech, Inc.
    531 Esty St., Ithaca, NY 14850
    Tel: +1 607 273-7340 x118; http://www.grammatech.com
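
    To make the proposed rewrite concrete, here is a minimal Python sketch of how a producer might turn the old nested-file URI into the relative form with a per-.jar uriBaseId. The function name `to_file_location` and the rule of deriving the base ID by upper-casing the .jar file name are assumptions made for illustration; neither the spec nor the tool in question is known to work this way.

```python
import posixpath
from urllib.parse import urlsplit


def to_file_location(nested_uri: str) -> dict:
    """Convert an old-style 'jar-uri#/path/in/jar' URI into a relative fileLocation.

    Hypothetical helper: the uriBaseId is derived from the .jar file name,
    which is an assumption of this sketch, not something the spec mandates.
    """
    parts = urlsplit(nested_uri)
    if not parts.fragment:
        raise ValueError(f"no nested path in {nested_uri!r}")
    # The last path segment of the container URI is the .jar file name.
    jar_name = posixpath.basename(parts.path)
    return {
        # Drop the leading '/' so the URI is relative to the base.
        "uri": parts.fragment.lstrip("/"),
        "uriBaseId": jar_name.upper(),
    }


# The elided path segments ("...") are kept exactly as they appear in the post.
location = to_file_location("file:///c:/.../foo.jar#/org/.../bar.java")
print(location)
```

    Note that two .jar files with the same file name in different directories would collide under this naming rule, so a real producer would need some way to keep the base IDs unique.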


  • 2.  RE: [sarif] Question about specifying file locations

    Posted 11-26-2018 17:58




    You're right: the spec defines a nested file to be one that is physically embedded in its container:

        nested file
            file which is contained within another file

    The old nested file URI syntax (which is no longer needed now that run.files is an array) was always intended to express that physical containment.
     
    Yes, your idea of using the .jar file as a uriBaseId is a good one.
     
    Larry
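
    For readers wondering what a consumer does with such a location, the following Python sketch resolves a relative uri against the base registered for its uriBaseId. The mapping `BASE_URIS`, the directory `file:///src/foo/`, the path `org/example/Bar.java`, and the function `resolve` are all hypothetical and exist only for this illustration; the thread does not say where the sources actually live.

```python
from urllib.parse import urljoin

# Hypothetical mapping from each uriBaseId to the directory that holds the
# sources for that .jar; the directory URI is invented for illustration.
BASE_URIS = {
    "FOO.JAR": "file:///src/foo/",
}


def resolve(file_location: dict, base_uris: dict) -> str:
    """Return an absolute URI for a fileLocation.

    If the location has no uriBaseId, its uri is returned unchanged.
    """
    base_id = file_location.get("uriBaseId")
    if base_id is None:
        return file_location["uri"]
    base = base_uris[base_id]  # raises KeyError for an unknown base ID
    if not base.endswith("/"):
        # urljoin would otherwise replace the last path segment of the base.
        base += "/"
    return urljoin(base, file_location["uri"])


print(resolve({"uri": "org/example/Bar.java", "uriBaseId": "FOO.JAR"}, BASE_URIS))
```

    Because the base is looked up at consumption time, the same log file can be pointed at a different source tree simply by changing the mapping, which fits a tool that does not care where the sources are.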
     








  • 3.  Re: [sarif] Question about specifying file locations

    Posted 11-26-2018 18:03
    Thanks Larry!

    On 11/26/2018 12:57 PM, Larry Golding (Myriad Consulting Inc) wrote:
    > You're right: the spec defines a nested file to be one that is physically embedded in its container:
    >
    >     nested file
    >         file which is contained within another file
    >
    > The old nested file URI syntax (which is no longer needed now that run.files is an array) was always intended to express that physical containment.
    >
    > Yes, your idea of using the .jar file as a uriBaseId is a good one.
    >
    > Larry

    --
    Paul Anderson, VP of Engineering, GrammaTech, Inc.
    531 Esty St., Ithaca, NY 14850
    Tel: +1 607 273-7340 x118; http://www.grammatech.com