This is right. And it really does illustrate what an odd corner case this is. You have to use a uriBaseId to construct a reference for this because the otherwise identical absolute URLs don’t comprise unique keys for the files table.
Btw, this example sheds light on the need for some general guidance on how to handle embedded file contents. In this corner case, you absolutely must prefer the embedded content because it might not exist anywhere else (i.e., it was never in source control, or it was on disk previously but was subsequently overwritten).
But does it make sense as a rule to tell people to prefer the embedded file contents if they exist? The more I think about it, the more I think it does. In cases where you are certain to have access to files properly matched to your SARIF results (such as when you have just completed a local analysis, or when your absolute URLs are versioned and point to accessible copies of the file), producers should not generate the embedded file contents. As a rule, injecting the file contents is a post-processing step that is performed explicitly because the log file is being prepared for ingestion into a results management system (attached to a work item, persisted to a common remote store, etc.).
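
To make the "prefer embedded contents" rule concrete, here is a rough consumer-side sketch. It is only an illustration under assumptions: the files table is modeled as a plain dictionary, the property names "uri" and "contents" (base64-encoded text) are simplified stand-ins rather than the exact schema, and the keys "v1/a.c" and "v2/a.c" and the function resolve_file_text are hypothetical.

```python
import base64
from typing import Callable, Optional


def resolve_file_text(
    file_key: str,
    files: dict,
    fetch: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return the text for file_key, preferring contents embedded in the log.

    `files` is a simplified stand-in for the run.files dictionary. The
    property names "uri" and "contents" are assumptions for illustration.
    """
    entry = files.get(file_key)
    if entry is None:
        return None

    embedded = entry.get("contents")  # assumed: base64-encoded UTF-8 text
    if embedded is not None:
        # Embedded contents win: they may not exist anywhere else.
        return base64.b64decode(embedded).decode("utf-8")

    # Only fall back to the URI when nothing was embedded.
    uri = entry.get("uri")
    if uri is None or fetch is None:
        return None
    return fetch(uri)


# Two versions of the same file share one absolute URL, so they need
# distinct (hypothetical) keys; only the overwritten version is embedded.
files = {
    "v1/a.c": {
        "uri": "file:///src/a.c",
        "contents": base64.b64encode(b"old text").decode("ascii"),
    },
    "v2/a.c": {"uri": "file:///src/a.c"},
}
```

With this shape, looking up "v1/a.c" yields the embedded (overwritten) text without touching the URL, while "v2/a.c" falls through to whatever fetch callback the consumer supplies.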
Michael
From: sarif@lists.oasis-open.org <sarif@lists.oasis-open.org> On Behalf Of Larry Golding (Comcast)
Sent: Tuesday, April 10, 2018 9:41 AM
To: Michael Fanning <Michael.Fanning@microsoft.com>; 'James A. Kupsch' <kupsch@cs.wisc.edu>; sarif@lists.oasis-open.org
Subject: [sarif] RE: Please comment on #125
I see! A SARIF producer enables consumers to access previous versions of an overwritten file not just by mentioning each version in the run.files dictionary, but by persisting their contents there. It seems so obvious now