OASIS Static Analysis Results Interchange Format (SARIF) TC

RE: [External] - RE: Followup on code metrics

  • 1.  RE: [External] - RE: Followup on code metrics

    Posted 10-14-2021 14:50




    Got it, thank you.
     
    Yes, profiling data is granular and therefore extremely verbose.

     
I think there's an argument that code coverage puts us in the same position, since line-level coverage is a thing.
     
    MCF
     


    From: Paul Anderson <paul@grammatech.com>
    Sent: Tuesday, October 12, 2021 1:14 PM
    To: Michael Fanning <Michael.Fanning@microsoft.com>; sarif@lists.oasis-open.org
    Subject: RE: [External] - RE: Followup on code metrics


     
    Michael:
     
There's nothing concrete from the MISRA committee yet. There's a meeting in a couple of weeks, but most participants are involved with the coding-standards efforts too, which tend to suck up all the oxygen.
     
The metrics I mentioned were all static metrics. Dynamic metrics such as coverage would be useful to consider too, especially as we are thinking of SARIF as extending into the dynamic realm. One caveat is that a lot of dynamic metrics are most useful at very fine granularities. E.g., if you are profiling, then it's good to see where time is going for each line of code, whereas we had been thinking of these metrics as being associated with constructs no smaller than functions.
     
    -Paul
     

    --

    Paul Anderson, VP of Engineering, GrammaTech, Inc.
    531 Esty St., Ithaca, NY 14850
    Tel: +1 607 273-7340 x118;

    https://www.grammatech.com

     



From: Michael Fanning <Michael.Fanning@microsoft.com>

    Sent: Tuesday, October 12, 2021 2:13 PM
To: Paul Anderson <paul@grammatech.com>; sarif@lists.oasis-open.org
    Subject: [External] - RE: Followup on code metrics


     


     

    This is extremely helpful information, Paul, thanks for sending it.

We'd also discussed whether MISRA has any current thinking/proposals we could review. Is that the case?
     
Thanks again! In addition to the great stuff below, I think we should probably also mention condition/decision metrics for code coverage, likely to be a scenario we at least consider (coverage, I mean).
     
For completeness, we also discussed OMG's Knowledge Discovery Metamodel (KDM) on the call as related art.

     
Any standard that has "meta" in its name, of course, is likely to be a bit too abstract for our applied scenario. We are an interchange format with a focus on crisply defined, point-in-time quality data.

     
KDM is a common vocabulary of knowledge that can be shared between producers/consumers in an Application Lifecycle Management context. Ambitious! @David, I note that KDM has its own Wikipedia page, which might be a helpful comparison as you develop ours.
     
Knowledge Discovery Metamodel (KDM) (omg.org)
Knowledge Discovery Metamodel - Wikipedia
     
    MCF
     


From: sarif@lists.oasis-open.org <sarif@lists.oasis-open.org>
    On Behalf Of Paul Anderson
    Sent: Friday, October 1, 2021 11:20 AM
    To: sarif@lists.oasis-open.org
    Subject: [EXTERNAL] [sarif] Followup on code metrics


     
    All:
     
I'm following up on our meeting yesterday with some material on code metrics.
     
There are a couple of metric families that are commonly used. The first is simple line counts. It's common for tools to have several varieties, such as raw lines, non-blank lines, lines with comments, etc. There are also a few metrics that count syntactic constructs in various ways, such as nesting depth.
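To make the line-count varieties concrete, here is a rough sketch of how a tool might compute them. This is purely illustrative (my own names and a naive comment heuristic, not from any tool or standard):

```python
# Hypothetical sketch of common line-count metric varieties.
# Metric names and the '//' comment heuristic are illustrative only.

def line_count_metrics(source: str) -> dict:
    lines = source.splitlines()
    return {
        "rawLines": len(lines),
        "nonBlankLines": sum(1 for line in lines if line.strip()),
        # Naive detection of C-style '//' comment lines only.
        "commentLines": sum(1 for line in lines if line.strip().startswith("//")),
    }

code = "int x;\n\n// a comment\nreturn x;\n"
print(line_count_metrics(code))
```

Real tools of course differ on edge cases (mixed code/comment lines, block comments), which is exactly why standardizing the definitions matters.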
     
Then there is the McCabe family, which contains the much-misunderstood Cyclomatic Complexity and a few relatives that attempt to compensate for its shortcomings, e.g., Essential Complexity and Module Design Complexity.
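For readers less familiar with the McCabe family, here is a minimal sketch of Cyclomatic Complexity approximated as one plus the number of decision points, computed over Python source with the standard-library ast module. This simplification (e.g., counting a whole boolean expression as one decision) is one of the reasons implementations disagree:

```python
# Illustrative approximation of McCabe Cyclomatic Complexity:
# 1 + the number of decision-point nodes in the syntax tree.
import ast

DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler)

def cyclomatic_complexity(source: str) -> int:
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))

src = """
def f(x):
    if x > 0:
        return 1
    for i in range(x):
        if i % 2:
            x += 1
    return x
"""
print(cyclomatic_complexity(src))  # 3 decision points -> 4
```

Straight-line code scores 1; each branch point adds 1. Variants like Essential Complexity then discount well-structured branching, which is where the family gets harder to pin down.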
     
The Halstead metrics are an antiquated family based on counting tokens in various ways. They don't appear to be widely used anymore.
     
There are a bunch of object-oriented metrics that are often about the relationships between classes: coupling, cohesion, etc.
     
I mentioned metrics used in the embedded industry and how there's a need for this to be more standardized. A metrics collection that seems to be popular in automotive is the HIS/KGAS set:

https://docplayer.net/6136232-His-source-code-metrics.html

Here's a blog I wrote a little while ago that picks on one particular metric used in that set:

https://blogs.grammatech.com/why-npath-is-a-terrible-code-metric
     
    As I said in the meeting, metrics have numeric values, but we should also allow for exceptional results. For example:

- Computing the metric leads to a numeric error, such as divide by zero.
- The metric can't be computed because it's inapplicable for some reason.
- The value can't be reasonably expressed (e.g., a path count metric can easily yield 30-40 digit decimal values).
- The value is infinity (which kind of infinity probably doesn't matter).
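One way to sketch the "numeric value or exceptional outcome" idea is a result record that carries either a value or an exception tag. This is purely illustrative; the field names are hypothetical and not a SARIF schema proposal:

```python
# Hypothetical encoding of a metric result that is either numeric or
# one of the exceptional outcomes listed above. Field names are
# illustrative only, not proposed SARIF schema.
import json

def metric_result(name, value=None, exception=None):
    result = {"metric": name}
    if exception is not None:
        # e.g. "numericError", "inapplicable", "overflow", "infinite"
        result["exception"] = exception
    else:
        result["value"] = value
    return result

results = [
    metric_result("cyclomaticComplexity", value=12),
    metric_result("npath", exception="overflow"),        # 30-40 digit value
    metric_result("commentDensity", exception="inapplicable"),
]
print(json.dumps(results, indent=2))
```

The point is simply that consumers should never have to parse a sentinel like -1 or "NaN" out of a numeric field; the exceptional outcome is expressed explicitly.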