Data Provenance (DPS) TC

  • 1.  "Origin geography" history question

    Posted 06-04-2025 09:55

    Based on other work I'm involved in, I suspect there was considerable debate when creating the D&TA DPS spec that ended up with the term "origin-geography" in the D&TA contribution to our TC. I am interested in some context because of specific use cases (SBOM provenance) I care about. In particular, I care about how D&TA defined "geography".

     

    In reading the text of the D&TA spec, I found:

    "The geographical location where the data was originally collected, which can be important for compliance with regional laws and understanding the data's context."

    In other groups having this debate, we get into "geography" vs. "geopolitical" discussions because of the "compliance with regional laws" part of the use case.

     

    And things like "data-processing-geography-excluded ... Defines the geographical boundaries within which the data cannot be processed, often for legal or regulatory reasons" seem to me to favor the "geopolitical" definition of geography, as opposed to mountains vs. plains.

    This is sometimes called "human geography" as opposed to "physical geography".

     

    Am I correct in deducing that "geography" is a catch-all term that correlates with geopolitical (e.g., nation, province, city) or "human geography" boundaries? Any historical context here would help, because I really don't want to reinvent the wheel and reopen debates that resulted in the current text. I'd prefer we just agree that our "geography" means "human geography" and not have to change the terms.

     

    Duncan


    -- 

    Duncan Sparrell

    sFractal Consulting

    iPhone, iTypo, iApologize

    I welcome VSRE emails. Learn more at http://vsre.info/




  • 2.  RE: "Origin geography" history question

    Posted 06-04-2025 15:22

    Hi Duncan,

     

    We had a lot of conversation (and then some more!) around origin-geography. Our stakeholders, who included legal as well as data and product people, started out by saying they wanted a metadata tag that identified whether a data set is subject to GDPR or other types of regulation. The problem with telling someone that data is subject to regulation is that it is interpretive (and can be somewhat subjective). So, when we analyzed the use cases and dug deep into the rationale, we heard that organizations want to be "shown, not told" the basis for compliance. Given geolocation requirements under GDPR and other legislation, we decided to establish origin-geography as the place (city, state, country) where the data was collected or created (i.e., collected from a person, created by an IoT device, etc.).

    A good number of our WG members didn't care about regulation or legal implications, but instead wanted to assign location tags for use in the context of a specific business case (e.g., I am deploying a vaccine and need to track recipient illness progression over time, so give me the location of the recipient) or IoT device implications (e.g., where do we get the most error-prone data). In some cases, we were asked to provide house-level location detail, which wasn't practical for most cases. We explored ZIP codes as an option, but that was an issue for countries like the UAE.

    So yes, we used the "geopolitical" definition of geography (which notably has issues, but that was the decision made by the WG).

     

    Kristina