Data Provenance (DPS) TC

Help in resolving comment on source/provenance/use on PR#20

1. Help in resolving comment on source/provenance/use on PR#20

Duncan Sparrell
Posted 05-28-2025 23:30
A comment on PR#20 https://github.com/oasis-tcs/dps/pull/20/files/d973278765ae32de0780f17cb97463cfc45a8cec#r2112666416 states:

"In general, I find the separation of the "trinity"-like provenance, source, and use a difficult concept. For example: why are provenance and source separate?"

I am repeating it in this email thread to get more members involved, since I assume not everyone is following the GitHub PR comments. There are only 4 forks and 6 watchers, and even those 6, if they are like me, may route all GitHub watch emails into a bit-bucket folder they never read (I get hundreds per day).

This appears to me to be a very fundamental issue. I assumed there was general agreement on the D&TA work as a basis for moving forward. I just did the administrivia work of "OASIS-izing" it. This comment implies otherwise, at least to me - confessing my ignorance of the work that went into creating the D&TA spec. So I would like help from those who spent time creating the D&TA spec, and I would like everyone to chime in with their opinion on accepting the D&TA work as our starting point.

------------------------------
Duncan Sparrell
Chief Cyber Curmudgeon
sFractal Consulting LLC
Oakton VA
703-828-8646
------------------------------
2. RE: Help in resolving comment on source/provenance/use on PR#20

Kristina Podnar
Posted 05-30-2025 11:57
Hi Duncan,

Thanks for raising this, and I appreciate your effort in surfacing the question from PR#20 more broadly to the group.

To clarify, the taxonomy reflected in the D&TA work was not arbitrarily defined; it was the result of extensive feedback gathered over more than 55 deep-dive conversations and many iterations. These involved both Working Group members and external experts, including data practitioners, compliance professionals, and legal stakeholders. We went through multiple conceptual models before settling on the current framing, recognizing that it isn't perfect, but it worked for most of the organizations represented in the Working Group. Of course, there is always room for (much!) improvement.

For context, here are a few of the iterations we explored (happy to dig up the working document and share more if folks would like):

Framed as core provenance questions:

What is this data?
Who supplied it?
When was this data produced?
Where was the data collected?
How was the data collected?
How can I use this data?

Focused on traceability and legal dimensions:

Lineage
Supplier location (origination, processing, storage)
Legal rights
Privacy & protection
Generation date
Data format
Generation method
Intended use & restrictions

Designed for operational and compliance clarity:

Usage (rights & restrictions)
Collection (source, standards & validation, recency, method)

These models were reviewed and evolved through months of iterative feedback. The current separation between provenance, source, and use is an intentional one, meant to provide clarity to different kinds of users (technical, legal, and business) who need to interpret and apply this metadata for different reasons.
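
As a rough illustration only (not taken from the D&TA spec or from PR#20), the first framing above, the six core provenance questions, could be sketched as a small record with one field per question. The class name, field names, helper function, and sample values below are all hypothetical and exist only to make the idea concrete.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class ProvenanceAnswers:
    """Hypothetical record: one field per core provenance question."""
    what: str   # What is this data?
    who: str    # Who supplied it?
    when: date  # When was this data produced?
    where: str  # Where was the data collected?
    how: str    # How was the data collected?
    use: str    # How can I use this data?


def unanswered(record: ProvenanceAnswers) -> list[str]:
    """Return the names of the questions whose answer is an empty string."""
    return [f.name for f in fields(record) if getattr(record, f.name) == ""]


# Made-up sample values; the "where" question is deliberately left blank.
example = ProvenanceAnswers(
    what="Retail transaction summaries",
    who="Example Supplier Inc.",
    when=date(2025, 1, 15),
    where="",
    how="Point-of-sale export",
    use="Internal analytics only",
)
print(unanswered(example))
```

A flat record like this treats all six questions alike; it does not by itself express the grouping into provenance, source, and use that the current taxonomy makes.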

That said, this taxonomy, like everything else in our work, should remain open to refinement. If there's a way to clarify or simplify the distinction between these elements in the spec, I'd support a discussion around that.

Thanks again for keeping the group engaged on this.

Kristina
