TY - UNPB
T1 - Linking provenance and its metadata in multi-organizational environments
AU - Wittner, Rudolf
AU - Gallo, Matej
AU - Frexia, Francesca
AU - Leo, Simone
AU - Pireddu, Luca
AU - Mascia, Cecilia
AU - Plass, Markus
AU - Soiland-Reyes, Stian
AU - Müller, Heimo
AU - Geiger, Jörg
AU - Holub, Petr
N1 - (revised article submitted to PeerJ CS)
PY - 2023
Y1 - 2023/12/12
N2 - Reproducibility issues are widely reported in life sciences. In response, scientific communities have called for enhanced provenance information documenting the complete research life cycle, starting from biological or environmental material acquisition and ending with translating research results into practice. The integrity and trustworthiness of such provenance can be achieved by applying versioning mechanisms and cryptographic techniques, such as hashes or digital signatures, which themselves constitute provenance metadata. However, the available provenance literature lacks an analysis of mechanisms for exchanging provenance and its metadata between organizations, as well as a grounded proposal for linking provenance and its metadata. In this work, we provide an in-depth analysis of the approaches for coupling provenance information and its metadata with documented research objects in the context of multi-organizational processes, leading to a categorization of possible approaches, a description of their key properties, and the derivation of requirements for underlying provenance models. We address the requirements by proposing a mechanism for linking provenance and its metadata that extends the Common Provenance Model, the open conceptual foundation for the ISO 23494 provenance standard series, currently under development. The concepts are demonstrated and validated on two complex use cases. This work is intended as a harmonized source of information on provenance coupling in the context of provenance exchange between organizations, which can be used when designing or choosing a provenance solution. This type of usage is exemplified by the extension of the Common Provenance Model as another step toward a provenance standard for life sciences.
M3 - Preprint
BT - Linking provenance and its metadata in multi-organizational environments
ER -