Lightweight Distributed Provenance Model for Complex Real-world Environments.


Journal

Scientific Data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192

Publication information

Publication date:
17 08 2022
History:
received: 28 07 2021
accepted: 06 07 2022
entrez: 17 8 2022
pubmed: 18 8 2022
medline: 20 8 2022
Status: epublish

Abstract

Provenance is information describing the lineage of an object, such as a dataset or biological material. Since these objects can be passed between organizations, each organization can document only part of an object's life cycle. As a result, the interconnection of distributed provenance parts forms distributed provenance chains. Depending on the actual provenance content, complete provenance chains can provide traceability and contribute to the reproducibility and FAIRness of research objects. In this paper, we define a lightweight provenance model based on W3C PROV that enables the generation of distributed provenance chains in complex, multi-organizational environments. The application of the model is demonstrated with a use case spanning several steps of a real-world research pipeline: starting with the acquisition of a specimen, its processing and storage, histological examination, and the generation/collection of associated data (images, annotations, clinical data), and ending with the training of an AI model for the detection of tumors in the images. The proposed model has become an open conceptual foundation of the ISO 23494 standard on provenance for the biotechnology domain, which is currently under development.

Identifiers

pubmed: 35977957
doi: 10.1038/s41597-022-01537-6
pii: 10.1038/s41597-022-01537-6
pmc: PMC9383664

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

503

Grants

Agency: EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)
ID: 824087
Agency: EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)
ID: 825575
Agency: Regione Autonoma della Sardegna (Sardinia Region)
ID: DIFRA project

Copyright information

© 2022. The Author(s).

Authors

Rudolf Wittner (R)

BBMRI-ERIC, Neue Stiftingtalstrasse 2, 8010, Graz, Austria.
Faculty of Informatics, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic.
Institute of Computer Science, Masaryk University, Šumavská 416/15, 602 00, Brno, Czech Republic.

Cecilia Mascia (C)

CRS4 - Center for Advanced Studies, Research and Development in Sardinia, Loc. Piscina Manna, 09050, Pula, CA, Italy.

Matej Gallo (M)

Faculty of Informatics, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic.

Francesca Frexia (F)

CRS4 - Center for Advanced Studies, Research and Development in Sardinia, Loc. Piscina Manna, 09050, Pula, CA, Italy.

Heimo Müller (H)

BBMRI-ERIC, Neue Stiftingtalstrasse 2, 8010, Graz, Austria.
Diagnostic and Research Center for Molecular BioMedicine, Diagnostic & Research Institute of Pathology, Medical University of Graz, Neue Stiftingtalstrasse 2, 8010, Graz, Austria.

Markus Plass (M)

Diagnostic and Research Center for Molecular BioMedicine, Diagnostic & Research Institute of Pathology, Medical University of Graz, Neue Stiftingtalstrasse 2, 8010, Graz, Austria.

Jörg Geiger (J)

Interdisciplinary Bank of Biomaterials and Data Würzburg (ibdw), University and University Hospital of Würzburg, 97080, Würzburg, Germany.

Petr Holub (P)

BBMRI-ERIC, Neue Stiftingtalstrasse 2, 8010, Graz, Austria. petr.holub@bbmri-eric.eu.
Institute of Computer Science, Masaryk University, Šumavská 416/15, 602 00, Brno, Czech Republic. petr.holub@bbmri-eric.eu.

MeSH classifications