From Planning Stage Towards FAIR Data: A Practical Metadatasheet For Biomedical Scientists.


Journal

Scientific data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192

Publication information

Publication date:
22 May 2024
History:
received: 07 Dec 2023
accepted: 08 May 2024
medline: 23 May 2024
pubmed: 23 May 2024
entrez: 22 May 2024
Status: epublish

Abstract

Datasets consist of measurement data and metadata. Metadata provides context, essential for understanding and (re-)using data. Various metadata standards exist for different methods, systems and contexts. However, relevant information resides at differing stages across the data-lifecycle. Often, this information is defined and standardized only at publication stage, which can lead to data loss and workload increase. In this study, we developed Metadatasheet, a metadata standard based on interviews with members of two biomedical consortia and systematic screening of data repositories. It aligns with the data-lifecycle allowing synchronous metadata recording within Microsoft Excel, a widespread data recording software. Additionally, we provide an implementation, the Metadata Workbook, that offers user-friendly features like automation, dynamic adaption, metadata integrity checks, and export options for various metadata standards. By design and due to its extensive documentation, the proposed metadata standard simplifies recording and structuring of metadata for biomedical scientists, promoting practicality and convenience in data management. This framework can accelerate scientific progress by enhancing collaboration and knowledge transfer throughout the intermediate steps of data creation.

Identifiers

pubmed: 38778016
doi: 10.1038/s41597-024-03349-2
pii: 10.1038/s41597-024-03349-2

Publication types

Journal Article
Dataset

Languages

eng

Citation subsets

IM

Pagination

524

Grants

Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: 390685813
Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: 432325352
Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: 450149205
Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: 458597554
Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: FE 1159/6-1
Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: FE 1159/5-1
Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: E 1159/2-1

Copyright information

© 2024. The Author(s).

Authors

Lea Seep (L)

Computational Biology, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.

Stephan Grein (S)

Computational Biology, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.

Iva Splichalova (I)

Developmental Biology of the Immune System, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.

Danli Ran (D)

Institute of Pharmacology and Toxicology, University Hospital, University of Bonn, Bonn, Germany.

Mickel Mikhael (M)

Institute of Pharmacology and Toxicology, University Hospital, University of Bonn, Bonn, Germany.

Staffan Hildebrand (S)

Institute of Pharmacology and Toxicology, University Hospital, University of Bonn, Bonn, Germany.

Mario Lauterbach (M)

Department of Bioinformatics and Biochemistry, Technical University Braunschweig, Braunschweig, Germany.

Karsten Hiller (K)

Department of Bioinformatics and Biochemistry, Technical University Braunschweig, Braunschweig, Germany.

Dalila Juliana Silva Ribeiro (DJS)

Institute of Innate Immunity, University Hospital Bonn, University of Bonn, Bonn, Germany.

Katharina Sieckmann (K)

Institute of Innate Immunity, University Hospital Bonn, University of Bonn, Bonn, Germany.

Ronja Kardinal (R)

Institute of Innate Immunity, University Hospital Bonn, University of Bonn, Bonn, Germany.

Hao Huang (H)

Developmental Biology of the Immune System, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.

Jiangyan Yu (J)

Computational Biology, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.
Quantitative Systems Biology, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.

Sebastian Kallabis (S)

Systems Immunology and Proteomics, Institute of Innate Immunity, Medical Faculty, University of Bonn, Bonn, Germany.

Janina Behrens (J)

Department of Biochemistry and Molecular Cell Biology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.

Andreas Till (A)

Department of Internal Medicine I, Division of Endocrinology, Diabetes and Metabolism, University Medical Center Bonn, Bonn, Germany.

Viktoriya Peeva (V)

Department of Internal Medicine I, Division of Endocrinology, Diabetes and Metabolism, University Medical Center Bonn, Bonn, Germany.

Akim Strohmeyer (A)

Chair of Molecular Nutritional Medicine, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Johanna Bruder (J)

Chair of Molecular Nutritional Medicine, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Tobias Blum (T)

Immunology and Environment, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.

Ana Soriano-Arroquia (A)

Institute of Pharmacology and Toxicology, University Hospital, University of Bonn, Bonn, Germany.

Dominik Tischer (D)

Institute of Pharmacology and Toxicology, University Hospital, University of Bonn, Bonn, Germany.

Katharina Kuellmer (K)

Chair of Molecular Nutritional Medicine, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Yuanfang Li (Y)

Immunogenomics & Neurodegeneration, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany.

Marc Beyer (M)

Immunogenomics & Neurodegeneration, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany.
PRECISE, Platform for Single Cell Genomics and Epigenomics at the German Center for Neurodegenerative Diseases and the University of Bonn, Bonn, Germany.

Anne-Kathrin Gellner (AK)

Department of Psychiatry and Psychotherapy, University Hospital Bonn, Bonn, Germany.
Institute of Physiology II, Medical Faculty, University of Bonn, Bonn, Germany.

Tobias Fromme (T)

Chair of Molecular Nutritional Medicine, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Henning Wackerhage (H)

School for Medicine and Health, Faculty of Sport and Health Sciences, Technical University of Munich, Munich, Germany.

Martin Klingenspor (M)

Chair of Molecular Nutritional Medicine, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.
EKFZ-Else Kröner-Fresenius Center for Nutritional Medicine, Technical University of Munich, Freising, Germany.
ZIEL Institute for Food & Health, Technical University of Munich, Freising, Germany.

Wiebke K Fenske (WK)

Department of Internal Medicine I, Division of Endocrinology, Diabetes and Metabolism, University Medical Center Bonn, Bonn, Germany.
Department of Internal Medicine I - Endocrinology, Diabetology and Metabolism, Gastroenterology and Hepatology, University Hospital Bergmannsheil, Bochum, Germany.

Ludger Scheja (L)

Department of Biochemistry and Molecular Cell Biology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.

Felix Meissner (F)

Systems Immunology and Proteomics, Institute of Innate Immunity, Medical Faculty, University of Bonn, Bonn, Germany.
Experimental Systems Immunology, Max Planck Institute of Biochemistry, Martinsried, Germany.

Andreas Schlitzer (A)

Quantitative Systems Biology, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.

Elvira Mass (E)

Developmental Biology of the Immune System, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany.

Dagmar Wachten (D)

Institute of Innate Immunity, University Hospital Bonn, University of Bonn, Bonn, Germany.

Eicke Latz (E)

Institute of Innate Immunity, University Hospital Bonn, University of Bonn, Bonn, Germany.

Alexander Pfeifer (A)

Institute of Pharmacology and Toxicology, University Hospital, University of Bonn, Bonn, Germany.
PharmaCenter Bonn, University of Bonn, Bonn, Germany.

Jan Hasenauer (J)

Computational Biology, Life & Medical Sciences (LIMES) Institute, University of Bonn, Bonn, Germany. jan.hasenauer@uni-bonn.de.
Helmholtz Center Munich, German Research Center for Environmental Health, Computational Health Center, Munich, Germany. jan.hasenauer@uni-bonn.de.
