Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing.

Keywords: Cloud computing; Data science; Jupyter; Open access; Reproducibility; Statistics

Journal

Metabolomics: Official journal of the Metabolomic Society
ISSN: 1573-3890
Abbreviated title: Metabolomics
Country: United States
NLM ID: 101274889

Publication information

Publication date:
2019-09-14
History:
received: 2019-05-30
accepted: 2019-09-07
entrez: 2019-09-16
pubmed: 2019-09-16
medline: 2020-06-02
Status: epublish

Abstract

BACKGROUND
A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns about the reproducibility and integrity of results. As an omics science that generates vast amounts of data and relies heavily on data science to derive biological meaning, metabolomics is highly vulnerable to irreproducibility. The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases. Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work. To ensure broad use within the community, such a framework also needs to be inclusive and intuitive for computational novices and experts alike.
AIM OF REVIEW
To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science.
KEY SCIENTIFIC CONCEPTS OF REVIEW
This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform.

Identifiers

pubmed: 31522294
doi: 10.1007/s11306-019-1588-0
pii: 10.1007/s11306-019-1588-0
pmc: PMC6745024

Publication types

Journal Article; Research Support, Non-U.S. Gov't; Review

Languages

eng

Citation subsets

IM

Pagination

125

Authors

Kevin M Mendez (KM)

Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia.

Leighton Pritchard (L)

Strathclyde Institute of Pharmacy & Biomedical Sciences, University of Strathclyde, Cathedral Street, Glasgow, G1 1XQ, Scotland, UK.

Stacey N Reinke (SN)

Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia. stacey.n.reinke@ecu.edu.au.

David I Broadhurst (DI)

Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia. d.broadhurst@ecu.edu.au.
