ThermoRawFileParser: Modular, Scalable, and Cross-Platform RAW File Conversion.

big data bioinformatics cloud file formats mass spectrometry metadata mzML open source software workflows

Journal

Journal of proteome research
ISSN: 1535-3907
Abbreviated title: J Proteome Res
Country: United States
NLM ID: 101128775

Publication information

Publication date:
03 01 2020
History:
pubmed: 23 11 2019
medline: 17 4 2021
entrez: 23 11 2019
Status: ppublish

Abstract

The field of computational proteomics is approaching the big data age, driven both by continuous growth in the number of samples analyzed per experiment and by the growing amount of data obtained in each analytical run. To process these large amounts of data, it is increasingly necessary to use elastic compute resources such as Linux-based cluster environments and cloud infrastructures. Unfortunately, the vast majority of cross-platform proteomics tools cannot operate directly on the proprietary formats generated by the diverse mass spectrometers. Here, we present ThermoRawFileParser, an open-source, cross-platform tool that converts Thermo RAW files into open file formats such as MGF and the HUPO-PSI standard file format mzML. To ensure the broadest possible availability and to increase integration capabilities with popular workflow systems such as Galaxy or Nextflow, we have also built a Conda package and a BioContainers container around ThermoRawFileParser. In addition, we implemented a user-friendly interface (ThermoRawFileParserGUI) for users who are not familiar with command-line tools. Finally, we performed a benchmark of ThermoRawFileParser and msconvert to verify that the converted mzML files contain reliable quantitative results.
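
To illustrate why conversion to an open format matters for cross-platform tools, the following is a minimal sketch of reading MGF-style text in plain Python on any operating system. It is not part of ThermoRawFileParser; the function name `read_mgf` and the example spectrum (title, masses, intensities) are hypothetical and chosen only for illustration, and the sketch assumes the common MGF layout of `BEGIN IONS` / `END IONS` blocks with `KEY=value` parameter lines followed by whitespace-separated m/z and intensity pairs.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_mgf(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per BEGIN IONS / END IONS block (hypothetical helper)."""
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                yield current
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Parameter line such as TITLE=... or CHARGE=2+
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                # Peak line: m/z and intensity separated by whitespace
                parts = line.split()
                current["peaks"].append((float(parts[0]), float(parts[1])))


# Made-up example spectrum, not taken from any real RAW file.
EXAMPLE_MGF = """BEGIN IONS
TITLE=hypothetical_scan_1
PEPMASS=445.12 1000.0
CHARGE=2+
100.5 20.0
200.25 35.5
END IONS
"""

spectra: List[Dict[str, object]] = list(read_mgf(EXAMPLE_MGF.splitlines()))
for spectrum in spectra:
    print(spectrum["params"]["TITLE"], len(spectrum["peaks"]))
```

Because the format is plain text, the same code runs unchanged on Linux clusters and cloud nodes, which is the situation the abstract describes; production pipelines would more likely rely on an established mzML or MGF library.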

Identifiers

pubmed: 31755270
doi: 10.1021/acs.jproteome.9b00328
pmc: PMC7116465
mid: EMS106469

Chemical substances

Saccharomyces cerevisiae Proteins 0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

537-542

Grants

Agency: Wellcome Trust
Country: United Kingdom
Agency: Wellcome Trust
ID: 208391
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/P024599/1
Country: United Kingdom
Agency: Wellcome Trust
ID: 208391/Z/17/Z
Country: United Kingdom

Authors

Niels Hulstaert (N)

VIB-UGent Center for Medical Biotechnology, VIB, Ghent B-9000, Belgium.
Department of Biomolecular Medicine, Ghent University, Ghent B-9000, Belgium.

Jim Shofstahl (J)

Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States.

Timo Sachsenberg (T)

Applied Bioinformatics, Department for Computer Science, University of Tuebingen, Sand 14, 72076 Tuebingen, Germany.

Mathias Walzer (M)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.

Harald Barsnes (H)

Computational Biology Unit (CBU), Department of Informatics, University of Bergen, Bergen 5020, Norway.
Proteomics Unit (PROBE), Department of Biomedicine, University of Bergen, Bergen 5020, Norway.

Lennart Martens (L)

VIB-UGent Center for Medical Biotechnology, VIB, Ghent B-9000, Belgium.
Department of Biomolecular Medicine, Ghent University, Ghent B-9000, Belgium.

Yasset Perez-Riverol (Y)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.
