Robust Summarization and Inference in Proteome-wide Label-free Quantification.
Biostatistics
bioinformatics
bioinformatics software
differential expression
label-free quantification
mass spectrometry
ridge regression
shotgun proteomics
summarization
Journal
Molecular & cellular proteomics : MCP
ISSN: 1535-9484
Abbreviated title: Mol Cell Proteomics
Country: United States
NLM ID: 101125647
Publication information
Publication date:
07 2020
History:
received: 20 06 2019
revised: 20 04 2020
pubmed: 24 4 2020
medline: 11 5 2021
entrez: 24 4 2020
Status:
ppublish
Abstract
Label-free quantitative mass spectrometry-based workflows for differential expression (DE) analysis of proteins pose important challenges for data analysis because of peptide-specific effects and context-dependent missingness of peptide intensities. Peptide-based workflows, like MSqRob, test for DE directly from peptide intensities and outperform summarization methods, which first aggregate MS1 peptide intensities to protein intensities before DE analysis. However, peptide-based methods are computationally expensive, often hard to understand for the non-specialized end-user, and do not provide protein summaries, which are important for visualization or downstream processing. In this work, we therefore evaluate state-of-the-art summarization strategies using a benchmark spike-in dataset and discuss why and when these fail compared with the state-of-the-art peptide-based model, MSqRob. Based on this evaluation, we propose a novel summarization strategy, MSqRobSum, which estimates MSqRob's model parameters in a two-stage procedure, circumventing the drawbacks of peptide-based workflows. MSqRobSum maintains MSqRob's superior performance while providing useful protein expression summaries for plotting and downstream analysis. Summarizing peptide to protein intensities considerably reduces the computational complexity, the memory footprint and the model complexity, and makes it easier to disseminate DE inferred on protein summaries. Moreover, MSqRobSum provides a highly modular analysis framework, which gives researchers full flexibility to develop data analysis workflows tailored toward their specific applications.
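To make the idea of summarizing peptide intensities to protein intensities concrete, here is a minimal sketch of the first (summarization) stage only. It is not the MSqRobSum implementation: Tukey's median polish is used as a stand-in robust summarizer, and the function name `median_polish_summary`, the toy matrix, and all parameter names are assumptions made for illustration. Rows are peptides of one protein, columns are samples, values are log2 intensities, and `NaN` marks a missing peptide intensity.

```python
import numpy as np


def median_polish_summary(log_intensities, n_iter=10, tol=1e-6):
    """Summarize a peptides x samples matrix of log2 intensities into one
    value per sample with Tukey's median polish (illustrative stand-in).

    Missing values (NaN) are ignored when taking medians. Each peptide row
    and each sample column is assumed to hold at least one observed value.
    """
    resid = np.array(log_intensities, dtype=float)
    n_pep, n_samp = resid.shape
    overall = 0.0
    pep_eff = np.zeros(n_pep)    # peptide-specific effects
    samp_eff = np.zeros(n_samp)  # sample effects (the protein summaries)

    for _ in range(n_iter):
        # Sweep out peptide (row) medians.
        row_med = np.nanmedian(resid, axis=1)
        resid -= row_med[:, None]
        pep_eff += row_med
        shift = np.median(samp_eff)
        samp_eff -= shift
        overall += shift

        # Sweep out sample (column) medians.
        col_med = np.nanmedian(resid, axis=0)
        resid -= col_med[None, :]
        samp_eff += col_med
        shift = np.median(pep_eff)
        pep_eff -= shift
        overall += shift

        if max(np.abs(row_med).max(), np.abs(col_med).max()) < tol:
            break

    # Protein-level log2 summary for every sample.
    return overall + samp_eff


# Toy data (hypothetical): additive peptide and sample effects,
# plus one outlying intensity and one missing value.
peptide_effects = np.array([0.0, 1.0, -2.0, 3.0, -1.0])
sample_effects = np.array([10.0, 10.0, 12.0, 12.0])
toy = peptide_effects[:, None] + sample_effects[None, :]
toy[0, 0] += 8.0      # outlier in peptide 0, sample 0
toy[4, 3] = np.nan    # missing intensity in peptide 4, sample 3

protein_summary = median_polish_summary(toy)
naive_mean = np.nanmean(toy, axis=0)
print("median polish:", protein_summary)
print("column means: ", naive_mean)
```

On this toy matrix the median-polish summaries recover the sample effects (10, 10, 12, 12) despite the outlier and the missing value, whereas the plain column means differ between the first two samples even though their true effects are equal. A second stage would then test for DE on such protein summaries; that stage is not shown here.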
Identifiers
pubmed: 32321741
pii: S1535-9476(20)34982-3
doi: 10.1074/mcp.RA119.001624
pmc: PMC7338080
Chemical substances
Peptides
0
Proteome
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1209-1219
Copyright information
© 2020 Sticker et al.
Conflict of interest statement
Conflict of interest—Authors declare no competing interests.