Statistical Analysis of Quantitative Peptidomics and Peptide-Level Proteomics Data with Prostar.

Data processing; Differential analysis; Label-free proteomics; Relative quantification; Statistical software

Journal

Methods in molecular biology (Clifton, N.J.)
ISSN: 1940-6029
Abbreviated title: Methods Mol Biol
Country: United States
NLM ID: 9214969

Publication information

Publication date:
2023
History:
entrez: 29 Oct 2022
pubmed: 30 Oct 2022
medline: 2 Nov 2022
Status: ppublish

Abstract

Prostar is a software tool dedicated to processing quantitative data from mass spectrometry-based label-free proteomics. In practice, once biological samples have been analyzed by bottom-up proteomics, the raw mass spectrometer outputs are processed by bioinformatics tools to identify peptides and quantify them, notably by integrating precursor ion chromatograms. Classical workflows then aggregate this peptide-level information to infer protein-level identities and amounts. Finally, protein abundances are statistically analyzed to identify proteins that are significantly differentially abundant between the compared conditions. Prostar's original workflow was developed on the basis of this strategy. However, recent work has demonstrated that processing peptide-level information is often more accurate when searching for differentially abundant proteins, as the aggregation step tends to hide some of the variability and bias in the data. Prostar has therefore been extended with workflows that handle peptide-level data, and this protocol details their use. In the first one, termed "peptidomics," the differential analysis is conducted at the peptide level, independently of the peptide-to-protein relationship. The second workflow aggregates the peptide abundances after their preprocessing (i.e., after filtering, normalization, and imputation), so as to minimize the amount of protein-level preprocessing prior to differential analysis.
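
To make the order of operations in the second workflow concrete, here is a minimal Python sketch of "preprocess at peptide level, then aggregate." It is an illustration only and not Prostar's implementation (Prostar is an R/Bioconductor tool). The function name, the `max_missing` threshold, and the specific choices made at each step (median centering on log2 intensities, imputation by the smallest observed value in each sample, aggregation by the mean) are assumptions picked for simplicity; the abstract does not specify which methods Prostar applies.

```python
import math
from statistics import median


def preprocess_then_aggregate(peptides, max_missing=1):
    """Hypothetical sketch: filter, normalize, impute peptides, then aggregate.

    peptides: list of (protein_id, [intensity or None, one per sample]).
    Returns {protein_id: [aggregated log2 abundance per sample]}.
    """
    # 1. Filtering (assumed rule): drop peptides with too many missing values.
    kept = [(prot, vals) for prot, vals in peptides
            if sum(v is None for v in vals) <= max_missing]
    if not kept:
        return {}

    # Work on log2 intensities, keeping None for missing values.
    logged = [(prot, [None if v is None else math.log2(v) for v in vals])
              for prot, vals in kept]
    n_samples = len(logged[0][1])

    # 2. Normalization (assumed method): subtract each sample's median.
    col_medians = [
        median(vals[j] for _, vals in logged if vals[j] is not None)
        for j in range(n_samples)
    ]
    normalized = [
        (prot, [None if v is None else v - col_medians[j]
                for j, v in enumerate(vals)])
        for prot, vals in logged
    ]

    # 3. Imputation (assumed method): smallest observed value in the sample.
    col_mins = [
        min(vals[j] for _, vals in normalized if vals[j] is not None)
        for j in range(n_samples)
    ]
    imputed = [
        (prot, [col_mins[j] if v is None else v for j, v in enumerate(vals)])
        for prot, vals in normalized
    ]

    # 4. Aggregation (assumed method): mean of peptide values per protein.
    by_protein = {}
    for prot, vals in imputed:
        by_protein.setdefault(prot, []).append(vals)
    return {prot: [sum(col) / len(col) for col in zip(*rows)]
            for prot, rows in by_protein.items()}


# Toy data: two proteins, three samples, None marks a missing intensity.
example = [
    ("P1", [4.0, 8.0, 16.0]),
    ("P1", [16.0, None, 64.0]),
    ("P2", [64.0, 32.0, 256.0]),
    ("P2", [None, None, 8.0]),  # two missing values: filtered out
]
protein_table = preprocess_then_aggregate(example, max_missing=1)
```

Because every preprocessing step runs before aggregation, the resulting protein table needs no further filtering, normalization, or imputation before differential analysis, which is the point the abstract makes about this workflow.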

Identifiers

pubmed: 36308690
doi: 10.1007/978-1-0716-1967-4_9

Chemical substances

Proteome 0
Peptides 0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

163-196

Copyright information

© 2023. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.

Authors

Marianne Tardif (M)

Univ. Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, CNRS, Grenoble, France.

Enora Fremy (E)

Univ. Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, CNRS, Grenoble, France.

Anne-Marie Hesse (AM)

Univ. Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, CNRS, Grenoble, France.

Thomas Burger (T)

Univ. Grenoble Alpes, CNRS, INSERM, CEA, Grenoble, France.

Yohann Couté (Y)

Univ. Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, CNRS, Grenoble, France.

Samuel Wieczorek (S)

Univ. Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, CNRS, Grenoble, France. samuel.wieczorek@cea.fr.
