Statistical Analysis of Quantitative Peptidomics and Peptide-Level Proteomics Data with Prostar.

Proteomics / methods; Proteome / analysis; Mass Spectrometry / methods; Peptides / analysis; Software

Data processing; Differential analysis; Label-free proteomics; Relative quantification; Statistical software

Journal

Methods in molecular biology (Clifton, N.J.)

ISSN: 1940-6029

Abbreviated title: Methods Mol Biol

Country: United States

NLM ID: 9214969

Publication information

Publication date:
2023

History:

entrez: 2022-10-29

pubmed: 2022-10-30

medline: 2022-11-02

Status: ppublish

Abstract

Prostar is a software tool dedicated to the processing of quantitative data resulting from mass spectrometry-based label-free proteomics. In practice, once biological samples have been analyzed by bottom-up proteomics, the raw mass spectrometer outputs are processed by bioinformatics tools to identify peptides and quantify them, notably by integrating precursor ion chromatograms. From that point, classical workflows aggregate this peptide-level information to infer protein-level identities and amounts. Finally, protein abundances can be statistically analyzed to identify proteins that are significantly differentially abundant between the compared conditions. Prostar's original workflow was developed based on this strategy. However, recent work has shown that processing peptide-level information is often more accurate when searching for differentially abundant proteins, as the aggregation step tends to hide some of the data variability and biases. As a result, Prostar has been extended with workflows that manage peptide-level data, and this protocol details their use. The first one, termed "peptidomics," conducts the differential analysis at the peptide level, independently of the peptide-to-protein relationship. The second workflow aggregates the peptide abundances after their preprocessing (i.e., after filtering, normalization, and imputation), so as to minimize the amount of protein-level preprocessing prior to differential analysis.
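The order of operations in the second workflow (filter, normalize, and impute at peptide level, then aggregate to proteins) can be illustrated with a minimal Python sketch. This is not Prostar's code or API (Prostar is an R/Shiny tool); the function name, the data layout, and the specific choices of median centering, peptide-mean imputation, and mean aggregation are assumptions chosen only to keep the example short.

```python
from statistics import mean, median


def preprocess_and_aggregate(peptides, min_valid=2):
    """Hypothetical helper, not part of Prostar.

    peptides: list of (peptide_id, protein_id, values), where values holds
    one log2 intensity per sample and None marks a missing value.
    Returns {protein_id: [aggregated log2 abundance per sample]}.
    """
    # 1. Filtering: keep peptides observed in at least `min_valid` samples.
    kept = [(prot, vals) for _, prot, vals in peptides
            if sum(v is not None for v in vals) >= min_valid]
    if not kept:
        return {}
    n_samples = len(kept[0][1])

    # 2. Normalization (assumed method): centre each sample on its median.
    medians = []
    for j in range(n_samples):
        observed = [vals[j] for _, vals in kept if vals[j] is not None]
        medians.append(median(observed) if observed else 0.0)
    normalized = [
        (prot, [None if v is None else v - medians[j]
                for j, v in enumerate(vals)])
        for prot, vals in kept
    ]

    # 3. Imputation (assumed method): fill gaps with the peptide's own mean.
    imputed = []
    for prot, vals in normalized:
        fill = mean(v for v in vals if v is not None)
        imputed.append((prot, [fill if v is None else v for v in vals]))

    # 4. Aggregation (assumed method): average peptides of the same protein.
    by_protein = {}
    for prot, vals in imputed:
        by_protein.setdefault(prot, []).append(vals)
    return {prot: [mean(col) for col in zip(*rows)]
            for prot, rows in by_protein.items()}


# Toy data: three samples, two proteins, invented values.
example = [
    ("pepA", "P1", [10.0, 12.0, 11.0]),
    ("pepB", "P1", [8.0, None, 9.0]),
    ("pepC", "P2", [6.0, 8.0, 7.0]),
    ("pepD", "P2", [None, None, 5.0]),  # dropped: only one observed value
]
protein_table = preprocess_and_aggregate(example)
```

Because every preprocessing step runs before the aggregation, the protein-level table needs no further filtering or imputation before a differential test, which is the point the abstract makes about this workflow.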

Identifiers

DOI: 10.1007/978-1-0716-1967-4_9

PMID: 36308690

Chemical substances

Proteome 0

Peptides 0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

163-196

Authors

Marianne Tardif (M)

Enora Fremy (E)

Anne-Marie Hesse (AM)

Thomas Burger (T)

Yohann Couté (Y)

Samuel Wieczorek (S)
