Deep Learning-Assisted Analysis of Immunopeptidomics Data.

Deep learning; Immunopeptidomics; Mass spectrometry; Peptide identification; Prosit; Rescoring; Visualizations

Journal

Methods in molecular biology (Clifton, N.J.)
ISSN: 1940-6029
Abbreviated title: Methods Mol Biol
Country: United States
NLM ID: 9214969

Publication information

Publication date:
2024
History:
medline: 29 March 2024
pubmed: 29 March 2024
entrez: 29 March 2024
Status: ppublish

Abstract

Liquid chromatography-coupled mass spectrometry (LC-MS/MS) is the primary method to obtain direct evidence for the presentation of disease- or patient-specific human leukocyte antigen (HLA) peptides. However, compared to the analysis of tryptic peptides in proteomics, the analysis of HLA peptides still poses computational and statistical challenges. Recently, matching scores based on fragment ion intensities, which assess the similarity between predicted and observed spectra, were shown to substantially increase the number of confidently identified peptides, particularly in use cases where non-tryptic peptides are analyzed. In this chapter, we describe in detail three procedures that use state-of-the-art deep learning models to analyze and validate single spectra, single measurements, and multiple measurements in mass spectrometry-based immunopeptidomics. For these, we explain how to use the Universal Spectrum Explorer (USE), online Oktoberfest, and offline Oktoberfest. For intensity-based scoring, Oktoberfest uses fragment ion intensity and retention time predictions from Prosit, a deep neural network trained on a very large number of synthetic peptides and tandem mass spectra generated within the ProteomeTools project. The examples shown highlight how deep learning-assisted analysis can increase the number of identified HLA peptides, facilitate the discovery of confidently identified neo-epitopes, and help assess the presence of cryptic peptides, such as spliced peptides.
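
As an illustration of what an intensity-based matching score can look like, the sketch below computes a normalized spectral angle between a predicted and an observed fragment intensity vector. It is a minimal example and not the Oktoberfest or Prosit implementation: the function name `spectral_angle` and the example intensities are hypothetical, and it assumes both vectors are already aligned so that each position refers to the same fragment ion.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral angle between two aligned intensity vectors.

    Returns a value in [0, 1] for non-negative intensities; values near 1
    mean the two intensity patterns are nearly identical up to scaling.
    """
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")

    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))

    # An all-zero vector carries no pattern to compare; score it as 0.
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0

    # Clamp against floating-point drift before taking the arccosine.
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Hypothetical aligned intensities for the same set of fragment ions.
predicted_intensities = [0.10, 0.55, 0.00, 1.00, 0.30]
observed_intensities = [0.12, 0.50, 0.05, 0.95, 0.20]
score = spectral_angle(predicted_intensities, observed_intensities)
print(f"spectral angle: {score:.3f}")
```

A score of this kind can be supplied as an additional feature when peptide-spectrum matches are rescored, which is the general idea behind the rescoring approach described above.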

Identifiers

pubmed: 38549030
doi: 10.1007/978-1-0716-3646-6_25

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

457-483

Copyright information

© 2024. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.

Authors

Wassim Gabriel (W)

Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Mario Picciani (M)

Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Matthew The (M)

Chair of Proteomics and Bioanalytics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Mathias Wilhelm (M)

Computational Mass Spectrometry, TUM School of Life Sciences, Technical University of Munich, Freising, Germany. mathias.wilhelm@tum.de.

MeSH classifications