Deep Learning-Assisted Analysis of Immunopeptidomics Data.
Deep learning
Immunopeptidomics
Mass spectrometry
Peptide identification
Prosit
Rescoring
Visualizations
Journal
Methods in molecular biology (Clifton, N.J.)
ISSN: 1940-6029
Abbreviated title: Methods Mol Biol
Country: United States
NLM ID: 9214969
Publication information
Publication date:
2024
History:
medline: 29/3/2024
pubmed: 29/3/2024
entrez: 29/3/2024
Status:
ppublish
Abstract
Liquid chromatography-coupled tandem mass spectrometry (LC-MS/MS) is the primary method to obtain direct evidence for the presentation of disease- or patient-specific human leukocyte antigen (HLA) peptides. However, compared to the analysis of tryptic peptides in proteomics, the analysis of HLA peptides still poses computational and statistical challenges. Recently, fragment ion intensity-based matching scores, which assess the similarity between predicted and observed spectra, were shown to substantially increase the number of confidently identified peptides, particularly in use cases where non-tryptic peptides are analyzed. In this chapter, we describe in detail three procedures for using state-of-the-art deep learning models to analyze and validate single spectra, single measurements, and multiple measurements in mass spectrometry-based immunopeptidomics. For this, we explain how to use the Universal Spectrum Explorer (USE), online Oktoberfest, and offline Oktoberfest. For intensity-based scoring, Oktoberfest uses fragment ion intensity and retention time predictions from Prosit, a deep neural network trained on a very large number of synthetic peptides and tandem mass spectra generated within the ProteomeTools project. The examples shown highlight how deep learning-assisted analysis can increase the number of identified HLA peptides, facilitate the discovery of confidently identified neo-epitopes, and help assess the presence of cryptic peptides, such as spliced peptides.
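To make the idea of an intensity-based matching score concrete, the following is a minimal sketch of one such similarity measure, the normalized spectral angle between a predicted and an observed fragment intensity vector. It is an illustration only and not the Oktoberfest or Prosit implementation: the function name `spectral_angle`, the assumption that both vectors are already aligned to the same fragment ions, and the example intensities are all hypothetical.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral angle in [0, 1]; values near 1 mean similar intensity patterns.

    Assumes (hypothetically) that both lists hold non-negative intensities
    for the same fragment ions in the same order.
    """
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must cover the same fragment ions")

    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0 or norm_o == 0:
        # No signal in one spectrum: treat as no similarity.
        return 0.0

    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Hypothetical intensities for five fragment ions of one peptide-spectrum match.
predicted_intensities = [0.10, 0.80, 0.00, 0.45, 0.20]
observed_intensities = [0.12, 0.75, 0.05, 0.50, 0.10]
score = spectral_angle(predicted_intensities, observed_intensities)
print(f"spectral angle: {score:.3f}")
```

In a rescoring workflow, a score of this kind would typically be passed as one feature among several to a semi-supervised tool such as Percolator or mokapot, rather than being thresholded on its own.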
Identifiers
pubmed: 38549030
doi: 10.1007/978-1-0716-3646-6_25
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
457-483
Copyright information
© 2024. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.