Expanding N-glycopeptide identifications by modeling fragmentation, elution, and glycome connectivity.
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
22 Jul 2024
History:
received: 26 Jan 2021
accepted: 08 Jul 2024
medline: 23 Jul 2024
pubmed: 23 Jul 2024
entrez: 22 Jul 2024
Status:
epublish
Abstract
Accurate glycopeptide identification in mass spectrometry-based glycoproteomics is a challenging problem at scale. Recent innovations have increased the scope and accuracy of glycopeptide identifications and provided more precise uncertainty estimates for each part of the structure. We present three contributions: a dynamically adapting relative retention time model for detecting and correcting ambiguous glycan assignments that are difficult to detect from fragmentation alone; a layered approach to glycopeptide fragmentation modeling that improves N-glycopeptide identification in samples without compromising identification quality; and a site-specific method that further increases the depth of the glycoproteome that can be confidently identified. We demonstrate our techniques on a set of previously published datasets, showing the performance gains at each stage of optimization. These techniques are provided in the open-source glycomics and glycoproteomics platform GlycReSoft, available at https://github.com/mobiusklein/glycresoft.
Identifiers
pubmed: 39039063
doi: 10.1038/s41467-024-50338-5
pii: 10.1038/s41467-024-50338-5
Chemical substances
Glycopeptides
0
Polysaccharides
0
Glycoproteins
0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
6168
Copyright information
© 2024. The Author(s).