Navigating the maze of mass spectra: a machine-learning guide to identifying diagnostic ions in O-glycan analysis.

Keywords: Bioinformatics; Carbohydrates; Computational biology; Machine learning; Mass spectrometry

Journal

Analytical and bioanalytical chemistry
ISSN: 1618-2650
Abbreviated title: Anal Bioanal Chem
Country: Germany
NLM ID: 101134327

Publication information

Publication date:
24 Aug 2024
History:
received: 28 Jun 2024
revised: 6 Aug 2024
accepted: 9 Aug 2024
medline: 24 Aug 2024
pubmed: 24 Aug 2024
entrez: 24 Aug 2024
Status: ahead of print

Abstract

Structural details of oligosaccharides, or glycans, often carry biological relevance, which is why they are typically elucidated using tandem mass spectrometry. Common approaches to distinguish isomers rely on diagnostic glycan fragments for annotating topologies or linkages. Diagnostic fragments are often only known informally among practitioners or stem from individual studies, with unclear validity or generalizability, causing annotation heterogeneity and hampering new analysts. Drawing on a curated set of 237,000 O-glycomics spectra, we here present a rule-based machine learning workflow to uncover quantifiably valid and generalizable diagnostic fragments. This results in fragmentation rules to robustly distinguish common O-glycan isomers for reduced glycans in negative ion mode. We envision this resource to improve glycan annotation accuracy and concomitantly make annotations more transparent and homogeneous across analysts.
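As a rough illustration of what "quantifiably valid" could mean for a diagnostic fragment, the sketch below scores each binned m/z value as a one-line rule of the form "peak present, therefore isomer X" and reports the precision and recall of that rule. It is not the authors' workflow: the function name `score_fragments`, the `bin_width` parameter, and the toy peak lists are all hypothetical, and the m/z values are invented placeholders rather than real diagnostic ions.

```python
from collections import Counter


def score_fragments(spectra, target, bin_width=1.0):
    """Score each binned m/z as the rule 'peak present -> target isomer'.

    spectra: list of (label, [m/z, ...]) pairs (hypothetical input format).
    Returns {binned_mz: (precision, recall)} for the given target label.
    """
    in_target = Counter()  # number of target spectra containing the bin
    in_any = Counter()     # number of all spectra containing the bin
    n_target = sum(1 for label, _ in spectra if label == target)

    for label, peaks in spectra:
        # Collapse peaks to bins; a set counts each bin once per spectrum.
        bins = {round(mz / bin_width) * bin_width for mz in peaks}
        for b in bins:
            in_any[b] += 1
            if label == target:
                in_target[b] += 1

    return {
        b: (
            in_target[b] / in_any[b],                       # precision
            in_target[b] / n_target if n_target else 0.0,   # recall
        )
        for b in in_any
    }


# Toy data with invented m/z values, for illustration only.
toy_spectra = [
    ("A", [100.1, 200.2, 300.3]),
    ("A", [100.0, 300.1]),
    ("B", [100.2, 200.1]),
    ("B", [100.1, 250.0]),
]
scores = score_fragments(toy_spectra, target="A")
for mz, (precision, recall) in sorted(scores.items()):
    print(f"m/z {mz:.0f}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy set, the bin that appears in every "A" spectrum and in no "B" spectrum scores highest on both measures, while a bin shared by all spectra has full recall but only chance-level precision. Judging generalizability would additionally require scoring on spectra held out from rule discovery, which this sketch does not attempt.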

Identifiers

pubmed: 39180595
doi: 10.1007/s00216-024-05500-9
pii: 10.1007/s00216-024-05500-9

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Grants

Agency: Vetenskapsrådet
ID: BioMS

Copyright information

© 2024. The Author(s).

Authors

James Urban (J)

Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden.
Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.

Roman Joeres (R)

Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden.
Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden.
Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research, Saarbrücken, Germany.
Center for Bioinformatics, Saarland University, Saarbrücken, Germany.

Luc Thomès (L)

ULR 7364 - RADEME - Maladies RAres du DÉveloppement embryonnaire et du Métabolisme, CHU Lille, University Lille, 59000, Lille, France.

Kristina A Thomsson (KA)

Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.

Daniel Bojar (D)

Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden. daniel.bojar@gu.se.
Wallenberg Centre for Molecular and Translational Medicine, University of Gothenburg, Gothenburg, Sweden. daniel.bojar@gu.se.

MeSH classifications