GlyLES: Grammar-based Parsing of Glycans from IUPAC-condensed to SMILES.
Glycan
Glycobiology
Grammar
IUPAC-condensed
SMILES
Journal
Journal of cheminformatics
ISSN: 1758-2946
Abbreviated title: J Cheminform
Country: England
NLM ID: 101516718
Publication information
Publication date:
23 Mar 2023
History:
received: 14 Nov 2022
accepted: 18 Feb 2023
entrez: 24 Mar 2023
pubmed: 25 Mar 2023
medline: 25 Mar 2023
Status:
epublish
Abstract
Glycans are important polysaccharides on cellular surfaces, where they are bound to glycoproteins and glycolipids. Their attachment to proteins is among the most common post-translational modifications in eukaryotic cells. Glycans play important roles in protein folding, cell-cell interactions, and other extracellular processes, and changes in glycan structures may influence the course of diseases such as infections or cancer. Glycans are commonly represented using the IUPAC-condensed notation, a textual representation that operates on the same topological level as the Symbol Nomenclature for Glycans (SNFG). SNFG assigns colored geometrical shapes to the main monomers and connects these symbols in tree-like structures, visualizing the glycan on a topological level. A representation on the atomic level, however, requires a notation such as SMILES. To our knowledge, there is no easy-to-use, general, open-source, and offline tool to convert the IUPAC-condensed notation to SMILES. Here, we present the open-access Python package GlyLES for the generalizable generation of SMILES representations from IUPAC-condensed representations. GlyLES uses a grammar to read the monomer tree from the IUPAC-condensed notation. From this tree, the tool computes the atomic structure of each monomer based on its IUPAC-condensed description. In the last step, it merges all monomers into the atomic structure of the glycan in SMILES notation. GlyLES is the first package that allows conversion from the IUPAC-condensed notation of glycans to SMILES strings. This may have multiple applications, including straightforward visualization, substructure search, molecular modeling and docking, and a new featurization strategy for machine-learning algorithms. GlyLES is available at https://github.com/kalininalab/GlyLES.
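The first step described in the abstract, reading a monomer tree from an IUPAC-condensed string, can be illustrated with a minimal sketch. This is not GlyLES's grammar or API: the token pattern, the `Node` class, and the `parse`, `attach`, and `count` helpers are hypothetical names chosen for illustration, only simple linkages such as `(a1-4)` are recognized, and the later steps (building each monomer's atoms and merging them into a SMILES string) are omitted. The sketch relies only on the convention that the rightmost monomer of the string is the root of the tree and that square brackets enclose side branches.

```python
import re
from dataclasses import dataclass, field

# Hypothetical token pattern: a linkage like "(a1-4)", a branch bracket, or a monomer name.
TOKEN = re.compile(r"\([ab?]\d-\d\)|\[|\]|[A-Za-z0-9]+")


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # (linkage, Node) pairs


def tokenize(text):
    tokens = TOKEN.findall(text)
    if "".join(tokens) != text:
        raise ValueError(f"unsupported characters in {text!r}")
    return tokens


def attach(parent, tokens):
    """Pop a linkage from the right, then parse everything left of it as a child."""
    if not tokens or not tokens[-1].startswith("("):
        raise ValueError("expected a linkage")
    linkage = tokens.pop()
    parent.children.append((linkage, parse_subtree(tokens)))


def parse_subtree(tokens):
    """Consume tokens from the right; the rightmost monomer is the subtree root."""
    if not tokens or tokens[-1][0] in "([]":
        raise ValueError("expected a monomer")
    node = Node(tokens.pop())
    while tokens:
        if tokens[-1] == "]":
            # A bracketed side branch: locate the matching "[" and parse its content.
            tokens.pop()
            depth, i = 1, len(tokens)
            while depth:
                i -= 1
                if i < 0:
                    raise ValueError("unbalanced brackets")
                depth += {"]": 1, "[": -1}.get(tokens[i], 0)
            branch = tokens[i + 1:]
            del tokens[i:]
            attach(node, branch)
        else:
            # The main chain: all remaining tokens belong to this child.
            attach(node, tokens)
    return node


def parse(text):
    return parse_subtree(tokenize(text))


def count(node):
    """Number of monomers in the tree."""
    return 1 + sum(count(child) for _, child in node.children)


tree = parse("Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc")
print(tree.name, count(tree))  # GlcNAc 5
```

Parsing from the right keeps the recursion simple because each monomer's children all appear to its left in the string; a production grammar would also need to handle modifications, unknown linkages, and ring or configuration annotations, which this sketch rejects or ignores.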
Identifiers
pubmed: 36959676
doi: 10.1186/s13321-023-00704-0
pii: 10.1186/s13321-023-00704-0
pmc: PMC10035253
Publication types
Journal Article
Languages
eng
Pagination
37
Copyright information
© 2023. The Author(s).