An application of topological data analysis in predicting sumoylation sites.
Feature extraction
Persistent homology
Sumoylation
Topological data analysis
Journal
PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425
Publication information
Publication date:
2023
History:
received: 2023-04-09
accepted: 2023-09-08
medline: 2023-10-23
pubmed: 2023-10-17
entrez: 2023-10-17
Status:
epublish
Abstract
Sumoylation is a reversible post-translational modification that regulates important biochemical functions of proteins. Protein alterations caused by sumoylation are associated with the incidence of some human diseases. Identifying sumoylation sites in proteins may therefore provide direction for mechanistic research and drug development. Here, we propose a new computational approach for identifying sumoylation sites using an encoding method based on topological data analysis. The features of our model capture key physical and biological properties of proteins at multiple scales. In 10-fold cross-validation, our model achieved a sensitivity (Sn) of 96.45%, an accuracy (Acc) of 94.65%, a Matthews correlation coefficient (MCC) of 0.8946, and an area under the curve (AUC) of 0.99. Using only topological features, the proposed predictor achieves the best MCC and AUC among the released methods it was compared with. Our results suggest that topological information is an additional parameter that can assist in the prediction of sumoylation sites and provide a novel perspective for further research in protein sumoylation.
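For readers unfamiliar with the reported metrics, the following is a minimal sketch of how sensitivity, accuracy, and MCC are conventionally computed from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical illustrations, not the authors' code or data; AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, accuracy, MCC) from confusion-matrix counts.

    tp/fn: positive (sumoylated) sites predicted correctly/incorrectly.
    tn/fp: negative sites predicted correctly/incorrectly.
    Hypothetical helper for illustration only.
    """
    # Sensitivity (Sn): fraction of true positive sites that are recovered.
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy (Acc): fraction of all sites classified correctly.
    acc = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient: ranges from -1 to 1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, acc, mcc


# Made-up counts, chosen only to exercise the formulas.
sn, acc, mcc = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"Sn={sn:.4f} Acc={acc:.4f} MCC={mcc:.4f}")
```

In a 10-fold cross-validation such as the one described, these quantities would typically be computed per fold or on the pooled predictions; the abstract does not state which convention was used.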
Identifiers
pubmed: 37846308
doi: 10.7717/peerj.16204
pii: 16204
pmc: PMC10576966
Chemical substances
Proteins
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e16204
Copyright information
©2023 Lin et al.
Conflict of interest statement
The authors declare there are no competing interests.