Predicting protein network topology clusters from chemical structure using deep learning.
Deep learning
Drug discovery
Machine learning
Network topology
Neural networks
Journal
Journal of cheminformatics
ISSN: 1758-2946
Abbreviated title: J Cheminform
Country: England
NLM ID: 101516718
Publication information
Publication date:
15 Jul 2022
History:
received: 11 Nov 2021
accepted: 06 Jun 2022
entrez: 15 Jul 2022
pubmed: 16 Jul 2022
medline: 16 Jul 2022
Status:
epublish
Abstract
Comparing chemical structures to infer protein targets and functions is a common approach, but basing comparisons on chemical similarity alone can be misleading. Here we present a methodology for predicting target protein clusters using deep neural networks. The model is trained on clusters of compounds based on similarities calculated from combined compound-protein and protein-protein interaction data using a network topology approach. We compare several deep learning architectures, including both convolutional and recurrent neural networks. The best-performing method, the recurrent neural network architecture MolPMoFiT, achieved an F1 score approaching 0.9 on a held-out test set of 8907 compounds. In addition, an in-depth analysis of eleven well-studied chemical compounds with known functions showed that the predictions were justifiable for all but one of the chemicals. Four of the compounds, similar in molecular structure but dissimilar in function, revealed advantages of our method over using chemical similarity alone.
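The abstract reports an F1 score over predicted cluster labels but does not say how it was averaged across clusters. As an illustration only, the sketch below computes a macro-averaged F1 in plain Python. The macro averaging, the function names (`per_class_f1`, `macro_f1`) and the toy labels are assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct cluster assignment
        else:
            fp[p] += 1   # predicted cluster gains a false positive
            fn[t] += 1   # true cluster gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical toy data: six compounds assigned to three clusters.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 1, 2, 2, 2]
toy_score = macro_f1(toy_true, toy_pred)
```

Macro averaging weights every cluster equally regardless of size. If the reported score was micro-averaged or weighted by cluster size, the figure would be computed differently.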
Identifiers
pubmed: 35841114
doi: 10.1186/s13321-022-00622-7
pii: 10.1186/s13321-022-00622-7
pmc: PMC9284831
Publication types
Journal Article
Languages
eng
Pagination
47
Grants
Agency: Svenska Forskningsrådet Formas
ID: grant 2018-00924
Agency: Swedish Research Council
ID: grants 2020-03731 and 2020-01865
Agency: Svenska Forskningsrådet Formas
ID: grant 2020-01267
Copyright information
© 2022. The Author(s).