Industry-scale application and evaluation of deep learning for drug target prediction.
Big data
ChEMBL
Cheminformatics
Deep learning
Machine learning
Prospective evaluation
PubChem
QSAR
Retrospective evaluation
Structure-based virtual screening
Journal
Journal of cheminformatics
ISSN: 1758-2946
Abbreviated title: J Cheminform
Country: England
NLM ID: 101516718
Publication information
Publication date:
19 Apr 2020
History:
received: 29 Oct 2019
accepted: 30 Mar 2020
entrez: 12 Jan 2021
pubmed: 13 Jan 2021
medline: 13 Jan 2021
Status:
epublish
Abstract
Artificial intelligence (AI) is undergoing a revolution thanks to breakthroughs of machine learning algorithms in computer vision, speech recognition, natural language processing and generative modelling. Recent work on publicly available pharmaceutical data showed that AI methods are highly promising for drug target prediction. However, the quality of public data may differ from that of industry data because of different labs reporting measurements, different measurement techniques, fewer samples, and less diverse and specialized assays. As part of a European-funded project (ExCAPE) that brought together expertise from the pharmaceutical industry, machine learning, and high-performance computing, we investigated how well machine learning models obtained from public data can be transferred to internal pharmaceutical industry data. Our results show that machine learning models trained on public data can indeed maintain their predictive power to a large degree when applied to industry data. Moreover, we observed that models derived by deep learning outperformed comparable models trained by other machine learning algorithms when applied to internal pharmaceutical company datasets. To our knowledge, this is the first large-scale study to evaluate the potential of machine learning, and especially deep learning, directly in industry-scale settings, and to investigate the transferability of publicly learned target prediction models to industrial bioactivity prediction pipelines.
Identifiers
pubmed: 33430964
doi: 10.1186/s13321-020-00428-5
pii: 10.1186/s13321-020-00428-5
pmc: PMC7169028
Publication types
Journal Article
Languages
eng
Pagination
26
Grants
Agency: Horizon 2020 Framework Programme
ID: 671555
Agency: Large Infrastructures for Research, Experimental Development and Innovation
ID: IT4Innovation National Supercomputing Center - LM2015070