Drug Target Identification with Machine Learning: How to Choose Negative Examples.

Computational Biology / methods; Drug Discovery / methods; Humans; Machine Learning; Pharmaceutical Preparations / chemistry; Protein Interaction Mapping; Proteins / chemistry; Software; Support Vector Machine

chemogenomic; drug discovery; false positive predictions; learning bias; machine learning; negative examples; random forests; support vector machines; target identification

Journal

International Journal of Molecular Sciences

ISSN: 1422-0067

Abbreviated title: Int J Mol Sci

Country: Switzerland

NLM ID: 101092791

Publication information

Publication date:
12 May 2021

History:

received: 29 March 2021

revised: 30 April 2021

accepted: 7 May 2021

entrez: 2 June 2021

pubmed: 3 June 2021

medline: 11 June 2021

Status: epublish

Abstract

Identifying the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can accelerate this search by limiting the number of experiments required. However, the drug-target interaction databases used for training are statistically biased, which leads to many false positive predictions and thus increases the time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, in which each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs and, more globally, for 200 approved drugs. In both the three detailed examples and the larger set of 200 drugs, training with the proposed scheme improved target prediction results: the average number of false positives among the top-ranked predicted targets decreased and, overall, the rank of the true targets improved. Our method corrects the statistical bias of the databases and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken.
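
The balancing constraint stated in the abstract (each drug and each protein appears as often in negative examples as in positive ones) can be illustrated with a minimal Python sketch. This is not the authors' published algorithm: the function name `balanced_negatives`, the greedy ordering, and the toy drug and protein labels are all assumptions made for illustration. The greedy pass can fall short of exact balance on some inputs, in which case the affected drugs simply receive fewer negatives than positives.

```python
from collections import Counter


def balanced_negatives(positives):
    """Greedy sketch (hypothetical helper, not the published method).

    Picks (drug, protein) pairs absent from `positives` so that, where
    possible, every drug and protein occurs as often as it does in positives.
    """
    known = set(positives)
    # Remaining number of negatives each drug / protein still needs.
    drug_quota = Counter(d for d, _ in known)
    prot_quota = Counter(p for _, p in known)
    negatives = []
    # Serve the drugs with the most known interactions first.
    for drug in sorted(drug_quota, key=lambda d: (-drug_quota[d], d)):
        while drug_quota[drug] > 0:
            candidates = [
                p for p, q in prot_quota.items()
                if q > 0 and (drug, p) not in known and (drug, p) not in negatives
            ]
            if not candidates:
                break  # quota cannot be met for this drug; balance is only partial
            # Prefer the protein with the largest unmet quota (ties broken by name).
            protein = min(candidates, key=lambda p: (-prot_quota[p], p))
            negatives.append((drug, protein))
            drug_quota[drug] -= 1
            prot_quota[protein] -= 1
    return negatives


# Toy data (made up for illustration): d1 and p1 are over-represented "hubs".
positives = [("d1", "p1"), ("d1", "p2"), ("d2", "p1"), ("d3", "p3"), ("d4", "p4")]
negatives = balanced_negatives(positives)
print(negatives)
```

On this toy input every drug and protein ends up with the same count in both sets, so a classifier trained on them cannot separate positives from negatives merely by how frequently a drug or protein occurs in the data.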

Identifiers

DOI: 10.3390/ijms22105118

PMID: 34066072

PMC: PMC8151112

PII: ijms22105118

Chemical substances

Pharmaceutical Preparations 0

Proteins 0

Publication types

Journal Article

Languages

eng

Grants

Agency: Vaincre la Mucoviscidose

ID: RF20190502488

Authors

Matthieu Najm (M)

Chloé-Agathe Azencott (CA)

Benoit Playe (B)

Véronique Stoven (V)
