Knowledge Graphs for Indication Expansion: An Explainable Target-Disease Prediction Method.

drug discovery; knowledge graphs; ontologies; target repositioning; target repurposing

Journal

Frontiers in genetics
ISSN: 1664-8021
Abbreviated title: Front Genet
Country: Switzerland
NLM ID: 101560621

Publication information

Publication date:
2022
History:
received: 2021-11-12
accepted: 2022-01-28
entrez: 2022-04-01
pubmed: 2022-04-02
medline: 2022-04-02
Status: epublish

Abstract

Indication expansion aims to find new indications for existing targets in order to accelerate bringing a new drug for a disease to market. The rapid increase in data types and data sources for computational drug discovery has fostered the use of semantic knowledge graphs (KGs) for indication expansion through target-centric approaches, in other words, target repositioning. Previously, we developed a novel method to construct a KG for indication expansion studies, with the aim of finding and justifying alternative indications for a target gene of interest. In contrast to other KGs, ours combines human-curated full-text literature and gene expression data from biomedical databases to encode relationships between genes, diseases, and tissues. Here, we assessed the suitability of our KG for explainable target-disease link prediction using a glass-box approach. To evaluate the predictive power of our KG, we applied two families of prediction methods, shortest path with tissue information and embedding-based, to a graph constructed with information published before or during 2010. We also obtained random baselines by applying the shortest path methods to KGs with randomly shuffled node labels. Then, we evaluated the accuracy of the top predictions using gene-disease links reported after 2010. In addition, we investigated the contribution of the KG's tissue expression entity to the prediction performance. Our experiments showed that shortest path-based methods significantly outperform the random baselines, and that embedding-based methods outperform the shortest path predictions. Importantly, removing the tissue expression entity from the KG severely impacts the quality of the predictions, especially those produced by the embedding approaches. Finally, since the interpretability of the predictions is crucial in indication expansion, we highlight the advantages of our glass-box model through the examination of example candidate target-disease predictions.
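
The evaluation design described in the abstract (predict from a graph frozen at 2010, score against links reported later, compare with a shuffled-label baseline) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the node names, the toy edges, the hold-out links, and the use of plain breadth-first shortest-path length as the ranking score. It is not the authors' scoring function, which additionally uses tissue information, and it does not cover the embedding-based methods.

```python
from collections import deque
from itertools import product
import random

# Hypothetical toy KG: genes (G*), tissues (T*), diseases (D*).
GENES = ["G1", "G2", "G3"]
TISSUES = ["T1", "T2"]
DISEASES = ["D1", "D2", "D3"]

# Hypothetical edges known up to and including 2010.
EDGES_UP_TO_2010 = [
    ("G1", "T1"), ("T1", "D1"), ("G1", "D2"),
    ("G2", "T2"), ("T2", "D3"), ("G2", "D1"),
    ("G3", "T2"),
]
# Hypothetical gene-disease links first reported after 2010 (hold-out set).
LINKS_AFTER_2010 = {("G1", "D1"), ("G3", "D3")}


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_length(adj, source, target):
    """Breadth-first search; returns None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def rank_candidates(edges):
    """Rank gene-disease pairs without a direct edge; shorter paths first."""
    adj = build_adjacency(edges)
    scored = []
    for gene, disease in product(GENES, DISEASES):
        if disease in adj.get(gene, ()):
            continue  # link already known in the training graph
        dist = shortest_path_length(adj, gene, disease)
        if dist is not None:
            scored.append((dist, gene, disease))
    scored.sort()  # ties are broken alphabetically by gene, then disease
    return [(gene, disease) for _, gene, disease in scored]


def precision_at_k(ranked, truth, k):
    """Fraction of the top-k predictions found in the hold-out set."""
    return sum(pair in truth for pair in ranked[:k]) / k


def shuffle_labels(edges, seed):
    """Permute node labels within each node type (an assumed shuffling scheme)."""
    rng = random.Random(seed)
    mapping = {}
    for group in (GENES, TISSUES, DISEASES):
        permuted = group[:]
        rng.shuffle(permuted)
        mapping.update(zip(group, permuted))
    return [(mapping[u], mapping[v]) for u, v in edges]


ranked = rank_candidates(EDGES_UP_TO_2010)
observed = precision_at_k(ranked, LINKS_AFTER_2010, k=3)

# Random baseline: same procedure on label-shuffled copies of the graph.
baseline_scores = [
    precision_at_k(rank_candidates(shuffle_labels(EDGES_UP_TO_2010, seed)),
                   LINKS_AFTER_2010, k=3)
    for seed in range(100)
]
mean_baseline = sum(baseline_scores) / len(baseline_scores)

print("top-3 predictions:", ranked[:3])
print("precision@3:", observed)
print("mean shuffled-label precision@3:", mean_baseline)
```

On this toy graph the three top-ranked pairs all sit at distance two through a shared tissue node, and two of them appear in the hold-out set. The connecting path itself (for example gene, tissue, disease) is what makes such a prediction inspectable, which is the sense in which the abstract calls the approach glass-box.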

Identifiers

pubmed: 35360842
doi: 10.3389/fgene.2022.814093
pii: 814093
pmc: PMC8963915

Publication types

Journal Article

Languages

eng

Pagination

814093

Copyright information

Copyright © 2022 Gurbuz, Alanis-Lobato, Picart-Armada, Sun, Haslinger, Lawless and Fernandez-Albert.

Conflict of interest statement

OG, GA-L, SP-A, NL, and FF-A are employees of Boehringer Ingelheim Pharma GmbH & Co. KG. MS and CH were employed by Boehringer Ingelheim Pharma GmbH & Co. KG at the time of the study.

Authors

Ozge Gurbuz (O)

Discovery Research Coordination Germany, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany.

Gregorio Alanis-Lobato (G)

Global Computational Biology and Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany.

Sergio Picart-Armada (S)

Global Computational Biology and Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany.

Miao Sun (M)

Global Computational Biology and Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany.

Christian Haslinger (C)

Global Computational Biology and Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany.

Nathan Lawless (N)

Global Computational Biology and Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany.

Francesc Fernandez-Albert (F)

Global Computational Biology and Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany.
