Identification of pharmacodynamic biomarker hypotheses through literature analysis with IBM Watson.
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date: 2019
History:
received: 2018-12-21
accepted: 2019-03-16
entrez: 2019-04-09
pubmed: 2019-04-09
medline: 2019-12-24
Status: epublish
Abstract
BACKGROUND
Pharmacodynamic biomarkers are becoming increasingly valuable for assessing drug activity and target modulation in clinical trials. However, identifying quality biomarkers is challenging due to the increasing volume and heterogeneity of relevant data describing the biological networks that underlie disease mechanisms. A biological pathway network typically includes entities (e.g., genes, proteins, and chemicals/drugs) and the relationships between them, and is usually curated or mined from structured databases and textual co-occurrence data. We propose a hybrid approach that combines natural language processing with directed, relationship-based network analysis, using IBM Watson for Drug Discovery to rank all human genes and identify candidate biomarkers; it requires only an initial determination of a specific target-disease relationship.
METHODS
Through natural language processing of scientific literature, Watson for Drug Discovery creates a network of semantic relationships between biological concepts such as genes, drugs, and diseases. Using Bruton's tyrosine kinase (BTK) as a case study, Watson for Drug Discovery's automatically extracted relationship network was compared with a prominent manually curated physical interaction network. Additionally, potential biomarkers for BTK inhibition were predicted using a matrix factorization approach and subsequently compared with expert-generated biomarkers.
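The abstract does not say which matrix factorization was used, so the following is only a minimal sketch of the general idea, assuming a truncated SVD as the factorization. The gene names, the association matrix, and the `rank_genes` helper are all hypothetical and are not taken from the study or from Watson for Drug Discovery.

```python
import numpy as np

# Hypothetical toy data: rows are genes, columns are concepts (for example a
# target, a disease, and a drug). The values stand in for literature-derived
# association strengths; none of these numbers come from the study.
GENES = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
ASSOC = np.array([
    [5.0, 4.0, 0.0],
    [4.0, 5.0, 0.0],
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
])


def rank_genes(assoc, genes, target_col, rank):
    """Rank genes by their low-rank reconstructed score for one concept column.

    Assumption: a truncated SVD is used as the factorization; the method
    actually used in the paper is not described in the abstract.
    """
    u, s, vt = np.linalg.svd(assoc, full_matrices=False)
    # Keep only the first `rank` latent factors and rebuild the matrix.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    scores = approx[:, target_col]
    # Sort genes from the highest to the lowest reconstructed score.
    order = np.argsort(-scores, kind="stable")
    return [(genes[i], float(scores[i])) for i in order]


# Rank all toy genes against the first concept using two latent factors.
ranking = rank_genes(ASSOC, GENES, target_col=0, rank=2)
```

Scoring from the low-rank reconstruction rather than from the raw matrix lets a gene with little direct evidence for the target still rank highly when its overall association pattern resembles that of strongly linked genes.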
RESULTS
Watson's natural language processing generated a relationship network that matched 55 (86%) of the genes upstream of Bruton's tyrosine kinase (BTK) and 98 (95%) of the genes downstream of BTK in a prominent manually curated physical interaction network. Matrix factorization analysis placed 11 of the 13 genes identified by Merck subject matter experts in the top 20% of Watson for Drug Discovery's 13,595 ranked genes, with 7 in the top 5%.
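To make the ranking figures concrete, the short calculation below converts the reported percentages into rank positions and recovery fractions. It uses only the counts stated above; the function and variable names are illustrative, and rounding the cutoff down is an assumption, since the abstract does not say how percentile boundaries were handled.

```python
TOTAL_RANKED = 13_595   # ranked genes reported in the results
EXPERT_GENES = 13       # genes identified by the subject matter experts


def rank_cutoff(total, percent):
    """Last rank position inside the top `percent` of a ranked list.

    Integer arithmetic avoids floating-point surprises; rounding down is an
    assumption, not something stated in the abstract.
    """
    return total * percent // 100


top20_cutoff = rank_cutoff(TOTAL_RANKED, 20)
top5_cutoff = rank_cutoff(TOTAL_RANKED, 5)

# Fraction of the expert-identified genes recovered within each cutoff.
recovered_top20 = 11 / EXPERT_GENES
recovered_top5 = 7 / EXPERT_GENES

print(f"top 20% = ranks 1 to {top20_cutoff}, recovering {recovered_top20:.1%} of expert genes")
print(f"top 5%  = ranks 1 to {top5_cutoff}, recovering {recovered_top5:.1%} of expert genes")
```

Under these assumptions, about 85% of the expert-identified genes fall within the first 2,719 ranks and about 54% within the first 679.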
CONCLUSION
Taken together, these results suggest that Watson for Drug Discovery's automatic relationship network identifies the majority of upstream and downstream genes in biological pathway networks and can help identify and prioritize pharmacodynamic biomarkers for evaluation, accelerating the early phases of disease hypothesis generation.
Identifiers
pubmed: 30958864
doi: 10.1371/journal.pone.0214619
pii: PONE-D-18-36562
pmc: PMC6453528
Chemical substances
Biomarkers (registry number: 0)
Small Molecule Libraries (registry number: 0)
Agammaglobulinaemia Tyrosine Kinase (registry number: EC 2.7.10.2)
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e0214619
Conflict of interest statement
Funding for this study was provided by Merck KGaA, Darmstadt, Germany (https://www.emdgroup.com). Sonja Hatz, Philipp Haselmayer, Harsha Gurulingappa, and Ulrich Betz are employed by Merck KGaA. Andrew Bender and Matthew Studham are employed by EMD Serono. Scott Spangler, Alix Lacoste, Van C Willis, and Richard L Martin were all employed by IBM Watson Health during the time this research was conducted. None of these interests affected the choice of what and where to publish, but the topic (pharmacodynamic biomarkers for BTK therapy) was of interest to Merck, as was learning what IBM's techniques showed in this case. Watson for Drug Discovery is a product of IBM Watson Health. There are no patents, products in development, or other marketed products to declare. This does not alter our adherence to all the PLOS ONE policies on data and materials.