PGxMine: Text mining for curation of PharmGKB.
Journal
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
ISSN: 2335-6936
Abbreviated title: Pac Symp Biocomput
Country: United States
NLM ID: 9711271
Publication information
Publication date:
2020
History:
entrez: 2019-12-05
pubmed: 2019-12-05
medline: 2021-02-16
Status:
ppublish
Abstract
Precision medicine tailors treatment to individuals' personal data, including differences in their genome. The Pharmacogenomics Knowledgebase (PharmGKB) provides highly curated information on the effect of genetic variation on drug response and side effects for a wide range of drugs. PharmGKB's scientific curators triage, review and annotate a large number of papers each year, but the task is challenging. We present the PGxMine resource, a text-mined resource of pharmacogenomic associations from all accessible published literature to assist in the curation of PharmGKB. We developed a supervised machine learning pipeline to extract associations between a variant (DNA and protein changes, star alleles and dbSNP identifiers) and a chemical. PGxMine covers 452 chemicals and 2,426 variants and contains 19,930 mentions of pharmacogenomic associations across 7,170 papers. An evaluation by PharmGKB curators found that 57 of the top 100 associations not found in PharmGKB led to 83 curatable papers and a further 24 associations would likely lead to curatable papers through citations. The results can be viewed at https://pgxmine.pharmgkb.org/ and code can be downloaded at https://github.com/jakelever/pgxmine.
Identifiers
pubmed: 31797632
pii: 9789811215636_0054
pmc: PMC6917032
mid: NIHMS1061502
Publication types
Journal Article
Research Support, N.I.H., Extramural
Languages
eng
Citation subsets
IM
Pagination
611-622
Grants
Agency: NCATS NIH HHS
ID: OT2 TR002515
Country: United States
Agency: NLM NIH HHS
ID: R01 LM005652
Country: United States
Agency: NIGMS NIH HHS
ID: R24 GM061374
Country: United States