Linking rare and common disease vocabularies by mapping between the human phenotype ontology and phecodes.

Keywords: Mendelian genetics; electronic health record; medical phenome; phenotype ontology

Journal

JAMIA open
ISSN: 2574-2531
Abbreviated title: JAMIA Open
Country: United States
NLM ID: 101730643

Publication information

Publication date:
Apr 2023
History:
received: 18 Oct 2022
revised: 14 Dec 2022
accepted: 31 Jan 2023
entrez: 6 Mar 2023
pubmed: 7 Mar 2023
medline: 7 Mar 2023
Status: epublish

Abstract

Enabling discovery across the spectrum of rare and common diseases requires the integration of biological knowledge with clinical data; however, differences in terminologies present a major barrier. For example, the Human Phenotype Ontology (HPO) is the primary vocabulary for describing features of rare diseases, while most clinical encounters use International Classification of Diseases (ICD) billing codes. ICD codes are further organized into clinically meaningful phenotypes via phecodes. Despite their prevalence, no robust phenome-wide disease mapping between HPO and phecodes/ICD exists. Here, we synthesize evidence using diverse sources and methods, including text matching, the National Library of Medicine's Unified Medical Language System (UMLS), Wikipedia, SORTA, and PheMap, to define a mapping between phecodes and HPO terms via 38 950 links. We evaluate the precision and recall for each domain of evidence, both individually and jointly. This flexibility permits users to tailor the HPO-phecode links for diverse applications along the spectrum of monogenic to polygenic diseases.
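
As an illustration of the evaluation described in the abstract, the following is a minimal sketch of how HPO-phecode links from several evidence sources might be combined and scored for precision and recall. It is not the authors' implementation: the function names, the `min_sources` threshold, the toy identifiers, and the gold-standard set are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple

# A link pairs an HPO term with a phecode. The identifiers used below are
# placeholders invented for this sketch, not real HPO terms or phecodes.
Link = Tuple[str, str]


def precision_recall(predicted: Set[Link], gold: Set[Link]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted links against a gold standard."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


def links_with_support(evidence: Dict[str, Set[Link]], min_sources: int) -> Set[Link]:
    """Keep links proposed by at least `min_sources` evidence sources."""
    counts: Dict[Link, int] = {}
    for links in evidence.values():
        for link in links:
            counts[link] = counts.get(link, 0) + 1
    return {link for link, n in counts.items() if n >= min_sources}


# Hypothetical evidence sets, one per source (toy data only).
evidence = {
    "text_match": {("HP:A", "100"), ("HP:B", "200")},
    "umls": {("HP:A", "100"), ("HP:C", "300")},
    "wikipedia": {("HP:A", "100"), ("HP:B", "200"), ("HP:D", "400")},
}
# Hypothetical gold-standard links used only to score the toy data.
gold = {("HP:A", "100"), ("HP:B", "200"), ("HP:C", "300")}

# Score each source individually.
per_source = {name: precision_recall(links, gold) for name, links in evidence.items()}

# Score the sources jointly: any source (union) versus at least two sources.
union_links = links_with_support(evidence, min_sources=1)
strict_links = links_with_support(evidence, min_sources=2)
union_scores = precision_recall(union_links, gold)
strict_scores = precision_recall(strict_links, gold)
```

In this toy setting, raising `min_sources` trades recall for precision, which is one way a user could tailor the link set to an application.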

Identifiers

pubmed: 36875690
doi: 10.1093/jamiaopen/ooad007
pii: ooad007
pmc: PMC9976874

Publication types

Journal Article

Languages

eng

Pagination

ooad007

Grants

Agency: NIGMS NIH HHS
ID: R35 GM127087
Country: United States

Copyright information

© The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association.

Authors

Evonne McArthur (E)

Vanderbilt Genetics Institute, Vanderbilt University, Nashville, Tennessee, USA.

Lisa Bastarache (L)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

John A Capra (JA)

Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA.
Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California, USA.

MeSH classifications