Mapping ICD-10 and ICD-10-CM Codes to Phecodes: Workflow Development and Initial Evaluation.
data science
electronic health record
genome-wide association study
medical informatics applications
phenome-wide association study
phenotyping
Journal
JMIR medical informatics
ISSN: 2291-9694
Abbreviated title: JMIR Med Inform
Country: Canada
NLM ID: 101645109
Publication information
Publication date:
29 Nov 2019
History:
received: 09 Apr 2019
accepted: 24 Sep 2019
revised: 03 Aug 2019
pubmed: 26 Sep 2019
medline: 26 Sep 2019
entrez: 26 Sep 2019
Status:
epublish
Abstract
Abstract sections
BACKGROUND
The phecode system was built upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (PheWAS) using the electronic health record (EHR).
OBJECTIVE
The goal of this paper was to develop and perform an initial evaluation of maps from the International Classification of Diseases, 10th Revision (ICD-10) and the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) codes to phecodes.
METHODS
We mapped ICD-10 and ICD-10-CM codes to phecodes using a number of methods and resources, such as concept relationships and explicit mappings from the Centers for Medicare & Medicaid Services, the Unified Medical Language System, Observational Health Data Sciences and Informatics, Systematized Nomenclature of Medicine-Clinical Terms, and the National Library of Medicine. We assessed the coverage of the maps in two databases: Vanderbilt University Medical Center (VUMC) using ICD-10-CM and the UK Biobank (UKBB) using ICD-10. We assessed the fidelity of the ICD-10-CM map in comparison to the gold-standard ICD-9-CM phecode map by investigating phenotype reproducibility and conducting a PheWAS.
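The coverage assessment described above can be sketched as a lookup against a code-to-phecode table followed by a count of the unique observed codes that have an entry. This is a minimal illustration, not the authors' workflow: the function names (`normalize`, `map_code`, `coverage`) are hypothetical, and the entries in `TOY_MAP` are placeholders with made-up phecode labels rather than rows from the published maps.

```python
from typing import Dict, Iterable, Optional


def normalize(code: str) -> str:
    """Trim whitespace and upper-case a raw ICD code string."""
    return code.strip().upper()


def map_code(code: str, code_to_phecode: Dict[str, str]) -> Optional[str]:
    """Return the phecode for an ICD code, or None if the code is unmapped."""
    return code_to_phecode.get(normalize(code))


def coverage(observed_codes: Iterable[str], code_to_phecode: Dict[str, str]) -> float:
    """Fraction of unique observed codes that have a phecode mapping."""
    unique = {normalize(c) for c in observed_codes}
    if not unique:
        return 0.0
    mapped = sum(1 for c in unique if c in code_to_phecode)
    return mapped / len(unique)


# Placeholder map: the phecode labels are invented for illustration only.
TOY_MAP = {
    "I25.10": "phecode_A",
    "E11.9": "phecode_B",
    "J45.909": "phecode_C",
}

# Hypothetical codes as they might appear in a cohort extract.
observed = ["i25.10", "E11.9 ", "E11.9", "Z00.00"]

if __name__ == "__main__":
    print(f"coverage of unique observed codes: {coverage(observed, TOY_MAP):.2f}")
```

Counting unique codes rather than code occurrences mirrors how the abstract phrases its coverage result ("of the unique codes observed"); a per-occurrence variant would weight frequently used codes more heavily.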
RESULTS
We mapped >75% of ICD-10 and ICD-10-CM codes to phecodes. Of the unique codes observed in the UKBB (ICD-10) and VUMC (ICD-10-CM) cohorts, >90% were mapped to phecodes. We observed 70-75% reproducibility for chronic diseases and <10% for an acute disease for phenotypes sourced from the ICD-10-CM phecode map. Using the ICD-9-CM and ICD-10-CM maps, we conducted a PheWAS with a Lipoprotein(a) genetic variant, rs10455872, which replicated two known genotype-phenotype associations with similar effect sizes: coronary atherosclerosis (ICD-9-CM: P<.001; odds ratio (OR) 1.60 [95% CI 1.43-1.80] vs ICD-10-CM: P<.001; OR 1.60 [95% CI 1.43-1.80]) and chronic ischemic heart disease (ICD-9-CM: P<.001; OR 1.56 [95% CI 1.35-1.79] vs ICD-10-CM: P<.001; OR 1.47 [95% CI 1.22-1.77]).
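One rough way to read "similar effect sizes" from the figures reported above is to recover the standard error of each log odds ratio from its 95% CI and compare the two estimates on the log scale. The sketch below is not the authors' analysis. It assumes the intervals are Wald intervals (estimate ± 1.96 standard errors on the log scale) and treats the two estimates as independent, which may not hold if the ICD-9-CM and ICD-10-CM cohorts share individuals. The helper names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile assumed to underlie the reported 95% CIs


def log_or_se(ci_low: float, ci_high: float) -> float:
    """Standard error of log(OR) implied by a 95% Wald confidence interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)


def z_difference(or_a: float, ci_a: tuple, or_b: float, ci_b: tuple) -> float:
    """z statistic for the difference of two log odds ratios.

    Assumes the two estimates are independent (an assumption, see lead-in).
    """
    se_a = log_or_se(*ci_a)
    se_b = log_or_se(*ci_b)
    diff = math.log(or_a) - math.log(or_b)
    return diff / math.sqrt(se_a ** 2 + se_b ** 2)


# Values copied from the RESULTS text above (ICD-9-CM vs ICD-10-CM).
z_chronic_ihd = z_difference(1.56, (1.35, 1.79), 1.47, (1.22, 1.77))
z_coronary = z_difference(1.60, (1.43, 1.80), 1.60, (1.43, 1.80))

if __name__ == "__main__":
    print(f"chronic ischemic heart disease: z = {z_chronic_ihd:.2f}")
    print(f"coronary atherosclerosis:       z = {z_coronary:.2f}")
```

A z statistic well inside ±1.96 is compatible with the two maps yielding the same underlying effect, under the stated assumptions.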
CONCLUSIONS
This study introduces the beta versions of ICD-10 and ICD-10-CM to phecode maps that enable researchers to leverage accumulated ICD-10 and ICD-10-CM data for PheWAS in the EHR.
Identifiers
pubmed: 31553307
pii: v7i4e14325
doi: 10.2196/14325
pmc: PMC6911227
Publication types
Journal Article
Languages
eng
Pagination
e14325
Grants
Agency: Cancer Research UK
ID: 22804
Country: United Kingdom
Agency: NHLBI NIH HHS
ID: R01 HL133786
Country: United States
Agency: NLM NIH HHS
ID: R01 LM010685
Country: United States
Agency: NIGMS NIH HHS
ID: T32 GM007347
Country: United States
Copyright information
©Patrick Wu, Aliya Gifford, Xiangrui Meng, Xue Li, Harry Campbell, Tim Varley, Juan Zhao, Robert Carroll, Lisa Bastarache, Joshua C Denny, Evropi Theodoratou, Wei-Qi Wei. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 29.11.2019.