Mapping ICD-10 and ICD-10-CM Codes to Phecodes: Workflow Development and Initial Evaluation.

Keywords: data science; electronic health record; genome-wide association study; medical informatics applications; phenome-wide association study; phenotyping

Journal

JMIR medical informatics
ISSN: 2291-9694
Abbreviated title: JMIR Med Inform
Country: Canada
NLM ID: 101645109

Publication information

Publication date:
29 Nov 2019
History:
received: 09 Apr 2019
revised: 03 Aug 2019
accepted: 24 Sep 2019
pubmed: 26 Sep 2019
medline: 26 Sep 2019
entrez: 26 Sep 2019
Status: epublish

Abstract

Background: The phecode system was built upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (PheWAS) using the electronic health record (EHR).
Objective: The goal of this paper was to develop and perform an initial evaluation of maps from the International Classification of Diseases, 10th Revision (ICD-10) and the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) codes to phecodes.
Methods: We mapped ICD-10 and ICD-10-CM codes to phecodes using a number of methods and resources, such as concept relationships and explicit mappings from the Centers for Medicare & Medicaid Services, the Unified Medical Language System, Observational Health Data Sciences and Informatics, Systematized Nomenclature of Medicine-Clinical Terms, and the National Library of Medicine. We assessed the coverage of the maps in two databases: Vanderbilt University Medical Center (VUMC) using ICD-10-CM and the UK Biobank (UKBB) using ICD-10. We assessed the fidelity of the ICD-10-CM map in comparison to the gold-standard ICD-9-CM phecode map by investigating phenotype reproducibility and conducting a PheWAS.
Results: We mapped >75% of ICD-10 and ICD-10-CM codes to phecodes. Of the unique codes observed in the UKBB (ICD-10) and VUMC (ICD-10-CM) cohorts, >90% were mapped to phecodes. We observed 70-75% reproducibility for chronic diseases and <10% for an acute disease for phenotypes sourced from the ICD-10-CM phecode map. Using the ICD-9-CM and ICD-10-CM maps, we conducted a PheWAS with a Lipoprotein(a) genetic variant, rs10455872, which replicated two known genotype-phenotype associations with similar effect sizes: coronary atherosclerosis (ICD-9-CM: P<.001; odds ratio (OR) 1.60 [95% CI 1.43-1.80] vs ICD-10-CM: P<.001; OR 1.60 [95% CI 1.43-1.80]) and chronic ischemic heart disease (ICD-9-CM: P<.001; OR 1.56 [95% CI 1.35-1.79] vs ICD-10-CM: P<.001; OR 1.47 [95% CI 1.22-1.77]).
Conclusions: This study introduces the beta versions of ICD-10 and ICD-10-CM to phecode maps that enable researchers to leverage accumulated ICD-10 and ICD-10-CM data for PheWAS in the EHR.
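
The coverage assessment described in the abstract (the share of unique observed codes that map to at least one phecode) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the normalization step, and the toy map entries below are assumptions made for illustration, and any real analysis should load the published map files instead.

```python
from typing import Dict, Iterable, Set


def normalize(code: str) -> str:
    """Strip whitespace and upper-case a code (an assumed cleaning step)."""
    return code.strip().upper()


def coverage(observed: Iterable[str], code_to_phecodes: Dict[str, Set[str]]) -> float:
    """Fraction of unique observed codes that map to at least one phecode."""
    unique = {normalize(c) for c in observed}
    if not unique:
        return 0.0
    mapped = {c for c in unique if code_to_phecodes.get(c)}
    return len(mapped) / len(unique)


# Toy map for illustration only; entries are not copied from the published map.
toy_map: Dict[str, Set[str]] = {
    "I25.10": {"411.4"},
    "E11.9": {"250.2"},
}

# Hypothetical codes observed in a cohort, including a duplicate and an unmapped code.
observed_codes = ["I25.10", "i25.10 ", "E11.9", "Z00.00"]

share = coverage(observed_codes, toy_map)
print(f"{share:.1%} of unique observed codes map to a phecode")
```

Counting unique codes rather than code occurrences mirrors the abstract's wording ("of the unique codes observed"); weighting by occurrence would answer a different question, namely how much of the recorded billing volume is covered.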

Identifiers

pubmed: 31553307
pii: v7i4e14325
doi: 10.2196/14325
pmc: PMC6911227

Publication types

Journal Article

Languages

eng

Pagination

e14325

Grants

Agency: Cancer Research UK
ID: 22804
Country: United Kingdom
Agency: NHLBI NIH HHS
ID: R01 HL133786
Country: United States
Agency: NLM NIH HHS
ID: R01 LM010685
Country: United States
Agency: NIGMS NIH HHS
ID: T32 GM007347
Country: United States

Copyright information

©Patrick Wu, Aliya Gifford, Xiangrui Meng, Xue Li, Harry Campbell, Tim Varley, Juan Zhao, Robert Carroll, Lisa Bastarache, Joshua C Denny, Evropi Theodoratou, Wei-Qi Wei. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 29.11.2019.

Authors

Patrick Wu (P)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.
Medical Scientist Training Program, Vanderbilt University School of Medicine, Nashville, TN, United States.

Aliya Gifford (A)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.

Xiangrui Meng (X)

Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom.

Xue Li (X)

Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom.

Harry Campbell (H)

Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom.

Tim Varley (T)

Public Health and Intelligence Strategic Business Unit, National Services Scotland, Edinburgh, United Kingdom.

Juan Zhao (J)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.

Robert Carroll (R)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.

Lisa Bastarache (L)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.

Joshua C Denny (JC)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.
Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States.

Evropi Theodoratou (E)

Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom.
Edinburgh Cancer Research Centre, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom.

Wei-Qi Wei (WQ)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.

MeSH classifications