Developing the Total Health Profile, a Generalizable Unified Set of Multimorbidity Risk Scores Derived From Machine Learning for Broad Patient Populations: Retrospective Cohort Study.

clinical informatics; clinical risk score; cohort; decision making; demographic; diagnostic; electronic health record; machine learning; morbidity; multimorbidity; outcome; outcome research; population data; prediction; risk

Journal

Journal of Medical Internet Research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882

Publication information

Publication date:
26 Nov 2021
History:
received: 13 Aug 2021
revised: 15 Sep 2021
accepted: 18 Sep 2021
entrez: 29 Nov 2021
pubmed: 30 Nov 2021
medline: 15 Dec 2021
Status: epublish

Abstract

BACKGROUND
Multimorbidity clinical risk scores allow clinicians to quickly assess their patients' health for decision making, often for recommendation to care management programs. However, these scores are limited by several issues: existing multimorbidity scores (1) are generally limited to one data group (eg, diagnoses, labs) and may be missing vital information, (2) are usually limited to specific demographic groups (eg, age), and (3) do not formally provide any granularity in the form of more nuanced multimorbidity risk scores to direct clinician attention.
OBJECTIVE
Using diagnosis, lab, prescription, procedure, and demographic data from electronic health records (EHRs), we developed a physiologically diverse and generalizable set of multimorbidity risk scores.
METHODS
Using EHR data from a nationwide cohort of patients, we developed the total health profile, a set of six integrated risk scores reflecting five distinct organ systems and overall health. We selected the occurrence of an inpatient hospital visitation over a 2-year follow-up window, attributable to specific organ systems, as our risk endpoint. Using a physician-curated set of features, we trained six machine learning models on 794,294 patients to predict the calibrated probability of the aforementioned endpoint, producing risk scores for heart, lung, neuro, kidney, and digestive functions and a sixth score for combined risk. We evaluated the scores using a held-out test cohort of 198,574 patients.
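
The methods describe models that output a calibrated probability of the endpoint. As a minimal sketch of how calibration could be checked on a held-out cohort, the following computes a Brier score and equal-width reliability bins. The function names (`brier_score`, `reliability_bins`), the bin count, and the toy data are assumptions for illustration only; they are not the authors' code or data.

```python
from typing import List, Tuple


def brier_score(y_true: List[int], y_prob: List[float]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def reliability_bins(
    y_true: List[int], y_prob: List[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed event rate, count)
    tuple per non-empty bin. For a well-calibrated model the first two
    values are close to each other in every bin.
    """
    acc = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of p, sum of y, count]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        acc[idx][0] += p
        acc[idx][1] += y
        acc[idx][2] += 1
    return [(s / n, e / n, n) for s, e, n in acc if n > 0]


# Toy data (hypothetical, not from the study): 1 = hospitalized within follow-up.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.1, 0.3, 0.3, 0.7, 0.7, 0.9, 0.9]

for mean_pred, observed, count in reliability_bins(y_true, y_prob):
    print(f"predicted={mean_pred:.2f} observed={observed:.2f} n={count}")
print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
```

Equal-width bins are only one possible choice; quantile bins are a common alternative when predicted risks are concentrated near zero, as is typical for hospitalization endpoints.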
RESULTS
Study patients closely matched national census averages, with a median age of 41 years, a median income of $66,829, and racial averages by zip code of 73.8% White, 5.9% Asian, and 11.9% African American. All models were well calibrated and demonstrated strong performance with areas under the receiver operating characteristic curve (AUROCs) of 0.83 for the total health score (THS), 0.89 for heart, 0.86 for lung, 0.84 for neuro, 0.90 for kidney, and 0.83 for digestive functions. There was consistent performance of this scoring system across sexes, diverse patient ages, and zip code income levels. Each model learned to generate predictions by focusing on appropriate clinically relevant patient features, such as heart-related hospitalizations and chronic hypertension diagnosis for the heart model. The THS outperformed the other commonly used multimorbidity scoring systems, specifically the Charlson Comorbidity Index (CCI) and the Elixhauser Comorbidity Index (ECI), overall (AUROCs: THS=0.823, CCI=0.735, ECI=0.649) as well as for every age, sex, and income bracket. Performance improvements were most pronounced for middle-aged and lower-income subgroups. Ablation tests that used the diagnosis, prescription, social determinants of health, and lab feature groups one at a time, while retaining procedure-related features, showed that the combination of all feature groups had the best predictive performance, though it was only marginally better than the diagnosis-only model on at-risk groups.
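
The results are reported as AUROCs. As a reminder of what that number measures, here is a minimal sketch that computes AUROC from the rank-sum (Mann-Whitney) formulation: the probability that a randomly chosen patient with the event receives a higher score than a randomly chosen patient without it, with ties counted as one half. The function name `auroc` and the toy inputs are assumptions for illustration and are not taken from the study.

```python
from typing import List


def auroc(y_true: List[int], y_score: List[float]) -> float:
    """AUROC via the rank-sum formulation, using average ranks for tied scores."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores pairs[i:j].
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # Subtract the smallest possible rank sum for the positives, then normalize
    # by the number of positive-negative pairs.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Toy example (hypothetical scores): 3 of the 4 positive-negative pairs
# are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Under this reading, the reported gap between THS (0.823) and CCI (0.735) means THS orders a random event/non-event patient pair correctly more often; it does not by itself say anything about calibration.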
CONCLUSIONS
Massive retrospective EHR data sets have made it possible to use machine learning to build practical multimorbidity risk scores that are highly predictive, personalizable, intuitive to explain, and generalizable across diverse patient populations.

Identifiers

pubmed: 34842542
pii: v23i11e32900
doi: 10.2196/32900
pmc: PMC8665380

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

e32900

Copyright information

©Abhishaike Mahajan, Andrew Deonarine, Axel Bernal, Genevieve Lyons, Beau Norgeot. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 26.11.2021.

Authors

Abhishaike Mahajan
Anthem Inc, Palo Alto, CA, United States.

Andrew Deonarine
XY Health, Cambridge, MA, United States.

Axel Bernal
Anthem Inc, Palo Alto, CA, United States.

Genevieve Lyons
XY Health, Cambridge, MA, United States.

Beau Norgeot
Anthem Inc, Palo Alto, CA, United States.
