COVID-19 Mortality Prediction From Deep Learning in a Large Multistate Electronic Health Record and Laboratory Information System Data Set: Algorithm Development and Validation.

COVID-19; EHR; algorithm; deep learning; development; electronic health record; machine learning; missing data; mortality; neural network; prediction; recurrent neural networks; time series; validation

Journal

Journal of Medical Internet Research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882

Publication information

Publication date:
28 09 2021
History:
received: 03 05 2021
accepted: 11 08 2021
revised: 18 07 2021
pubmed: 28 8 2021
medline: 2 10 2021
entrez: 27 8 2021
Status: epublish

Abstract

BACKGROUND
COVID-19 is caused by the SARS-CoV-2 virus and has strikingly heterogeneous clinical manifestations, with most individuals contracting mild disease but a substantial minority experiencing fulminant cardiopulmonary symptoms or death. The clinical covariates and the laboratory tests performed on a patient provide robust statistics to guide clinical treatment. Deep learning approaches on a data set of this nature enable patient stratification and provide methods to guide clinical treatment.
OBJECTIVE
Here, we report on the development and prospective validation of a state-of-the-art machine learning model to provide mortality prediction shortly after confirmation of SARS-CoV-2 infection in the Mayo Clinic patient population.
METHODS
We retrospectively constructed one of the largest and most geographically diverse COVID-19 laboratory information system and electronic health record data sets reported in the published literature, which included 11,807 patients residing in 41 states of the United States of America and treated at medical sites across 5 states in 3 time zones. Traditional machine learning models were evaluated independently as well as in a stacked learner approach by using AutoGluon, and various recurrent neural network architectures were considered. The traditional machine learning models were implemented using the AutoGluon-Tabular framework, whereas the recurrent neural networks used the TensorFlow Keras framework. We trained these models to operate solely on routine laboratory measurements and clinical covariates available within 72 hours of a patient's first positive COVID-19 nucleic acid test result.
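As an illustration of how irregular laboratory measurements inside a fixed window can be presented to a GRU-D-style network, the following minimal sketch builds the three per-feature sequences such a model typically consumes: the observed values, a missingness mask, and the time elapsed since the last observation. The function name `grud_inputs`, the hourly binning, and the "last measurement in a bin wins" rule are assumptions made for this example; the record above does not describe the authors' preprocessing.

```python
from typing import List, Optional, Tuple


def grud_inputs(
    observations: List[Tuple[float, float]],
    window_hours: int = 72,
    bin_hours: int = 1,
) -> Tuple[List[Optional[float]], List[int], List[float]]:
    """Build (values, mask, delta) for one feature on a regular time grid.

    observations: (hours since first positive test, measured value) pairs.
    Hypothetical helper for illustration only.
    """
    n_bins = window_hours // bin_hours
    values: List[Optional[float]] = [None] * n_bins

    # Place each measurement into its time bin; anything outside the
    # window is ignored, and a later measurement overwrites an earlier one.
    for t, v in sorted(observations):
        if 0 <= t < window_hours:
            values[int(t // bin_hours)] = v

    # mask[i] is 1 where a value was observed and 0 where it is missing.
    mask = [0 if v is None else 1 for v in values]

    # delta[i] is the time since the most recent observed bin before i;
    # it resets after an observed bin and accumulates across missing ones.
    delta = [0.0] * n_bins
    for i in range(1, n_bins):
        delta[i] = bin_hours + (0.0 if mask[i - 1] else delta[i - 1])

    return values, mask, delta
```

For example, two oxygen saturation readings at 0.5 and 3.2 hours in a 6-hour window yield a mask of `[1, 0, 0, 1, 0, 0]`, and a reading taken at 80 hours would be dropped from a 72-hour window.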
RESULTS
The GRU-D recurrent neural network achieved peak cross-validation performance, with an area under the receiver operating characteristic curve (AUROC) of 0.938 (SE 0.004). This model retained strong performance when the follow-up time was reduced to 12 hours (AUROC 0.916 [SE 0.005]), and the leave-one-out feature importance analysis indicated that the most independently valuable features were age, Charlson comorbidity index, minimum oxygen saturation, fibrinogen level, and serum iron level. In the prospective testing cohort, this model provided an AUROC of 0.901 and a statistically significant difference in survival (P<.001, hazard ratio for those predicted to survive, 95% CI 0.043-0.106).
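To make the reported metric concrete, the sketch below computes an AUROC as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, and then summarizes per-fold AUROCs as a mean with a standard error. The record does not state how the SE was obtained; treating it as the standard error of the mean across cross-validation folds is an assumption, as are the function names `auroc` and `mean_and_se`.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_and_se(fold_aurocs: Sequence[float]) -> Tuple[float, float]:
    """Mean AUROC across folds and the standard error of that mean.

    Assumption: SE = sample standard deviation / sqrt(number of folds).
    """
    return mean(fold_aurocs), stdev(fold_aurocs) / sqrt(len(fold_aurocs))
```

The pairwise form is quadratic in cohort size and is meant only to show the definition; a rank-based implementation would be preferable on a cohort of thousands of patients.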
CONCLUSIONS
Our deep learning approach using GRU-D provides an alert system to flag mortality for COVID-19-positive patients by using clinical covariates and laboratory values within a 72-hour window after the first positive nucleic acid test result.

Identifiers

pubmed: 34449401
pii: v23i9e30157
doi: 10.2196/30157
pmc: PMC8480399

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

e30157

Copyright information

©Saranya Sankaranarayanan, Jagadheshwar Balan, Jesse R Walsh, Yanhong Wu, Sara Minnich, Amy Piazza, Collin Osborne, Gavin R Oliver, Jessica Lesko, Kathy L Bates, Kia Khezeli, Darci R Block, Margaret DiGuardo, Justin Kreuter, John C O’Horo, John Kalantari, Eric W Klee, Mohamed E Salama, Benjamin Kipp, William G Morice, Garrett Jenkinson. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 28.09.2021.

Authors

Saranya Sankaranarayanan (S)

Mayo Clinic, Rochester, MN, United States.

Jagadheshwar Balan (J)

Mayo Clinic, Rochester, MN, United States.

Jesse R Walsh (JR)

Mayo Clinic, Rochester, MN, United States.

Yanhong Wu (Y)

Mayo Clinic, Rochester, MN, United States.

Sara Minnich (S)

Mayo Clinic, Rochester, MN, United States.

Amy Piazza (A)

Mayo Clinic, Rochester, MN, United States.

Collin Osborne (C)

Mayo Clinic, Rochester, MN, United States.

Gavin R Oliver (GR)

Mayo Clinic, Rochester, MN, United States.

Jessica Lesko (J)

Mayo Clinic, Rochester, MN, United States.

Kathy L Bates (KL)

Mayo Clinic, Rochester, MN, United States.

Kia Khezeli (K)

Mayo Clinic, Rochester, MN, United States.

Darci R Block (DR)

Mayo Clinic, Rochester, MN, United States.

Margaret DiGuardo (M)

Mayo Clinic, Rochester, MN, United States.

Justin Kreuter (J)

Mayo Clinic, Rochester, MN, United States.

John C O'Horo (JC)

Mayo Clinic, Rochester, MN, United States.

John Kalantari (J)

Mayo Clinic, Rochester, MN, United States.

Eric W Klee (EW)

Mayo Clinic, Rochester, MN, United States.

Mohamed E Salama (ME)

Mayo Clinic, Rochester, MN, United States.

Benjamin Kipp (B)

Mayo Clinic, Rochester, MN, United States.

William G Morice (WG)

Mayo Clinic, Rochester, MN, United States.

Garrett Jenkinson (G)

Mayo Clinic, Rochester, MN, United States.
