Machine learning for prediction of in-hospital mortality in coronavirus disease 2019 patients: results from an Italian multicenter study.
Journal
Journal of cardiovascular medicine (Hagerstown, Md.)
ISSN: 1558-2035
Abbreviated title: J Cardiovasc Med (Hagerstown)
Country: United States
NLM ID: 101259752
Publication information
Publication date:
01 07 2022
History:
entrez: 28 06 2022
pubmed: 29 06 2022
medline: 01 07 2022
Status:
ppublish
Abstract
BACKGROUND
Several risk factors have been identified that predict worse outcomes in patients with SARS-CoV-2 infection. Machine learning algorithms offer a novel approach to building prediction models with good discriminatory capacity that can be easily used in clinical practice. The aim of this study was to obtain a risk score for in-hospital mortality in patients with coronavirus disease 2019 (COVID-19) based on a limited number of features collected at hospital admission.
METHODS AND RESULTS
We studied an Italian cohort of consecutive adult Caucasian patients with laboratory-confirmed COVID-19 who were hospitalized in 13 cardiology units during spring 2020. The Lasso procedure was used to select the most relevant covariates. The dataset was randomly divided into a training set containing 80% of the data, used for estimating the model, and a test set with the remaining 20%. A Random Forest modeled in-hospital mortality with the selected set of covariates; its performance was assessed with the ROC curve, yielding the AUC, sensitivity and specificity with their 95% confidence intervals (CIs). This model was then compared with a Gradient Boosting Machine (GBM) and with logistic regression. Finally, to assess whether each model performed similarly in the training and test sets, the two AUCs were compared using DeLong's test. Among 701 patients enrolled (mean age 67.2 ± 13.2 years, 69.5% male), 165 (23.5%) died during a median hospitalization of 15 (IQR, 9-24) days. The variables selected by the Lasso procedure were age, oxygen saturation, PaO2/FiO2, creatinine clearance and elevated troponin. Compared with those who survived, deceased patients were older and had lower blood oxygenation, lower creatinine clearance and a higher prevalence of elevated troponin (all P < 0.001). The best out-of-sample performance was provided by Random Forest, with an AUC of 0.78 (95% CI: 0.68-0.88) and a sensitivity of 0.88 (95% CI: 0.58-1.00). Moreover, Random Forest was the only model that provided similar performance in-sample and out-of-sample (DeLong test P = 0.78).
CONCLUSION
In a large COVID-19 population, we showed that a customizable machine learning-based score derived from clinical variables is feasible and effective for the prediction of in-hospital mortality.
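The evaluation steps named in the abstract (a random 80/20 split, then AUC, sensitivity and specificity on the held-out set) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the seed, the 0.5 threshold and the toy labels and scores are all hypothetical, and the abstract does not state the exact training and test set sizes, so the rounding used here is an assumption. The Lasso selection, the Random Forest itself and DeLong's test are not reproduced.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)  # rounding rule is an assumption
    return shuffled[n_test:], shuffled[:n_test]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold predicts death.

    Assumes both classes are present in labels.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up data: 1 = died in hospital, 0 = survived; scores are hypothetical risks.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.35, 0.7, 0.3, 0.2, 0.1]

# 701 placeholder patient indices stand in for the cohort rows.
train_ids, test_ids = split_train_test(range(701))
print(len(train_ids), len(test_ids))
print(round(auc(toy_labels, toy_scores), 3))  # 11 of the 12 pairs are ranked correctly
print(sensitivity_specificity(toy_labels, toy_scores, threshold=0.5))
```

In practice the scores would come from a model fitted on the training rows only and evaluated on the test rows, which is what makes the reported AUC an out-of-sample estimate.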
Identifiers
pubmed: 35763764
doi: 10.2459/JCM.0000000000001329
pii: 01244665-202207000-00004
Chemical substances
Troponin
0
Creatinine
AYI8EX34EU
Publication types
Journal Article
Multicenter Study
Languages
eng
Citation subsets
IM
Pagination
439-446
Copyright information
Copyright © 2022 Italian Federation of Cardiology - I.F.C. All rights reserved.