Feature selection and transformation by machine learning reduce variable numbers and improve prediction for heart failure readmission or death.
Aged
Aged, 80 and over
Area Under Curve
Cohort Studies
Electronic Health Records / statistics & numerical data
Female
Heart Failure / mortality
Humans
Machine Learning
Male
Patient Discharge / statistics & numerical data
Patient Readmission / statistics & numerical data
Risk Factors
Western Australia / epidemiology
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date: 2019
History:
received: 2019-03-29
accepted: 2019-06-08
entrez: 2019-06-27
pubmed: 2019-06-27
medline: 2020-02-25
Status: epublish
Abstract
Abstract sections
BACKGROUND
The prediction of readmission or death after a hospital discharge for heart failure (HF) remains a major challenge. Modern healthcare systems, electronic health records, and machine learning (ML) techniques make it possible to mine data and select the most significant variables, reducing the number of variables without compromising the performance of models that predict readmission or death. Moreover, ML methods that transform variables may further improve performance.
OBJECTIVE
To use ML techniques to select the most relevant variables, and to transform variables, for the prediction of 30-day readmission or death in HF patients.
METHODS
We identified all Western Australian patients aged 65 years and above admitted for HF between 2003 and 2008 in linked administrative data. We evaluated variables associated with HF readmission or death using standard statistical and ML-based selection techniques. We also tested new variables produced by transforming the original variables. We developed multi-layer perceptron prediction models and compared their predictive performance using metrics such as area under the receiver operating characteristic curve (AUC), sensitivity, and specificity.
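As an illustration of the evaluation metrics named above, the following is a minimal sketch of how AUC, sensitivity, and specificity can be computed from predicted risk scores. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are
    classified as positive (event predicted)."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 1 = readmitted or died within 30 days, 0 = neither.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7, 0.1, 0.5]

model_auc = auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC={model_auc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The pairwise AUC definition used here is quadratic in the number of cases; for a cohort of the size reported in this record, a rank-based implementation from a statistics library would be the more practical choice.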
RESULTS
Following hospital discharge, the proportion of 30-day readmission or death was 23.7% in our cohort of 10,757 HF patients. Our prediction model using a smaller set of variables (n = 8) performed comparably (AUC 0.62) to the traditional model (n = 47, AUC 0.62). Transforming the original 47 variables further improved the performance of the predictive model (AUC 0.66, p<0.001).
CONCLUSIONS
A small set of variables selected using ML matched the performance of the model that used the full set of 47 variables for predicting 30-day readmission or death in HF patients. Model performance can be significantly improved further by transforming the original variables using ML methods.
Identifiers
pubmed: 31242238
doi: 10.1371/journal.pone.0218760
pii: PONE-D-19-08930
pmc: PMC6594617
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e0218760
Conflict of interest statement
The authors have declared that no competing interests exist.