Identifying Ventricular Arrhythmias and Their Predictors by Applying Machine Learning Methods to Electronic Health Records in Patients With Hypertrophic Cardiomyopathy (HCM-VAr-Risk Model).
Cardiomyopathy, Hypertrophic
Echocardiography, Stress
Electrocardiography
Electronic Health Records
Female
Humans
Machine Learning
Magnetic Resonance Imaging, Cine / methods
Male
Middle Aged
Predictive Value of Tests
Prognosis
Registries
Reproducibility of Results
Retrospective Studies
Risk Assessment / methods
Risk Factors
Tachycardia, Ventricular / diagnosis
Journal
The American journal of cardiology
ISSN: 1879-1913
Abbreviated title: Am J Cardiol
Country: United States
NLM ID: 0207277
Publication information
Publication date:
15 05 2019
History:
received: 13 11 2018
revised: 06 02 2019
accepted: 11 02 2019
pubmed: 7 4 2019
medline: 17 1 2020
entrez: 7 4 2019
Status:
ppublish
Abstract
Clinical risk stratification for sudden cardiac death (SCD) in hypertrophic cardiomyopathy (HC) employs rules derived from American College of Cardiology Foundation/American Heart Association (ACCF/AHA) guidelines or the HCM Risk-SCD model (C-index ∼0.69), which utilize a few clinical variables. We assessed whether data-driven machine learning methods that consider a wider range of variables can effectively identify HC patients with ventricular arrhythmias (VAr) that lead to SCD. We scanned the electronic health records of 711 HC patients for sustained ventricular tachycardia or ventricular fibrillation. Patients with ventricular tachycardia or ventricular fibrillation (n = 61) were tagged as VAr cases and the remaining (n = 650) as non-VAr. The 2-sample t-test and the information gain criterion were used to identify the most informative clinical variables that distinguish VAr from non-VAr; patient records were reduced to include only these variables. Data imbalance stemming from the low number of VAr cases was addressed by applying a combination of over- and undersampling strategies. We trained and tested multiple classifiers under this sampling approach, showing effective classification. We evaluated 93 clinical variables, of which 22 proved predictive of VAr. The ensemble of logistic regression and naïve Bayes classifiers, trained on these 22 variables and corrected for data imbalance, was most effective in separating VAr from non-VAr cases (sensitivity = 0.73, specificity = 0.76, C-index = 0.83). Our method (HCM-VAr-Risk Model) identified 12 new predictors of VAr, in addition to 10 established SCD predictors. In conclusion, this is the first application of machine learning for identifying HC patients with VAr using clinical attributes. Our model demonstrates good performance (C-index) compared with currently employed SCD prediction algorithms, while addressing the imbalance inherent in clinical data.
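The abstract names several generic building blocks: information gain for ranking variables, combined over- and undersampling for class imbalance, and sensitivity, specificity, and C-index for evaluation. The following is a minimal illustrative sketch of those pieces in plain Python, not the authors' implementation. All function names, the `target_per_class` parameter, and the use of simple random resampling are assumptions made for illustration; the abstract does not specify how the sampling was done, and the 2-sample t-test and the classifiers themselves are not shown.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Drop in label entropy after splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rebalance(rows, labels, target_per_class, seed=0):
    """Resample every class to target_per_class rows (hypothetical scheme).

    Classes with more rows are randomly undersampled; classes with fewer
    rows keep all members and are randomly oversampled with replacement.
    """
    rng = random.Random(seed)
    out_rows, out_labels = [], []
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        if len(members) >= target_per_class:
            chosen = rng.sample(members, target_per_class)
        else:
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([cls] * target_per_class)
    return out_rows, out_labels


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def c_index(y_true, scores):
    """C-index for a binary outcome (equivalent to ROC AUC).

    Fraction of (case, non-case) pairs in which the case receives the
    higher score; tied scores count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    concordant = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return concordant / (len(pos) * len(neg))
```

As a usage example with toy data, `c_index([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])` gives 0.75, because three of the four case/non-case pairs are ranked correctly. With the class sizes reported in the abstract (61 VAr, 650 non-VAr), a call such as `rebalance(rows, labels, target_per_class=200)` would return 200 rows per class; the value 200 is arbitrary and chosen only for illustration.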
Identifiers
pubmed: 30952382
pii: S0002-9149(19)30227-9
doi: 10.1016/j.amjcard.2019.02.022
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
1681-1689
Grants
Agency: NLM NIH HHS
ID: R01 LM011945
Country: United States
Copyright information
Copyright © 2019 The Authors. Published by Elsevier Inc. All rights reserved.