A methodological comparison of risk scores versus decision trees for predicting drug-resistant infections: A case study using extended-spectrum beta-lactamase (ESBL) bacteremia.
Bacteremia
/ drug therapy
Baltimore
/ epidemiology
Cohort Studies
Decision Trees
Drug Resistance, Multiple, Bacterial
Escherichia coli
Escherichia coli Infections
/ drug therapy
Hospitals, University
Humans
Klebsiella
Klebsiella Infections
/ drug therapy
Logistic Models
Risk Assessment
/ methods
beta-Lactamases
Journal
Infection control and hospital epidemiology
ISSN: 1559-6834
Abbreviated title: Infect Control Hosp Epidemiol
Country: United States
NLM ID: 8804099
Publication information
Publication date:
04 2019
History:
pubmed: 5 3 2019
medline: 10 3 2020
entrez: 5 3 2019
Status:
ppublish
Abstract
BACKGROUND
Timely identification of multidrug-resistant gram-negative infections remains an epidemiological challenge. Statistical models for predicting drug resistance can be useful where rapid diagnostics are unavailable or impractical given available resources. Logistic regression-derived risk scores are common in the healthcare epidemiology literature. Machine learning-derived decision trees are an alternative approach for developing decision support tools. Our group previously reported on a decision tree for predicting ESBL bloodstream infections. Our objective in the current study was to develop a risk score from the same ESBL dataset, to compare these 2 methods, and to offer general guiding principles for using each approach.
METHODS
Using a dataset of 1,288 patients with Escherichia coli or Klebsiella spp. bacteremia, we generated a risk score to predict the likelihood that a bacteremic patient was infected with an ESBL producer. We evaluated discrimination (original and cross-validated models) using receiver operating characteristic curves and C statistics. We compared risk score and decision tree performance, and we reviewed their practical and methodological attributes.
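The C statistic mentioned above is the probability that a randomly chosen ESBL-positive patient receives a higher score than a randomly chosen ESBL-negative patient. The following is a minimal sketch of that definition only, not the authors' analysis code; the function name `c_statistic` and the toy scores and labels are hypothetical and are not taken from the study dataset.

```python
from itertools import product


def c_statistic(scores, labels):
    """Concordance (C) statistic: the fraction of positive/negative pairs in
    which the positive case has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0
        elif p == n:
            concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data (NOT study data): risk-score points and ESBL status.
toy_scores = [1, 3, 4, 6, 7, 9]
toy_labels = [0, 0, 1, 0, 1, 1]
toy_c = c_statistic(toy_scores, toy_labels)
print(f"C statistic on toy data: {toy_c:.3f}")
```

This pairwise form is equivalent to the area under the receiver operating characteristic curve; for a dataset of the size described here, a library routine would normally be used instead of the quadratic loop.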
RESULTS
In total, 194 patients (15%) had bacteremia caused by an ESBL-producing organism. The clinical risk score included 14 variables, compared with 5 variables in the decision tree. The positive and negative predictive values of the risk score and decision tree were similar (>90%), but the C statistic of the risk score (0.87) was 10% higher than that of the decision tree.
CONCLUSIONS
The decision tree and risk score performed similarly for predicting ESBL infection. The decision tree was more user-friendly, with fewer variables for the end user, whereas the risk score offered higher discrimination and greater flexibility for adjusting sensitivity and specificity.
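The flexibility noted above comes from the fact that a point-based risk score can be dichotomized at any cutoff, whereas a decision tree yields a fixed classification. As a rough illustration under stated assumptions (the helper `sens_spec`, the cutoffs, and the toy data are hypothetical and do not reproduce the study's score), moving the cutoff trades sensitivity against specificity:

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (NOT study data): risk-score points and ESBL status.
demo_scores = [1, 3, 4, 6, 7, 9]
demo_labels = [0, 0, 1, 0, 1, 1]

# A low cutoff favors sensitivity; a high cutoff favors specificity.
low_cutoff = sens_spec(demo_scores, demo_labels, cutoff=4)
high_cutoff = sens_spec(demo_scores, demo_labels, cutoff=7)
print("cutoff 4 (sensitivity, specificity):", low_cutoff)
print("cutoff 7 (sensitivity, specificity):", high_cutoff)
```

In practice the cutoff would be chosen according to the clinical cost of missing an ESBL infection versus overtreating with broad-spectrum therapy.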
Identifiers
pubmed: 30827286
pii: S0899823X19000175
doi: 10.1017/ice.2019.17
Chemical substances
beta-Lactamases
EC 3.5.2.6
Publication types
Comparative Study
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, P.H.S.
Languages
eng
Citation subsets
IM
Pagination
400-407
Grants
Agency: AHRQ HHS
ID: R36 HS025089
Country: United States
Agency: NIAID NIH HHS
ID: K23 AI127935
Country: United States