Early diagnosis of candidemia with explainable machine learning on automatically extracted laboratory and microbiological data: results of the AUTO-CAND project.


Journal

Annals of medicine
ISSN: 1365-2060
Abbreviated title: Ann Med
Country: England
NLM ID: 8906388

Publication information

Publication date:
2023
History:
medline: 29 11 2023
pubmed: 27 11 2023
entrez: 27 11 2023
Status: ppublish

Abstract

Candidemia is associated with a heavy burden of morbidity and mortality in hospitalized patients. Blood culture results can take up to 48-72 h after the blood draw to become available; thus, early treatment decisions are made in the absence of a definite diagnosis. In this retrospective study, we assessed the performance of different supervised machine learning algorithms for the early differential diagnosis of candidemia and bacteremia in adult patients, using a large dataset automatically extracted within the AUTO-CAND project. Overall, 12,483 episodes of candidemia (1,275; 10%) or bacteremia (11,208; 90%) were included in the analysis. A random forest classifier achieved the best diagnostic performance for candidemia, with sensitivity 0.98 and specificity 0.65 on the training set (true skill statistic [TSS] = 0.63) and sensitivity 0.74 and specificity 0.57 on the test set (TSS = 0.31). The random forest classifier was then trained in the subgroup of patients with available serum β-D-glucan (BDG) and procalcitonin (PCT) values, exploiting the feature ranking learned on the entire dataset. Although the differences from the performance obtained with BDG and PCT alone were not statistically significant, the classifier that combined the features selected in the entire dataset with BDG and PCT achieved the highest performance measures in most cases. Random forest classifiers trained on large datasets of automatically extracted data have the potential to improve current diagnostic algorithms for candidemia. However, further development through the implementation of automatically extracted clinical features may be necessary to achieve crucial improvements.
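
The TSS values quoted in the abstract are consistent with the usual definition of the true skill statistic, TSS = sensitivity + specificity - 1 (the same quantity as Youden's J). The abstract does not spell out the formula, so the short sketch below assumes that definition and simply recomputes the two reported values; the function name is illustrative and not taken from the AUTO-CAND code.

```python
def true_skill_statistic(sensitivity: float, specificity: float) -> float:
    """Assumed definition: TSS = sensitivity + specificity - 1.

    Ranges from -1 to 1; 0 corresponds to a classifier with no skill.
    """
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract.
train_tss = true_skill_statistic(0.98, 0.65)
test_tss = true_skill_statistic(0.74, 0.57)

print(f"training TSS = {train_tss:.2f}")  # 0.63
print(f"test TSS     = {test_tss:.2f}")   # 0.31
```

Both results match the reported figures. The drop from 0.63 on the training set to 0.31 on the test set is the gap that the abstract's final sentence addresses.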

Identifiers

pubmed: 38010342
doi: 10.1080/07853890.2023.2285454

Chemical substances

Procalcitonin 0
beta-Glucans 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

2285454

Authors

Daniele Roberto Giacobbe (DR)

Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Cristina Marelli (C)

Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Sara Mora (S)

Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy.

Sabrina Guastavino (S)

Department of Mathematics (DIMA), University of Genoa, Genoa, Italy.

Chiara Russo (C)

Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Giorgia Brucci (G)

Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Alessandro Limongelli (A)

Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Antonio Vena (A)

Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Malgorzata Mikulska (M)

Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Maryam Tayefi (M)

Norwegian Centre for E-Health Research, Tromsø, Norway.

Stefano Peluso (S)

Department of Statistics and Quantitative Methods, University of Milan - Bicocca, Milan, Italy.

Alessio Signori (A)

Section of Biostatistics, Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.

Antonio Di Biagio (A)

Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Anna Marchese (A)

Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy.
Microbiology Unit, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Cristina Campi (C)

Department of Mathematics (DIMA), University of Genoa, Genoa, Italy.
Life Science Computational Laboratory (LISCOMP), IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

Mauro Giacomini (M)

Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy.

Matteo Bassetti (M)

Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy.
Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy.

MeSH classifications