Improving mixed-integer temporal modeling by generating synthetic data using conditional generative adversarial networks: A case study of fluid overload prediction in the intensive care unit.

Critical care; Fluid overload; GAN; Machine learning; Mixed-integer temporal modeling; Synthetic data

Journal

Computers in biology and medicine
ISSN: 1879-0534
Abbreviated title: Comput Biol Med
Country: United States
NLM ID: 1250250

Publication information

Publication date:
22 Nov 2023
History:
received: 16 07 2023
revised: 29 10 2023
accepted: 20 11 2023
medline: 28 11 2023
pubmed: 28 11 2023
entrez: 27 11 2023
Status: aheadofprint

Abstract

Mixed-integer temporal data, which are particularly prominent in medication use among the critically ill, limit the performance of predictive models. This evaluation pilot tested the integration of synthetic data into an existing dataset of complex medication data to improve machine learning prediction of fluid overload. This retrospective cohort study evaluated patients admitted to an ICU for ≥ 72 h. Four machine learning algorithms were developed on the original dataset to predict fluid overload after 48-72 h of ICU admission. Then, two distinct synthetic data generation methods, the synthetic minority over-sampling technique (SMOTE) and the conditional tabular generative adversarial network (CTGAN), were used to create synthetic data. Finally, a stacking ensemble technique was used to train a meta-learner. Models were trained in three scenarios with datasets of varying quality and quantity. Training on the combined synthetic and original dataset generally increased predictive performance compared with training on the original dataset alone. The highest-performing model was the meta-model trained on the combined dataset, which reached an AUROC of 0.83 and significantly enhanced sensitivity across the training scenarios. This is the first time synthetic data generation methods have been applied to ICU medication data; the approach offers a promising way to enhance machine learning models for fluid overload and may translate to other ICU outcomes. The meta-learner was able to trade off between different performance metrics and improved identification of the minority class.
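
As an illustration of the over-sampling idea behind SMOTE mentioned in the abstract, the following is a minimal sketch in pure Python. It is not the authors' implementation: the function name `smote`, its parameters (`n_new`, `k`, `seed`), and the toy `minority` rows are all hypothetical. It only shows the core step, which is to pick a minority-class row, choose one of its k nearest minority neighbors, and place a new point at a random position on the segment between them.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbors.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority-class rows (made-up numbers, two continuous features).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote(minority, n_new=6, k=2)
```

Because each synthetic row is a convex combination of two real rows, it always lies within the range of the observed minority data. Plain interpolation also yields fractional values, so integer-valued features in mixed-integer data would need rounding or a different generator after this step.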

Identifiers

pubmed: 38011778
pii: S0010-4825(23)01214-3
doi: 10.1016/j.compbiomed.2023.107749

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

107749

Comments and corrections

Type: UpdateOf

Copyright information

Copyright © 2023 Elsevier Ltd. All rights reserved.

Conflict of interest statement

Declaration of competing interest: We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

Authors

Alireza Rafiei (A)

Department of Computer Science and Informatics, Emory University, Ste. W302, 400 Dowman Dr., Atlanta, GA, 30322, USA. Electronic address: alireza.rafiei@emory.edu.

Milad Ghiasi Rad (M)

Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA. Electronic address: mghias2@emory.edu.

Andrea Sikora (A)

University of Georgia College of Pharmacy, Department of Clinical and Administrative Pharmacy, Augusta, GA, USA. Electronic address: sikora@uga.edu.

Rishikesan Kamaleswaran (R)

Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA; Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA, USA. Electronic address: rkamaleswaran@emory.edu.

MeSH classifications