A synthetic dataset of liver disorder patients.

Bayesian network; Causal model; Dataset shift; Machine learning; Synthetic patients

Journal

Data in brief
ISSN: 2352-3409
Abbreviated title: Data Brief
Country: Netherlands
NLM ID: 101654995

Publication information

Publication date:
Apr 2023
History:
received: 20 12 2022
revised: 10 01 2023
accepted: 16 01 2023
entrez: 7 2 2023
pubmed: 8 2 2023
medline: 8 2 2023
Status: epublish

Abstract

The data in this article comprise 10,000 synthetic patients with liver disorders, each characterized by 70 variables, including clinical features and patient outcomes such as hospital admission or surgery. The patient data were generated to simulate real patient data as closely as possible, using a publicly available Bayesian network that describes a causal model for liver disorders. By varying the network parameters, we also generated an additional set of 500 patients whose characteristics deviate from the initial patient population. We provide an overview of the synthetic data generation process and the associated scripts for generating the cohorts. This dataset can be useful for training and validating machine learning models, especially under dataset shift between training and testing sets.
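
The generation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' script: the three-node network, the variable names (`alcoholism`, `steatosis`, `admission`) and all probabilities are hypothetical placeholders, chosen only to show ancestral sampling from a Bayesian network and how changing one parameter yields a shifted cohort.

```python
import random

# Hypothetical toy network: alcoholism -> steatosis -> admission.
# Names and probabilities are illustrative assumptions, not values
# from the published liver disorder network.
BASE_PARAMS = {
    "p_alcoholism": 0.15,
    "p_steatosis_given_alcoholism": {True: 0.40, False: 0.10},
    "p_admission_given_steatosis": {True: 0.30, False: 0.05},
}

# Shifted population: only the prior of the root node is changed.
SHIFTED_PARAMS = {**BASE_PARAMS, "p_alcoholism": 0.60}


def sample_patient(params, rng):
    """Ancestral sampling: draw each node after its parent."""
    alcoholism = rng.random() < params["p_alcoholism"]
    steatosis = rng.random() < params["p_steatosis_given_alcoholism"][alcoholism]
    admission = rng.random() < params["p_admission_given_steatosis"][steatosis]
    return {"alcoholism": alcoholism, "steatosis": steatosis, "admission": admission}


def sample_cohort(params, n, seed):
    """Draw n independent synthetic patients with a fixed seed."""
    rng = random.Random(seed)
    return [sample_patient(params, rng) for _ in range(n)]


def prevalence(cohort, variable):
    """Fraction of patients for whom the binary variable is true."""
    return sum(patient[variable] for patient in cohort) / len(cohort)


base_cohort = sample_cohort(BASE_PARAMS, 10_000, seed=0)
shifted_cohort = sample_cohort(SHIFTED_PARAMS, 500, seed=1)

print("alcoholism prevalence, base:   ", prevalence(base_cohort, "alcoholism"))
print("alcoholism prevalence, shifted:", prevalence(shifted_cohort, "alcoholism"))
```

Because the shift is applied at a root node, it propagates to the downstream variables through the conditional tables, which is one simple way a deviating cohort could be obtained from the same network structure.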

Identifiers

pubmed: 36747982
doi: 10.1016/j.dib.2023.108921
pii: S2352-3409(23)00039-2
pmc: PMC9898618

Publication types

Journal Article

Languages

eng

Pagination

108921

Copyright information

© 2023 The Authors. Published by Elsevier Inc.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: GN is a full employee of enGenome srl.

References

Artif Intell Med. 2015 Sep;65(1):61-73
pubmed: 26265491
Front Med (Lausanne). 2020 Feb 05;7:27
pubmed: 32118012
J Biomed Inform. 2022 Mar;127:103996
pubmed: 35041981
Artif Intell Med. 2023 Jan;135:102471
pubmed: 36628785

Authors

Giovanna Nicora (G)

Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy.
enGenome Srl, Italy.

Tommaso Mario Buonocore (TM)

Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy.

Enea Parimbelli (E)

Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy.
Telfer School of Management, University of Ottawa, Ottawa, ON, Canada.

MeSH classifications