A Random Shuffle Method to Expand a Narrow Dataset and Overcome the Associated Challenges in a Clinical Study: A Heart Failure Cohort Example.
data science
heart failure
missing data
narrow dataset cardinality
random shuffle
Journal
Frontiers in cardiovascular medicine
ISSN: 2297-055X
Abbreviated title: Front Cardiovasc Med
Country: Switzerland
NLM ID: 101653388
Publication information
Publication date:
2020
History:
received: 2020-08-28
accepted: 2020-10-19
entrez: 2020-12-17
pubmed: 2020-12-18
medline: 2020-12-18
Status:
epublish
Abstract
Heart failure (HF) affects at least 26 million people worldwide, so predicting adverse events in HF patients is a major target of clinical data science. However, achieving large sample sizes can be challenging because patient recruitment is difficult and follow-up times are long, which also aggravates the problem of missing data. Population-enhancing algorithms are therefore crucial for overcoming narrow dataset cardinality (in a clinical dataset, the cardinality is the number of patients in that dataset). The aim of this study was to design a random shuffle method that enhances the cardinality of an HF dataset while remaining statistically legitimate, without the need for specific hypotheses or regression models. The cardinality enhancement was validated against an established random repeated-measures method with regard to correctness in predicting clinical conditions and endpoints. In particular, machine learning and regression models were employed to highlight the benefits of the enhanced datasets. The proposed random shuffle method enhanced the HF dataset cardinality (711 patients before dataset preprocessing) approximately 10-fold, and approximately 21-fold when followed by a random repeated-measures approach. We believe that the random shuffle method could be used in the cardiovascular field and in other data science problems where missing data and narrow dataset cardinality are an issue.
Identifiers
pubmed: 33330661
doi: 10.3389/fcvm.2020.599923
pmc: PMC7714902
Publication types
Journal Article
Languages
eng
Pagination
599923
Copyright information
Copyright © 2020 Fassina, Faragli, Lo Muzio, Kelle, Campana, Pieske, Edelmann and Alogna.