Impact of the Choice of Cross-Validation Techniques on the Results of Machine Learning-Based Diagnostic Applications.

Data Analysis; Diagnosis; Machine Learning; Parkinson Disease; Statistical Models

Journal

Healthcare informatics research
ISSN: 2093-3681
Abbreviated title: Healthc Inform Res
Country: Korea (South)
NLM ID: 101534553

Publication information

Publication date:
Jul 2021
History:
received: 02 03 2021
accepted: 28 06 2021
entrez: 13 8 2021
pubmed: 14 8 2021
medline: 14 8 2021
Status: ppublish

Abstract

With advances in data availability and computing capabilities, artificial intelligence and machine learning technologies have evolved rapidly in recent years. Researchers have taken advantage of these developments in healthcare informatics and created reliable tools to predict or classify diseases using machine learning-based algorithms. To correctly quantify the performance of those algorithms, the standard approach is cross-validation, in which the algorithm is trained on a training set and its performance is measured on a validation set. The two sets should be subject-independent, meaning that no subject contributes records to both, to simulate the expected behavior of a clinical study. This study compares two cross-validation strategies, the subject-wise and the record-wise techniques; the subject-wise strategy correctly mimics the process of a clinical study, while the record-wise strategy does not. We started by creating a dataset of smartphone audio recordings of subjects diagnosed with and without Parkinson's disease. This dataset was then divided into training and holdout sets using subject-wise and record-wise divisions. On the training set, we measured the performance of two classifiers (support vector machine and random forest) under six cross-validation techniques, each simulating either the subject-wise process or the record-wise process. The holdout set was used to calculate the true error of the classifiers. The record-wise division and the record-wise cross-validation techniques overestimated the performance of the classifiers and underestimated the classification error. In a diagnostic scenario, the subject-wise technique is the proper way of estimating a model's performance, and record-wise techniques should be avoided.
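The difference between the two strategies described in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the data are synthetic, the classifier is a plain 1-nearest-neighbor rule rather than the support vector machine or random forest used in the study, and all function names (`make_records`, `record_wise_folds`, `subject_wise_folds`, and so on) are hypothetical. The assumption built into the data is deliberately extreme: each subject has a distinctive feature "signature" that carries no information about the diagnosis, so any accuracy above chance can only come from recognizing subjects already seen in training.

```python
import random


def make_records(n_subjects=40, per_subject=10, n_features=5, seed=0):
    """Synthetic records: (subject_id, features, label). Hypothetical data."""
    rng = random.Random(seed)
    records = []
    for subject in range(n_subjects):
        label = subject % 2  # balanced labels, unrelated to the features
        signature = [rng.gauss(0, 1) for _ in range(n_features)]
        for _ in range(per_subject):
            # Records of one subject are near-copies of that subject's signature.
            features = [s + rng.gauss(0, 0.01) for s in signature]
            records.append((subject, features, label))
    return records


def record_wise_folds(records, k, rng):
    """Shuffle individual records into k folds, ignoring who produced them."""
    idx = list(range(len(records)))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def subject_wise_folds(records, k, rng):
    """Assign whole subjects to folds, so a subject never spans two folds."""
    subjects = sorted({r[0] for r in records})
    rng.shuffle(subjects)
    fold_of = {s: i % k for i, s in enumerate(subjects)}
    folds = [[] for _ in range(k)]
    for i, rec in enumerate(records):
        folds[fold_of[rec[0]]].append(i)
    return folds


def leaks_subjects(records, folds):
    """Return True if any subject appears in more than one fold."""
    seen = {}
    for f, fold in enumerate(folds):
        for i in fold:
            if seen.setdefault(records[i][0], f) != f:
                return True
    return False


def predict_1nn(train, x):
    """Label of the training record closest to x (squared Euclidean distance)."""
    nearest = min(train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[1], x)))
    return nearest[2]


def cv_accuracy(records, folds):
    """Pooled accuracy over all folds, each fold held out once."""
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [r for i, r in enumerate(records) if i not in held_out]
        correct += sum(
            predict_1nn(train, records[i][1]) == records[i][2] for i in test_idx
        )
    return correct / len(records)


records = make_records()
rng = random.Random(1)
rw_folds = record_wise_folds(records, 5, rng)
sw_folds = subject_wise_folds(records, 5, rng)
record_acc = cv_accuracy(records, rw_folds)
subject_acc = cv_accuracy(records, sw_folds)
print(f"record-wise CV accuracy:  {record_acc:.2f}")
print(f"subject-wise CV accuracy: {subject_acc:.2f}")
```

Under these assumptions, the record-wise estimate is inflated because the classifier finds another recording of the same subject in the training folds, while the subject-wise estimate stays near chance, which is the correct answer for features that carry no diagnostic information. Real recordings are less extreme, but the direction of the bias is the one the abstract reports.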

Identifiers

pubmed: 34384201
pii: hir.2021.27.3.189
doi: 10.4258/hir.2021.27.3.189
pmc: PMC8369053

Publication types

Journal Article

Languages

eng

Pagination

189-199

References

Gigascience. 2017 May 1;6(5):1-9
pubmed: 28327985
Sci Data. 2016 Mar 03;3:160011
pubmed: 26938265
J Chem Inf Comput Sci. 2003 Mar-Apr;43(2):579-86
pubmed: 12653524
Int J Biostat. 2009 Jan 06;5(1):Article 1
pubmed: 20231866
Health Rep. 2014 Nov;25(11):10-4
pubmed: 25408491
PLoS One. 2015 Dec 11;10(12):e0144610
pubmed: 26656189
Digit Biomark. 2018 Jan 31;2(1):11-30
pubmed: 29938250

Authors

Ilias Tougui (I)

Electronic Systems Sensors and Nanobiotechnologies (E2SN), ENSAM, Mohammed V University in Rabat, Morocco.

Abdelilah Jilbab (A)

Electronic Systems Sensors and Nanobiotechnologies (E2SN), ENSAM, Mohammed V University in Rabat, Morocco.

Jamal El Mhamdi (JE)

Electronic Systems Sensors and Nanobiotechnologies (E2SN), ENSAM, Mohammed V University in Rabat, Morocco.

MeSH classifications