Are Machine Learning Algorithms More Accurate in Predicting Vegetable and Fruit Consumption Than Traditional Statistical Models? An Exploratory Analysis.
artificial intelligence
dietary behaviour
machine learning
nutrition
prediction
statistical models
Journal
Frontiers in Nutrition
ISSN: 2296-861X
Abbreviated title: Front Nutr
Country: Switzerland
NLM ID: 101642264
Publication information
Publication date:
2022
History:
received: 2021-07-13
accepted: 2022-01-25
entrez: 2022-03-07
pubmed: 2022-03-08
medline: 2022-03-08
Status:
epublish
Abstract
Machine learning (ML) algorithms may help to better understand the complex interactions among factors that influence dietary choices and behaviours. The aim of this study was to explore whether ML algorithms are more accurate than traditional statistical models in predicting vegetable and fruit (VF) consumption.
A large array of features (2,452 features from 525 variables) encompassing individual and environmental information related to dietary habits and food choices was used in a sample of 1,147 French-speaking adult men and women. Adequate VF consumption, defined as 5 servings/day or more, was measured by averaging data from three web-based 24-h recalls and used as the outcome to predict. Nine classification ML algorithms were compared with two traditional statistical predictive models, logistic regression and penalized regression (Lasso). The performance of the predictive ML algorithms was tested after the implementation of adjustments, including normalizing the data, as well as in a series of sensitivity analyses, such as using VF consumption obtained from a web-based food frequency questionnaire (wFFQ) and applying a feature selection algorithm in an attempt to reduce overfitting.
Logistic regression and Lasso predicted adequate VF consumption with an accuracy of 0.64 (95% confidence interval [CI]: 0.58-0.70) and 0.64 (95% CI: 0.60-0.68), respectively. Among the ML algorithms tested, the most accurate were the support vector machine (SVM) with either a radial basis kernel or a sigmoid kernel, both with an accuracy of 0.65 (95% CI: 0.59-0.71). The least accurate ML algorithm was the SVM with a linear kernel, with an accuracy of 0.55 (95% CI: 0.49-0.61). Using dietary intake data from the wFFQ and applying a feature selection algorithm had little to no impact on the performance of the algorithms.
In summary, ML algorithms and traditional statistical models predicted adequate VF consumption with similar accuracies among adults. These results suggest that additional research is needed to explore further the true potential of ML in predicting dietary behaviours that are determined by complex interactions among several individual, social and environmental factors.
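To make the reported quantities concrete, the following is a minimal Python sketch of how a binary "adequate VF consumption" outcome and an accuracy with a 95% confidence interval could be computed. It is an illustration only, not the authors' pipeline: the abstract does not state how the intervals were derived, so the normal-approximation (Wald) interval used here is an assumption, and the function names `adequate_vf` and `accuracy_with_wald_ci` are hypothetical.

```python
import math
from statistics import mean

# Threshold taken from the abstract: 5 servings/day or more counts as adequate.
ADEQUATE_VF_SERVINGS = 5.0


def adequate_vf(recalls):
    """Label one participant from their 24-h recalls (servings/day each).

    Returns True when the mean across recalls is at least 5 servings/day.
    """
    return mean(recalls) >= ADEQUATE_VF_SERVINGS


def accuracy_with_wald_ci(y_true, y_pred, z=1.96):
    """Return (accuracy, lower, upper) for paired true and predicted labels.

    Assumption: a normal-approximation (Wald) interval; z=1.96 gives roughly
    95% coverage. The study may have used a different interval method.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    acc = correct / n
    half_width = z * math.sqrt(acc * (1 - acc) / n)
    # Clamp to [0, 1] because the Wald interval can overshoot near the bounds.
    return acc, max(0.0, acc - half_width), min(1.0, acc + half_width)


if __name__ == "__main__":
    # Hypothetical participant with three recalls averaging 5 servings/day.
    print(adequate_vf([4, 5, 6]))
    # Hypothetical held-out set of 100 participants, 64 classified correctly.
    y_true = [True] * 100
    y_pred = [True] * 64 + [False] * 36
    print(accuracy_with_wald_ci(y_true, y_pred))
```

With a held-out set of this size, the interval is wide enough that accuracies of 0.64 and 0.65 cannot be told apart, which is consistent with the abstract's conclusion that the ML algorithms and the traditional models performed similarly.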
Identifiers
pubmed: 35252288
doi: 10.3389/fnut.2022.740898
pmc: PMC8891134
Publication types
Journal Article
Languages
eng
Pagination
740898
Copyright information
Copyright © 2022 Côté, Osseni, Brassard, Carbonneau, Robitaille, Vohl, Lemieux, Laviolette and Lamarche.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.