Tailoring the Nutritional Composition of Italian Foods to the US Nutrition5k Dataset for Food Image Recognition: Challenges and a Comparative Analysis.

Italy; Nutritive Value; United States; Humans; Databases, Factual; Food Analysis / methods; Machine Learning; Food; Nutrients / analysis

database harmonization; dish images; food composition database; food matching; manual data curation; missing imputation; nutrition; nutritional composition of foods; "Nutrition5k" dataset

Journal

Nutrients

ISSN: 2072-6643

Abbreviated title: Nutrients

Country: Switzerland

NLM ID: 101521595

Publication information

Publication date:
01 Oct 2024

History:

received: 01 Aug 2024

revised: 23 Sep 2024

accepted: 27 Sep 2024

medline: 16 Oct 2024

pubmed: 16 Oct 2024

entrez: 16 Oct 2024

Status: epublish

Abstract

BACKGROUND: Training machine learning algorithms on dish images collected in other countries requires tackling possible sources of systematic discrepancies, including country-specific food composition databases (FCDBs). The US Nutrition5k project provides ~5000 dish images with related dish- and ingredient-level information on mass, energy, and macronutrients from the US FCDB. The aim of this study is to (1) identify challenges and solutions in linking the nutritional composition of Italian foods with food images from Nutrition5k and (2) assess potential differences in nutrient content estimated across the Italian and US FCDBs, and their determinants.

METHODS: After food matching, expert data curation, and handling of missing values, dish-level ingredients from Nutrition5k were integrated with the Italian-FCDB-specific nutritional composition (86 components); dish-specific nutrient content was calculated by summing the corresponding ingredient-specific nutritional values. Measures of agreement/difference were calculated between Italian- and US-FCDB-specific content of energy and macronutrients. Potential determinants of identified differences were investigated with multiple robust regression models.

RESULTS: Dishes had a median mass of 145 g and a median of three ingredients. Energy, proteins, fats, and carbohydrates showed moderate-to-strong agreement between Italian- and US-FCDB-specific content; carbohydrates showed the lowest agreement, with the Italian FCDB providing smaller median values (median raw difference between the Italian and US FCDBs: -2.10 g). Regression models on dishes suggested a role for mass, number of ingredients, and presence of recreated recipes, alone or jointly with differential use of raw/cooked ingredients across the two FCDBs.

CONCLUSIONS: In the era of machine learning approaches for food image recognition, manual data curation in the alignment of FCDBs is worth the effort.
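The dish-level calculation described in the abstract (summing ingredient-specific values, then comparing the two FCDBs) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the ingredient names, the dish compositions, the carbohydrate values, and the per-100 g convention are hypothetical and are not taken from the Italian FCDB, the US FCDB, or Nutrition5k. It is not the authors' code.

```python
from statistics import median

# Hypothetical carbohydrate content (g per 100 g of ingredient).
# These numbers are invented for illustration, not taken from any FCDB.
CARBS_PER_100G = {
    "italian": {"rice_cooked": 24.0, "chicken_breast": 0.0, "olive_oil": 0.0},
    "us": {"rice_cooked": 28.0, "chicken_breast": 0.0, "olive_oil": 0.0},
}

# Hypothetical dishes: ingredient name -> mass in grams.
DISHES = {
    "dish_a": {"rice_cooked": 100.0, "chicken_breast": 50.0},
    "dish_b": {"rice_cooked": 50.0, "olive_oil": 10.0},
    "dish_c": {"chicken_breast": 120.0},
}


def dish_content(ingredients, per_100g):
    """Sum ingredient-level amounts: mass (g) * value per 100 g / 100."""
    return sum(mass * per_100g[name] / 100.0 for name, mass in ingredients.items())


def raw_differences(dishes, fcdb_a, fcdb_b):
    """Per-dish raw difference (fcdb_a minus fcdb_b) in dish-level content."""
    return {
        dish: dish_content(ing, fcdb_a) - dish_content(ing, fcdb_b)
        for dish, ing in dishes.items()
    }


# Italian minus US, so a negative value means the Italian estimate is smaller.
diffs = raw_differences(DISHES, CARBS_PER_100G["italian"], CARBS_PER_100G["us"])
median_diff = median(diffs.values())
print(diffs, median_diff)
```

The sign convention (Italian minus US) is chosen so that a negative median corresponds to smaller Italian values, matching how the abstract reports the carbohydrate difference.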

Identifiers

DOI: 10.3390/nu16193339

PMID: 39408306

PII: nu16193339

Publication types

Journal Article; Comparative Study

Languages

eng

Grants

Agency: Ministero dell'Istruzione e del Merito

ID: PRIN 20227YCB5P

Authors

Rachele Bianco (R)

Michela Marinoni (M)

Sergio Coluccia (S)

Giulia Carioni (G)

Federica Fiori (F)

Patrizia Gnagnarella (P)

Valeria Edefonti (V)

Maria Parpinel (M)
