Explainable artificial intelligence for genotype-to-phenotype prediction in plant breeding: a case study with a dataset from an almond germplasm collection.

Keywords: almond; explainable artificial intelligence; genotype-phenotype prediction; machine learning; shelling fraction

Journal

Frontiers in Plant Science
ISSN: 1664-462X
Abbreviated title: Front Plant Sci
Country: Switzerland
NLM ID: 101568200

Publication information

Publication date:
2024
History:
received: 2024-05-17
accepted: 2024-08-13
medline: 2024-09-25
pubmed: 2024-09-25
entrez: 2024-09-25
Status: epublish

Abstract

Background
Advances in DNA sequencing have revolutionized plant genomics and contributed significantly to the study of genetic diversity. However, accurately predicting phenotypes from high-dimensional genomic data remains a challenge in plant breeding, particularly when it comes to identifying the key genetic factors that influence the predictions. This study aims to bridge this gap by integrating explainable artificial intelligence (XAI) techniques with advanced machine learning (ML) models. This approach is intended to enhance both the predictive accuracy and the interpretability of genotype-to-phenotype models, thereby improving their reliability and supporting more informed breeding decisions.
Results
This study compares several ML methods for genotype-to-phenotype prediction, using data available from an almond germplasm collection. After preprocessing and feature selection, regression models are employed to predict almond shelling fraction. The best predictions were obtained with the Random Forest method (correlation = 0.727 ± 0.020, an
Conclusions
Employing explainable artificial intelligence algorithms enhances model interpretability, identifying genetic polymorphisms associated with the shelling percentage. These findings underscore XAI's efficacy in predicting phenotypic traits from genomic data, highlighting its significance in optimizing crop production for sustainable agriculture.
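The workflow summarized in the abstract (feature selection, Random Forest regression, a correlation score reported as mean ± standard deviation, then an explainability step) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the synthetic 0/1/2 marker matrix, the three hypothetical causal markers, the univariate F-test selector, the 5-fold split, and every hyperparameter. The record does not name the XAI algorithm that was used, so scikit-learn's permutation importance stands in as a generic, model-agnostic substitute.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.inspection import permutation_importance
from sklearn.model_selection import KFold, train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for a genotype matrix (assumption, not the almond data):
# 200 accessions x 100 markers, each coded as 0, 1 or 2.
n_samples, n_snps = 200, 100
X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)

# Hypothetical trait driven by three "causal" markers plus Gaussian noise.
causal = [5, 42, 77]
effects = np.array([1.5, -1.2, 1.0])
y = X[:, causal] @ effects + rng.normal(0.0, 0.5, n_samples)


def make_model():
    # Feature selection lives inside the pipeline so that it is refit on
    # each training split and never sees the corresponding test samples.
    return make_pipeline(
        SelectKBest(f_regression, k=20),
        RandomForestRegressor(n_estimators=50, random_state=0),
    )


# 5-fold cross-validation: Pearson correlation between observed and
# predicted trait values in each test fold.
fold_r = []
cv = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X):
    model = make_model().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    fold_r.append(np.corrcoef(y[test_idx], pred)[0, 1])

mean_r, std_r = float(np.mean(fold_r)), float(np.std(fold_r))
print(f"correlation = {mean_r:.3f} ± {std_r:.3f}")

# Explainability step: permutation importance on a held-out split, i.e. the
# drop in R^2 when a single marker column is shuffled.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
final_model = make_model().fit(X_tr, y_tr)
imp = permutation_importance(final_model, X_te, y_te, n_repeats=5, random_state=0)
top_snps = np.argsort(imp.importances_mean)[::-1][:3]
print("highest-ranked markers:", top_snps.tolist())
```

Keeping the selector inside the pipeline is a deliberate choice: selecting markers on the full dataset before splitting would leak test information and inflate the reported correlation. Markers discarded by the selector receive an importance of exactly zero, because shuffling them cannot change the prediction.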

Identifiers

pubmed: 39319003
doi: 10.3389/fpls.2024.1434229
pmc: PMC11420924

Publication types

Journal Article

Languages

eng

Pagination

1434229

Copyright information

Copyright © 2024 Novielli, Romano, Pavan, Losciale, Stellacci, Diacono, Bellotti and Tangaro.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The author(s) declared that they were an editorial board member of Frontiers at the time of submission. This had no impact on the peer review process or the final decision.

Authors

Pierfrancesco Novielli (P)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.
Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.

Donato Romano (D)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.
Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.

Stefano Pavan (S)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Pasquale Losciale (P)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Anna Maria Stellacci (AM)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Domenico Diacono (D)

Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.

Roberto Bellotti (R)

Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.
Dipartimento Interateneo di Fisica "M. Merlin", Università degli Studi di Bari Aldo Moro, Bari, Italy.

Sabina Tangaro (S)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.
Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.

MeSH classifications