Stacked ensembles on basis of parentage information can predict hybrid performance with an accuracy comparable to marker-based GBLUP.

general combining ability; genomic prediction; gradient boosting; hybrid breeding; hybrid prediction; machine learning; specific combining ability; stacked ensembles

Journal

Frontiers in plant science
ISSN: 1664-462X
Abbreviated title: Front Plant Sci
Country: Switzerland
NLM ID: 101568200

Publication information

Publication date:
2023
History:
received: 3 March 2023
accepted: 26 June 2023
medline: 7 August 2023
pubmed: 7 August 2023
entrez: 7 August 2023
Status: epublish

Abstract

Testcross factorials in newly established hybrid breeding programs are often highly unbalanced, incomplete, and characterized by a predominance of specific combining ability (SCA) over general combining ability (GCA). This results in a low efficiency of GCA-based selection. Machine learning algorithms might improve prediction of hybrid performance in such testcross factorials, as they have been successfully applied to find complex underlying patterns in sparse data. Our objective was to compare the prediction accuracy of machine learning algorithms to that of GCA-based prediction and genomic best linear unbiased prediction (GBLUP) in six unbalanced incomplete factorials from hybrid breeding programs of rapeseed, wheat, and corn. We investigated a range of machine learning algorithms with three different types of predictor variables: (a) information on the parentage of hybrids, (b) in addition, the hybrid performance of crosses of the parental lines with other crossing partners, and (c) genotypic marker data. In two highly incomplete and unbalanced factorials from rapeseed, in which the SCA variance contributed considerably to the genetic variance, stacked ensembles of gradient boosting machines based on parentage information outperformed GCA prediction. In these two factorials, the stacked ensembles increased prediction accuracy over GCA prediction from 0.39 to 0.45 and from 0.48 to 0.54, respectively. Without marker data, the stacked ensembles reached prediction accuracies comparable to those of GBLUP, which requires marker data. We conclude that hybrid prediction with stacked ensembles of gradient boosting machines based on parentage information is a promising approach that is worth further investigation in other data sets in which SCA variance is high.
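
To make the GCA-based baseline and predictor type (a) concrete, the following is a minimal sketch and not the authors' implementation. It assumes that GCA effects are estimated as simple mean deviations from the overall mean (a mixed model would normally be used for unbalanced data) and that parentage is one-hot encoded. The names `fit_gca`, `predict_gca`, and `parentage_features` and the toy values are hypothetical.

```python
from collections import defaultdict


def fit_gca(observations):
    """Estimate GCA effects from (female, male, performance) tuples.

    Simplifying assumption: each parent's GCA is the mean deviation of its
    observed crosses from the overall mean. This ignores the imbalance that
    a mixed model would account for.
    """
    mu = sum(y for _, _, y in observations) / len(observations)
    dev_f, dev_m = defaultdict(list), defaultdict(list)
    for female, male, y in observations:
        dev_f[female].append(y - mu)
        dev_m[male].append(y - mu)
    gca_f = {p: sum(d) / len(d) for p, d in dev_f.items()}
    gca_m = {p: sum(d) / len(d) for p, d in dev_m.items()}
    return mu, gca_f, gca_m


def predict_gca(model, female, male):
    """Predict a hybrid as overall mean plus both parental GCA effects.

    SCA is ignored; parents without observations contribute 0.
    """
    mu, gca_f, gca_m = model
    return mu + gca_f.get(female, 0.0) + gca_m.get(male, 0.0)


def parentage_features(female, male, females, males):
    """One-hot encode the parentage of a hybrid (assumed encoding for type (a))."""
    return ([1 if female == f else 0 for f in females]
            + [1 if male == m else 0 for m in males])


# Toy incomplete factorial: the cross F2 x M2 was never tested.
observations = [("F1", "M1", 10.0), ("F1", "M2", 12.0), ("F2", "M1", 8.0)]
model = fit_gca(observations)
prediction = predict_gca(model, "F2", "M2")
features = parentage_features("F2", "M2", ["F1", "F2"], ["M1", "M2"])
```

In a setup like this, a feature vector such as `features` would be the input to a learner such as a gradient boosting machine, which can in principle capture parent-by-parent interactions (SCA) that the additive GCA prediction cannot.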

Identifiers

pubmed: 37546247
doi: 10.3389/fpls.2023.1178902
pmc: PMC10401275

Databases

Dryad
10.5061/dryad.461nc

Publication types

Journal Article

Languages

eng

Pagination

1178902

Copyright information

Copyright © 2023 Heilmann, Frisch, Abbadi, Kox and Herzog.

Conflict of interest statement

Authors AA and TK were employed by the company NPZ Innovation GmbH. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Philipp Georg Heilmann (PG)

Institute of Agronomy and Plant Breeding II, Justus Liebig University, Gießen, Germany.

Matthias Frisch (M)

Institute of Agronomy and Plant Breeding II, Justus Liebig University, Gießen, Germany.

Amine Abbadi (A)

NPZ Innovation GmbH, Holtsee, Germany.

Tobias Kox (T)

NPZ Innovation GmbH, Holtsee, Germany.

Eva Herzog (E)

Institute of Agronomy and Plant Breeding II, Justus Liebig University, Gießen, Germany.

MeSH classifications