Sparse kernel models provide optimization of training set design for genomic prediction in multiyear wheat breeding data.


Journal

The plant genome
ISSN: 1940-3372
Abbreviated title: Plant Genome
Country: United States
NLM ID: 101273919

Publication information

Publication date:
December 2022
History:
received: 19 December 2021
accepted: 17 July 2022
pubmed: 1 September 2022
medline: 16 December 2022
entrez: 31 August 2022
Status: ppublish

Abstract

The success of genomic selection (GS) in breeding schemes relies on its ability to provide accurate predictions of unobserved lines at early stages. Multigeneration data provide opportunities to increase the training data size and thus the likelihood of extracting useful information from ancestors to improve prediction accuracy. Genomic best linear unbiased predictions (GBLUPs) are obtained by borrowing information through kinship relationships between individuals. Multigeneration data usually become heterogeneous, with complex family relationship patterns that are increasingly entangled with each generation. Under these conditions, historical data may not be optimal for model training because accuracy could be compromised. The sparse selection index (SSI) is a method for training set (TRN) optimization in which training individuals contribute to the predictions of some but not all predicted subjects. We added a trimming process to the original SSI (trimmed SSI) to remove training individuals that are less important for prediction. Using a large multigeneration (8 yr) wheat (Triticum aestivum L.) grain yield dataset (n = 68,836), we found increases in accuracy as more years were included in the TRN, with improvements of ∼0.05 in the GBLUP accuracy when using 5 yr of historical data relative to using only 1 yr. The SSI method showed a small gain over the GBLUP accuracy but with a substantial reduction in the TRN size. These reduced TRNs were formed with a similar number of subjects from each training generation. Our results suggest that the SSI provides a more stable ranking of genotypes than the GBLUP as the TRN becomes larger.
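
The contrast between GBLUP and the SSI described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a standardized marker-based relationship matrix, a fixed heritability `h2`, and an L1 penalty `lam` on the per-subject training weights, solved by a simple coordinate descent. All function and parameter names are hypothetical, and the trimming step of the trimmed SSI is not shown. With `lam = 0` every training individual contributes to every prediction (GBLUP); with `lam > 0` some weights become exactly zero, so each predicted subject uses only a subset of the TRN.

```python
import numpy as np

def genomic_relationship(X):
    """G = ZZ'/p, with Z the column-centered and scaled marker matrix."""
    Z = X - X.mean(axis=0)
    sd = Z.std(axis=0)
    Z = Z[:, sd > 0] / sd[sd > 0]      # drop monomorphic markers
    return Z @ Z.T / Z.shape[1]

def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become 0."""
    return np.sign(x) * max(abs(x) - t, 0.0)

def ssi_weights(A, g, lam, n_iter=500, tol=1e-8):
    """Coordinate descent for min_b 0.5*b'Ab - g'b + lam*sum(|b|)."""
    b = np.zeros(len(g))
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(len(g)):
            # Partial residual that excludes the j-th weight
            r = g[j] - A[j] @ b + A[j, j] * b[j]
            new = soft_threshold(r, lam) / A[j, j]
            max_change = max(max_change, abs(new - b[j]))
            b[j] = new
        if max_change < tol:
            break
    return b

def predict(G, y_trn, trn, tst, h2=0.5, lam=0.0):
    """Predict test subjects; lam=0 gives GBLUP, lam>0 gives sparse weights."""
    # A = G_trn + (sigma_e^2 / sigma_u^2) * I, with the ratio set from h2 (assumed known)
    A = G[np.ix_(trn, trn)] + ((1 - h2) / h2) * np.eye(len(trn))
    mu = y_trn.mean()
    # One weight vector per predicted subject (rows of B)
    B = np.array([ssi_weights(A, G[trn, i], lam) for i in tst])
    return mu + B @ (y_trn - mu), B
```

In this sketch, the number of nonzero entries in each row of `B` is the size of the training subset supporting that subject's prediction. In practice `lam` would be chosen by cross-validation rather than fixed in advance.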

Identifiers

pubmed: 36043341
doi: 10.1002/tpg2.20254

Publication types

Journal Article
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e20254

Copyright information

© 2022 International Maize and Wheat Improvement Center (CIMMYT). The Plant Genome published by Wiley Periodicals LLC on behalf of Crop Science Society of America.

Authors

Marco Lopez-Cruz (M)

Dep. of Epidemiology and Biostatistics, Michigan State Univ., East Lansing, MI, USA.

Susanne Dreisigacker (S)

Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.

Leonardo Crespo-Herrera (L)

Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.

Alison R Bentley (AR)

Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.

Ravi Singh (R)

Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.

Jesse Poland (J)

Dep. of Agronomy, Kansas State Univ., Manhattan, KS, USA.

Sandesh Shrestha (S)

Dep. of Agronomy, Kansas State Univ., Manhattan, KS, USA.

Julio Huerta-Espino (J)

Campo Experimental Valle de Mexico, Instituto Nacional de Investigaciones Forestales, Agricolas y Pecuarias (INIFAP), Chapingo, Mexico.

Velu Govindan (V)

Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.

Philomin Juliana (P)

Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.

Suchismita Mondal (S)

Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.

Paulino Pérez-Rodríguez (P)

Colegio de Postgraduados, Montecillos, Mexico.

Jose Crossa (J)

Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.
Colegio de Postgraduados, Montecillos, Mexico.
