Preparation and Curation of Multiyear, Multilocation, Multitrait Datasets.

Adjusted phenotype per trial; Analysis of residuals; Combined phenotype across trials; Descriptive statistics; Design diagnostics; Experimental design; Genotype × environment; Genotype–phenotype association; Linear mixed model; Multienvironment trials; Outliers; Raw phenotype per trial

Journal

Methods in molecular biology (Clifton, N.J.)
ISSN: 1940-6029
Abbreviated title: Methods Mol Biol
Country: United States
NLM ID: 9214969

Publication information

Publication date:
2022
History:
entrez: 31 May 2022
pubmed: 1 June 2022
medline: 3 June 2022
Status: ppublish

Abstract

Genome-wide association studies (GWAS) are a powerful approach to dissect genotype-phenotype associations and identify causative regions. However, this power depends heavily on the accuracy of the phenotypic data. To obtain accurate phenotypic values, phenotyping should be carried out through multienvironment trials (METs). To avoid technical errors, sufficient time must be spent exploring, understanding, curating, and adjusting the phenotypic data in each trial before combining them using an appropriate linear mixed model (LMM). The LMM is chosen to minimize, as far as possible, any effect that could lead to misestimation of the phenotypic values. The purpose of this chapter is to explain a series of important steps for exploring and analyzing data from METs used to characterize an association panel. Two datasets are used to illustrate two different scenarios.
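The workflow outlined in the abstract (curate each trial, then combine across trials) can be illustrated with a minimal sketch. It is not the chapter's method: plain means stand in for the linear mixed model, and the toy records, the function names, and the 1.5 × IQR outlier rule are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, quantiles

# Hypothetical toy records: (trial, genotype, phenotype value).
RECORDS = [
    ("T1", "G1", 5.1), ("T1", "G1", 5.3),
    ("T1", "G2", 6.0), ("T1", "G2", 6.2),
    ("T1", "G3", 4.8), ("T1", "G3", 15.0),  # 15.0 is a planted outlier
    ("T2", "G1", 7.0), ("T2", "G1", 7.2),
    ("T2", "G2", 8.1), ("T2", "G2", 7.9),
    ("T2", "G3", 6.6), ("T2", "G3", 6.8),
]


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def curate_by_trial(records):
    """Drop values that fall outside the fences computed within each trial."""
    by_trial = defaultdict(list)
    for trial, _, value in records:
        by_trial[trial].append(value)
    fences = {t: tukey_fences(v) for t, v in by_trial.items()}
    return [r for r in records if fences[r[0]][0] <= r[2] <= fences[r[0]][1]]


def genotype_means_per_trial(records):
    """Stage 1: mean phenotype of each genotype within each trial."""
    cells = defaultdict(list)
    for trial, genotype, value in records:
        cells[(trial, genotype)].append(value)
    return {key: mean(vals) for key, vals in cells.items()}


def combine_across_trials(stage1):
    """Stage 2 (simplified): subtract each trial's mean, average the
    deviations per genotype, then add back the grand mean.
    A real analysis would fit a linear mixed model here instead."""
    trial_vals = defaultdict(list)
    for (trial, _), m in stage1.items():
        trial_vals[trial].append(m)
    trial_mean = {t: mean(v) for t, v in trial_vals.items()}
    grand = mean(list(trial_mean.values()))
    deviations = defaultdict(list)
    for (trial, genotype), m in stage1.items():
        deviations[genotype].append(m - trial_mean[trial])
    return {g: grand + mean(d) for g, d in deviations.items()}


clean = curate_by_trial(RECORDS)
stage1 = genotype_means_per_trial(clean)
combined = combine_across_trials(stage1)
```

Centering by trial before averaging keeps a high-yielding or low-yielding environment from dominating the combined value. In practice a mixed model would also handle unbalanced data, experimental design terms, and genotype × environment interaction, none of which this sketch attempts.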

Identifiers

pubmed: 35641760
doi: 10.1007/978-1-0716-2237-7_6

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

83-104

Copyright information

© 2022. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.

Authors

Amina Abed (A)

Consortium de recherche sur la pomme de terre du Québec (CRPTQ), Québec, Canada. aminaabed@yahoo.fr.

Zakaria Kehel (Z)

International Center for Agricultural Research in the Dry Areas (ICARDA), Rabat, Morocco. Z.Kehel@cgiar.org.
