L2,1-norm regularized multivariate regression model with applications to genomic prediction.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
29 09 2021
History:
received: 16 06 2020
revised: 16 03 2021
accepted: 26 03 2021
pubmed: 29 3 2021
medline: 2 2 2023
entrez: 28 3 2021
Status: ppublish

Abstract

Genomic selection (GS) is currently deemed the most effective approach to speed up breeding of agricultural varieties. It has been recognized that considering multiple traits in GS can improve prediction accuracy for traits of low heritability. However, since GS forgoes statistical testing in order to improve predictions, it does not facilitate mechanistic understanding of the contribution of particular single nucleotide polymorphisms (SNPs). Here, we propose an L2,1-norm regularized multivariate regression model and devise a fast and efficient iterative optimization algorithm, called L2,1-joint, applicable in multi-trait GS. The use of the L2,1-norm facilitates variable selection in a penalized multivariate regression that considers the relation between individuals when the number of SNPs is much larger than the number of individuals. The capacity for variable selection allows us to define master regulators that can be used in a multi-trait GS setting to dissect the genetic architecture of the analyzed traits. Our comparative analyses demonstrate that the proposed model is a favorable candidate compared to existing state-of-the-art approaches. Prediction and variable selection with datasets from Brassica napus, wheat and Arabidopsis thaliana diversity panels are conducted to further showcase the performance of the proposed model.

Availability and implementation: The model is implemented in the R programming language and the code is freely available from https://github.com/alainmbebi/L21-norm-GS. Supplementary data are available at Bioinformatics online.
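To illustrate why an L2,1-norm penalty leads to variable selection, here is a minimal sketch in plain Python. It is not the authors' L2,1-joint algorithm (their implementation is the R code at the repository above). It only shows the standard definition of the L2,1-norm of a coefficient matrix (rows = SNPs, columns = traits) and the row-wise soft-thresholding step commonly associated with this penalty. The function names, the toy matrix, and the threshold `tau` are assumptions made for illustration.

```python
import math

def l21_norm(B):
    """L2,1-norm: sum over rows of the Euclidean norm of each row."""
    return sum(math.sqrt(sum(b * b for b in row)) for row in B)

def row_soft_threshold(B, tau):
    """Shrink every row's Euclidean norm by tau; rows with norm <= tau become zero.

    This is the proximal operator of tau * ||B||_{2,1}. Because a whole row is
    zeroed at once, a SNP is dropped for all traits jointly.
    """
    out = []
    for row in B:
        norm = math.sqrt(sum(b * b for b in row))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out.append([scale * b for b in row])
    return out

# Hypothetical coefficient matrix: 3 SNPs (rows) x 2 traits (columns).
B = [[3.0, 4.0],
     [0.3, 0.4],
     [0.0, 0.0]]

shrunk = row_soft_threshold(B, tau=1.0)

# Indices of SNPs whose coefficient row is still nonzero after shrinkage.
selected = [i for i, row in enumerate(shrunk) if any(b != 0.0 for b in row)]
```

In this toy example only the first SNP, whose effects are large across both traits, survives the shrinkage. This row-wise sparsity is the sense in which the penalty selects SNPs shared across traits.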

Identifiers

pubmed: 33774677
pii: 6198100
doi: 10.1093/bioinformatics/btab212
pmc: PMC8479665

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

2896-2904

Grants

Agency: European Union's Horizon 2020 research and innovation programme
Agency: BREEDCAFS
ID: 727934
Agency: PlantaSYST
ID: 664620

Copyright information

© The Author(s) 2021. Published by Oxford University Press.

Authors

Alain J Mbebi (AJ)

Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
Bioinformatics Group, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam-Golm, Germany.

Hao Tong (H)

Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
Bioinformatics Group, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam-Golm, Germany.
Center for Plant Systems Biology and Biotechnology, Ruski 139, 4000 Tsentar, Plovdiv, Bulgaria.

Zoran Nikoloski (Z)

Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
Bioinformatics Group, Institute of Biochemistry and Biology, University of Potsdam, 14476 Potsdam-Golm, Germany.
Center for Plant Systems Biology and Biotechnology, Ruski 139, 4000 Tsentar, Plovdiv, Bulgaria.
