Sequence imputation from low density single nucleotide polymorphism panel in a black poplar breeding population.
Genotype Imputation
Low density arrays
Populus nigra
Whole-Genome Resequencing
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
18 Apr 2019
History:
received: 10 Aug 2018
accepted: 29 Mar 2019
entrez: 20 Apr 2019
pubmed: 20 Apr 2019
medline: 3 Aug 2019
Status:
epublish
Abstract
Genomic selection accuracy increases with high SNP (single nucleotide polymorphism) coverage. However, such gains in coverage come at a high cost, which prevents breeders from adopting them promptly in operational programs. Low-density panels imputed to higher densities offer a cheaper alternative during the first stages of genomic resource development. Our study is the first to explore imputation in a tree species: black poplar. About 1000 pure-breed Populus nigra trees from a breeding population were selected and genotyped with a 12K custom Infinium BeadChip. Forty-three of those individuals, corresponding to nodal trees in the pedigree, were fully sequenced (reference), while the remaining majority (target) was imputed from 8K to 1.4 million SNPs using FImpute. Each SNP and each individual was evaluated for imputation errors by leave-one-out cross-validation in the training sample of 43 sequenced trees. Summary statistics were calculated, including the Hardy-Weinberg equilibrium exact test p-value, sequencing quality, sequencing depth per site and per individual, minor allele frequency, marker density ratio, and SNP information redundancy. Principal component and Boruta analyses were applied to all these parameters to rank the factors affecting imputation quality. Additionally, we characterized the impact of relatedness between the reference and target populations.
During the imputation process, we used 7540 SNPs from the chip to impute 1,438,827 SNPs from sequences. At the individual level, imputation accuracy was high, with the proportion of correctly imputed SNPs ranging from 0.84 to 0.99. The variation in accuracy was mostly due to differences in relatedness between individuals. At the SNP level, imputation quality depended on genotyped SNP density and on the original minor allele frequency. Imputation did not appear to increase linkage disequilibrium. Genotype densification gave a better distribution of markers along the genome, and we did not detect any substantial bias in annotation categories.
This study shows that low-density marker panels can be imputed to whole-genome sequence with good accuracy under certain conditions that could be common to many breeding populations.
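The accuracy figures in the abstract (proportion of SNPs correctly imputed per individual, and imputation quality per SNP in relation to minor allele frequency) can be illustrated with a minimal Python sketch. It covers only the scoring step that would follow a leave-one-out run. The imputation itself is done by FImpute in the study and is not reproduced here. The function names, the 0/1/2 allele-count coding, and the toy matrices are assumptions made for illustration and are not taken from the paper.

```python
from typing import List

# Assumed coding: rows = individuals, columns = SNPs,
# values = count of the alternate allele (0, 1 or 2).
Genotypes = List[List[int]]


def per_individual_accuracy(true: Genotypes, imputed: Genotypes) -> List[float]:
    """Proportion of SNPs whose imputed genotype matches the true one, per individual."""
    return [
        sum(t == i for t, i in zip(true_row, imputed_row)) / len(true_row)
        for true_row, imputed_row in zip(true, imputed)
    ]


def per_snp_accuracy(true: Genotypes, imputed: Genotypes) -> List[float]:
    """Proportion of individuals whose imputed genotype matches the true one, per SNP."""
    n_individuals = len(true)
    return [
        sum(true[k][j] == imputed[k][j] for k in range(n_individuals)) / n_individuals
        for j in range(len(true[0]))
    ]


def minor_allele_frequency(true: Genotypes) -> List[float]:
    """Minor allele frequency per SNP, computed from the true (sequenced) genotypes."""
    n_individuals = len(true)
    mafs = []
    for j in range(len(true[0])):
        alt_freq = sum(row[j] for row in true) / (2 * n_individuals)
        mafs.append(min(alt_freq, 1.0 - alt_freq))
    return mafs


# Hypothetical toy data: 3 sequenced trees, 4 SNPs.
# In a leave-one-out scheme, each row of `imputed_genotypes` would come from
# masking that tree down to the chip SNPs and imputing it from the other trees.
true_genotypes = [[0, 1, 2, 0], [1, 1, 0, 2], [2, 0, 1, 1]]
imputed_genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 1, 1, 1]]

individual_acc = per_individual_accuracy(true_genotypes, imputed_genotypes)
snp_acc = per_snp_accuracy(true_genotypes, imputed_genotypes)
maf = minor_allele_frequency(true_genotypes)
```

Pairing `snp_acc` with `maf` SNP by SNP is one simple way to examine how imputation quality varies with minor allele frequency. The study itself ranks such factors with principal component and Boruta analyses, which this sketch does not attempt.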
Identifiers
pubmed: 30999856
doi: 10.1186/s12864-019-5660-y
pii: 10.1186/s12864-019-5660-y
pmc: PMC6471894
Publication types
Journal Article
Languages
eng
Pagination
302
Grants
Agency: EU Noveltree
ID: FP7 - 211868
Agency: EU Evoltree
ID: FP6-16322