Development of the Wheat Practical Haplotype Graph database as a resource for genotyping data storage and genotype imputation.
Practical Haplotype Graph
exome capture
genotype imputation
skim-seq
wheat
Journal
G3 (Bethesda, Md.)
ISSN: 2160-1836
Abbreviated title: G3 (Bethesda)
Country: England
NLM ID: 101566598
Publication information
Publication date:
04 02 2022
History:
received: 11 06 2021
accepted: 21 10 2021
pubmed: 10 11 2021
medline: 9 3 2022
entrez: 9 11 2021
Status:
ppublish
Abstract
To improve the efficiency of high-density genotype data storage and imputation in bread wheat (Triticum aestivum L.), we applied the Practical Haplotype Graph (PHG) tool. The Wheat PHG database was built using whole-exome capture sequencing data from a diverse set of 65 wheat accessions. Population haplotypes were inferred for the reference genome intervals defined by the boundaries of the high-quality gene models. Missing genotypes in the inference panels, composed of wheat cultivars or recombinant inbred lines genotyped by exome capture, genotyping-by-sequencing (GBS), or whole-genome skim-seq sequencing approaches, were imputed using the Wheat PHG database. Although imputation accuracy varied with the sequencing method and coverage depth, we found 92% imputation accuracy with 0.01× sequence coverage, which was slightly lower than the accuracy obtained with 0.5× sequence coverage (96.6%). Compared to Beagle, PHG imputation was on average ~3.5% (P-value < 2 × 10⁻¹⁴) more accurate, and showed 27% higher accuracy at imputing a rare haplotype introgressed from a wild relative into wheat. We found reduced imputation accuracy with independent 2× GBS data (88.6%), which increased to 89.2% with the inclusion of parental haplotypes in the database. The accuracy reduction with GBS is likely associated with the small overlap between GBS markers and the exome capture dataset that was used to construct the PHG. The highest imputation accuracy was obtained with exome capture for the wheat D genome, which also showed the highest levels of linkage disequilibrium and the highest proportion of identity-by-descent regions among accessions in the PHG database. We demonstrate that genetic mapping based on genotypes imputed using PHG identifies SNPs with a broader range of effect sizes that together explain a higher proportion of genetic variance for heading date and meiotic crossover rate compared to previous studies.
Identifiers
pubmed: 34751373
pii: 6423995
doi: 10.1093/g3journal/jkab390
pmc: PMC9210282
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Copyright information
© The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.