Evaluating genotype imputation pipeline for ultra-low coverage ancient genomes.
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date: 2020-10-29
History:
received: 2020-05-13
accepted: 2020-10-12
entrez: 2020-10-30
pubmed: 2020-10-31
medline: 2021-03-05
Status:
epublish
Abstract
Although ancient DNA data have become increasingly important in studies of past populations, it is often not feasible or practical to obtain high-coverage genomes from poorly preserved samples. While accurate genotype imputation from >1× coverage data has recently become routine, a large proportion of ancient samples remain unusable for downstream analyses because of their low coverage. Here, we evaluate a two-step pipeline for the imputation of common variants in ancient genomes at 0.05-1× coverage. We use the genotype likelihood input mode in Beagle and filter for confident genotypes, which then serve as the input for imputing the missing genotypes. This procedure, when tested on ancient genomes, outperforms single-step imputation from genotype likelihoods, suggesting that current genotype callers do not fully account for errors in ancient sequences and that additional quality controls can be beneficial. We compared the effects of various genotype likelihood calling methods; post-calling, pre-imputation and post-imputation filters; different reference panels; and different imputation tools. In a Neolithic Hungarian genome, we obtain ~90% imputation accuracy for heterozygous common variants at 0.05× coverage and >97% accuracy at 0.5× coverage. We show that imputation can mitigate, though not eliminate, reference bias in ultra-low coverage ancient genomes.
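The filtering step between the two imputation passes can be illustrated with a minimal sketch. It assumes that the first pass yields, for each biallelic site, three genotype probabilities ordered 0/0, 0/1, 1/1, and that a genotype is kept only when its probability reaches a cutoff. The function names and the default cutoff `min_gp=0.99` are illustrative assumptions, not values taken from the abstract.

```python
from typing import List, Optional, Sequence, Tuple

# A genotype is a pair of allele indices; None marks a call left missing.
Genotype = Optional[Tuple[int, int]]

# Diploid genotypes in the assumed order of the probability triplet: 0/0, 0/1, 1/1.
GENOTYPES = ((0, 0), (0, 1), (1, 1))


def confident_genotype(gp: Sequence[float], min_gp: float = 0.99) -> Genotype:
    """Return the most probable genotype if its probability reaches min_gp, else None.

    min_gp is a hypothetical cutoff chosen for illustration only.
    """
    if len(gp) != 3:
        raise ValueError("expected three genotype probabilities")
    best = max(range(3), key=lambda i: gp[i])
    return GENOTYPES[best] if gp[best] >= min_gp else None


def filter_sites(site_gps: Sequence[Sequence[float]], min_gp: float = 0.99) -> List[Genotype]:
    """Apply the confidence filter to every site; None entries are left for imputation."""
    return [confident_genotype(gp, min_gp) for gp in site_gps]


calls = filter_sites([(0.995, 0.004, 0.001), (0.40, 0.55, 0.05), (0.0, 0.002, 0.998)])
print(calls)  # [(0, 0), None, (1, 1)]
```

In this sketch the second site is set to missing because no genotype reaches the cutoff. In a pipeline of the kind described above, such sites would be filled in by the second imputation step.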
Identifiers
pubmed: 33122697
doi: 10.1038/s41598-020-75387-w
pii: 10.1038/s41598-020-75387-w
pmc: PMC7596702
Chemical substances
DNA, Ancient
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
18542
Grants
Agency: Wellcome Trust
Country: United Kingdom
Agency: Wellcome Trust
ID: 200368/Z/15/Z
Country: United Kingdom
Agency: Wellcome Trust
ID: 2000368/Z/15/Z
Country: United Kingdom