Differences in local population history at the finest level: the case of the Estonian population.
Journal
European journal of human genetics : EJHG
ISSN: 1476-5438
Abbreviated title: Eur J Hum Genet
Country: England
NLM ID: 9302235
Publication information
Publication date:
11 2020
History:
received: 17 03 2020
accepted: 14 07 2020
revised: 24 06 2020
pubmed: 28 7 2020
medline: 16 6 2021
entrez: 27 7 2020
Status:
ppublish
Abstract
Several recent studies detected fine-scale genetic structure in human populations. Hence, groups conventionally treated as single populations harbour significant variation in terms of allele frequencies and patterns of haplotype sharing. It has been shown that these findings should be considered when performing studies of genetic associations and natural selection, especially when dealing with polygenic phenotypes. However, there is little understanding of the practical effects of such genetic structure on demography reconstructions and selection scans when focusing on recent population history. Here we tested the impact of population structure on such inferences using high-coverage (~30×) genome sequences of 2305 Estonians. We show that different regions of Estonia differ in both effective population size dynamics and signatures of natural selection. By analyzing identity-by-descent segments we also reveal that some Estonian regions exhibit evidence of a bottleneck 10-15 generations ago reflecting sequential episodes of wars, plague and famine, although this signal is virtually undetected when treating Estonia as a single population. Besides that, we provide a framework for relating effective population size estimated from genetic data to actual census size and validate it on the Estonian population. This approach may be widely used both to cross-check estimates based on historical sources as well as to get insight into times and/or regions with no other information available. Our results suggest that the history of human populations within the last few millennia can be highly region specific and cannot be properly studied without taking local genetic structure into account.
Identifiers
pubmed: 32712624
doi: 10.1038/s41431-020-0699-4
pii: 10.1038/s41431-020-0699-4
pmc: PMC7575549
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1580-1591
Grants
Agency: EC | European Regional Development Fund (Europski Fond za Regionalni Razvoj)
ID: 2014-2020.4.01.16-0024
Country: International
Agency: Eesti Teadusagentuur (Estonian Research Council)
ID: PRG243
Country: International
Agency: Wellcome Trust (Wellcome)
ID: WT104125MA
Country: International
Agency: Eesti Teadusagentuur (Estonian Research Council)
ID: PRG184
Country: International
Agency: NIDDK NIH HHS
ID: R01 DK075787
Country: United States
Agency: Eesti Teadusagentuur (Estonian Research Council)
ID: IUT20-60
Country: International
Agency: Eesti Teadusagentuur (Estonian Research Council)
ID: PUTJD817
Country: International
Agency: EC | European Regional Development Fund (Europski Fond za Regionalni Razvoj)
ID: 2014-2020.4.01.16-0271
Country: International
Agency: Eesti Teadusagentuur (Estonian Research Council)
ID: PUT1339
Country: International
Agency: EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)
ID: 810645
Country: International
Agency: Wellcome Trust
Country: United Kingdom
Agency: EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)
ID: PRESICE4Q
Country: International
Agency: EC | European Regional Development Fund (Europski Fond za Regionalni Razvoj)
ID: 2014-2020.4.01.16-0030
Country: International
Agency: EC | European Regional Development Fund (Europski Fond za Regionalni Razvoj)
ID: 2014-2020.4.01.15-0012
Country: International
Agency: Eesti Teadusagentuur (Estonian Research Council)
ID: IUT24-6
Country: International
Agency: EC | European Regional Development Fund (Europski Fond za Regionalni Razvoj)
ID: 2014-2020.4.01.16-0125
Country: International
Agency: Eesti Teadusagentuur (Estonian Research Council)
ID: MOBTP108
Country: International