Understanding admixture fractions: theory and estimation of gene-flow.


Journal

Journal of mathematical biology
ISSN: 1432-1416
Abbreviated title: J Math Biol
Country: Germany
NLM ID: 7502105

Publication information

Publication date:
04 Oct 2024
History:
received: 09 12 2023
accepted: 17 09 2024
revised: 21 08 2024
medline: 4 10 2024
pubmed: 4 10 2024
entrez: 3 10 2024
Status: epublish

Abstract

Estimation of admixture proportions has become one of the most commonly used computational tools in population genomics. However, there is remarkably little population genetic theory on statistical properties of these variables. We develop theoretical results that can accurately predict means and variances of admixture proportions within a population using models with recombination and genetic drift. Based on established theory on measures of multilocus disequilibrium, we show that there is a set of recurrence relations that can be used to derive expectations for higher moments of the admixture proportions distribution. We obtain closed form solutions for some special cases. Using these results, we develop a method for estimating admixture parameters from estimated admixture proportions obtained from programs such as Structure or Admixture. We apply this method to HapMap 3 data and find that the population history of African Americans, as expected, is not best explained by a single admixture event between people of European and African ancestry. The model of constant gene flow starting at 8 generations and ending at 2 generations before present gives the best fit.
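
To make the best-fitting scenario concrete, the following minimal sketch computes the expected admixture fraction under constant gene flow from 8 to 2 generations before present. It is not the authors' method: it uses the standard one-step recurrence for the mean, h' = (1 - m) h + m, and covers only the mean, not the variance or the higher moments treated in the paper. The per-generation migration rate `m`, the function name, and the choice to treat both endpoint generations as inclusive are assumptions made for illustration.

```python
def mean_admixture_fraction(m, start=8, end=2):
    """Expected ancestry fraction from the migrant source population.

    Assumption: in every generation from `start` down to `end`
    (generations before present, both inclusive) a fraction `m` of the
    admixed population is replaced by migrants from the source.
    Drift and recombination do not change the mean, so it stays
    constant once gene flow stops.
    """
    h = 0.0  # no source ancestry before gene flow begins
    for generation in range(start, 0, -1):
        if generation >= end:
            # One generation of gene flow at rate m.
            h = (1.0 - m) * h + m
    return h


if __name__ == "__main__":
    # m = 0.03 is a hypothetical rate, not an estimate from the paper.
    print(mean_admixture_fraction(0.03))
```

With the default endpoints there are seven generations of gene flow, so the result equals the closed form 1 - (1 - m)^7.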

Identifiers

pubmed: 39363040
doi: 10.1007/s00285-024-02146-0
pii: 10.1007/s00285-024-02146-0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

47

Grants

Agency: Russian Science Foundation
ID: 22-71-10056
Agency: NIH HHS
ID: R01GM138634
Country: United States

Copyright information

© 2024. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.


Authors

Mason Liang (M)

Department of Integrative Biology, University of California, Valley Life Sciences Building, Berkeley, CA, 94720, USA.
Department of Statistics, University of California, Evans Hall, Berkeley, CA, 94720, USA.

Mikhail Shishkin (M)

International Laboratory of Statistical and Computational Genomics, Faculty of Computer Science, HSE University, Pokrovskiy Boulevard, 11, Moscow, Russian Federation, 109028.

Vladimir Shchur (V)

International Laboratory of Statistical and Computational Genomics, Faculty of Computer Science, HSE University, Pokrovskiy Boulevard, 11, Moscow, Russian Federation, 109028. vshchur@hse.ru.

Rasmus Nielsen (R)

Department of Integrative Biology, University of California, Valley Life Sciences Building, Berkeley, CA, 94720, USA. rasmus_nielsen@berkeley.edu.
Department of Statistics, University of California, Evans Hall, Berkeley, CA, 94720, USA. rasmus_nielsen@berkeley.edu.
Globe Institute, University of Copenhagen, Oester Voldgade 5-7, 1350, Copenhagen K, Denmark. rasmus_nielsen@berkeley.edu.
