Indigenous Australian genomes show deep structure and rich novel variation.
Journal
Nature
ISSN: 1476-4687
Abbreviated title: Nature
Country: England
NLM ID: 0410462
Publication information
Publication date:
13 Dec 2023
History:
received: 29 Nov 2022
accepted: 03 Nov 2023
medline: 14 Dec 2023
pubmed: 14 Dec 2023
entrez: 13 Dec 2023
Status:
aheadofprint
Abstract
The Indigenous peoples of Australia have a rich linguistic and cultural history. How this relates to genetic diversity remains largely unknown because of their limited engagement with genomic studies. Here we analyse the genomes of 159 individuals from four remote Indigenous communities, including people who speak a language (Tiwi) not from the most widespread family (Pama-Nyungan). This large collection of Indigenous Australian genomes was made possible by careful community engagement and consultation. We observe exceptionally strong population structure across Australia, driven by divergence times between communities of 26,000-35,000 years ago and long-term low but stable effective population sizes. This demographic history, including early divergence from Papua New Guinean (47,000 years ago) and Eurasian groups
Identifiers
pubmed: 38093005
doi: 10.1038/s41586-023-06831-w
pii: 10.1038/s41586-023-06831-w
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Investigators
Ashley Farlow
(A)
Azure Hermes
(A)
Hardip R Patel
(HR)
Sharon Huebner
(S)
Gareth Baynam
(G)
Misty R Jenkins
(MR)
Simon Easteal
(S)
Stephen Leslie
(S)
Copyright information
© 2023. The Author(s).
References
Malaspinas, A. S. et al. A genomic history of Aboriginal Australia. Nature 538, 207–214 (2016).
pubmed: 27654914
doi: 10.1038/nature18299
Henn, B. M. et al. Distance from sub-Saharan Africa predicts mutational load in diverse human genomes. Proc. Natl Acad. Sci. USA 113, E440–E449 (2016).
pubmed: 26712023
doi: 10.1073/pnas.1510805112
Rasmussen, M. et al. An Aboriginal Australian genome reveals separate human dispersals into Asia. Science 334, 94–98 (2011).
pubmed: 21940856
pmcid: 3991479
doi: 10.1126/science.1211177
Jacobs, G. S. et al. Multiple deeply divergent denisovan ancestries in Papuans. Cell 177, 1010–1021.e32 (2019).
pubmed: 30981557
doi: 10.1016/j.cell.2019.02.035
Vernot, B. et al. Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals. Science 352, 235–239 (2016).
pubmed: 26989198
pmcid: 6743480
doi: 10.1126/science.aad9416
Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).
pubmed: 27654912
pmcid: 5161557
doi: 10.1038/nature18964
Tobler, R. et al. Aboriginal mitogenomes reveal 50,000 years of regionalism in Australia. Nature 544, 180–184 (2017).
pubmed: 28273067
doi: 10.1038/nature21416
Bouckaert, R. R., Bowern, C. & Atkinson, Q. D. The origin and expansion of Pama–Nyungan languages across Australia. Nat. Ecol. Evol. 2, 741–749 (2018).
pubmed: 29531347
doi: 10.1038/s41559-018-0489-3
McConvell, P. & Bowern, C. The prehistory and internal relationships of Australian languages. Lang. Linguist. Compass 5, 19–32 (2011).
doi: 10.1111/j.1749-818X.2010.00257.x
Barbieri, C. et al. A global analysis of matches and mismatches between human genetic and linguistic histories. Proc. Natl Acad. Sci. USA 119, e2122084119 (2022).
pubmed: 36399547
pmcid: 9704691
doi: 10.1073/pnas.2122084119
Australian National University. National Centre for Indigenous Genomics Statute (2021); www.legislation.gov.au/Details/F2021L00183 .
Peterson, N. & Taylor, J. Demographic transition in a hunter-gatherer population: the Tiwi case, 1929–1996. Aust. Aborig. Stud. 1, 11–27 (1998).
Tindale, N. Genealogical Data on the Aborigines of Australia, Vol. 2 (1938–1939) (Department of Aboriginal and Torres Strait Islander Partnerships, Community and Personal Histories Removals Database; originally held by the Museum of South Australia, 1938).
The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
doi: 10.1038/nature15393
Nagle, N. et al. Antiquity and diversity of aboriginal Australian Y-chromosomes. Am. J. Phys. Anthropol. 159, 367–381 (2016).
pubmed: 26515539
doi: 10.1002/ajpa.22886
McEvoy, B. P. et al. Whole-genome genetic diversity in a sample of Australians with deep Aboriginal ancestry. Am. J. Hum. Genet. 87, 297–305 (2010).
pubmed: 20691402
pmcid: 2917718
doi: 10.1016/j.ajhg.2010.07.008
Bergström, A. et al. Deep roots for Aboriginal Australian Y chromosomes. Curr. Biol. 26, 809–813 (2016).
pubmed: 26923783
pmcid: 4819516
doi: 10.1016/j.cub.2016.01.028
Byrska-Bishop, M. et al. High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios. Cell 185, 3426–3440.e19 (2022).
pubmed: 36055201
pmcid: 9439720
doi: 10.1016/j.cell.2022.08.004
Bergström, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).
pubmed: 32193295
pmcid: 7115999
doi: 10.1126/science.aay5012
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
pubmed: 32461654
pmcid: 7334197
doi: 10.1038/s41586-020-2308-7
Henn, B. M., Cavalli-Sforza, L. L. & Feldman, M. W. The great human expansion. Proc. Natl Acad. Sci. USA 109, 17758–17764 (2012).
pubmed: 23077256
pmcid: 3497766
doi: 10.1073/pnas.1212380109
Friedlaender, J. S. et al. The genetic structure of Pacific Islanders. PLoS Genet. 4, e19 (2008).
pubmed: 18208337
pmcid: 2211537
doi: 10.1371/journal.pgen.0040019
Xue, A. et al. Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes. Nat. Commun. 9, 2941 (2018).
pubmed: 30054458
pmcid: 6063971
doi: 10.1038/s41467-018-04951-w
Ng, P. C. & Henikoff, S. Predicting deleterious amino acid substitutions. Genome Res. 11, 863–874 (2001).
pubmed: 11337480
pmcid: 311071
doi: 10.1101/gr.176601
Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
pubmed: 20354512
pmcid: 2855889
doi: 10.1038/nmeth0410-248
Landrum, M. J. & Kattman, B. L. ClinVar at five years: delivering on the promise. Hum. Mutat. 39, 1623–1630 (2018).
pubmed: 30311387
doi: 10.1002/humu.23641
Kirin, M. et al. Genomic runs of homozygosity record population history and consanguinity. PLoS ONE 5, e13996 (2010).
pubmed: 21085596
pmcid: 2981575
doi: 10.1371/journal.pone.0013996
Hermes, A. et al. Beyond platitudes: a qualitative study of Australian Aboriginal people’s perspectives on biobanking. Intern Med. J. 51, 1426–1432 (2021).
pubmed: 33528097
doi: 10.1111/imj.15223
International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).
doi: 10.1038/nature09298
Marchini, J., Cardon, L. R., Phillips, M. S. & Donnelly, P. The effects of human population structure on large genetic association studies. Nat. Genet. 36, 512–517 (2004).
pubmed: 15052271
doi: 10.1038/ng1337
Leslie, S. et al. The fine-scale genetic structure of the British population. Nature 519, 309–314 (2015).
pubmed: 25788095
pmcid: 4632200
doi: 10.1038/nature14230
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
pubmed: 19648217
pmcid: 2752134
doi: 10.1101/gr.094052.109
Lawson, D. J., Hellenthal, G., Myers, S. & Falush, D. Inference of population structure using dense haplotype data. PLoS Genet. 8, e1002453 (2012).
pubmed: 22291602
pmcid: 3266881
doi: 10.1371/journal.pgen.1002453
Diaz-Papkovich, A., Anderson-Trocmé, L., Ben-Eghan, C. & Gravel, S. UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts. PLoS Genet. 15, e1008432 (2019).
pubmed: 31675358
pmcid: 6853336
doi: 10.1371/journal.pgen.1008432
Reich, D., Thangaraj, K., Patterson, N., Price, A. L. & Singh, L. Reconstructing Indian population history. Nature 461, 489–494 (2009).
pubmed: 19779445
pmcid: 2842210
doi: 10.1038/nature08365
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
pubmed: 22960212
pmcid: 3522152
doi: 10.1534/genetics.112.145037
Nagle, N. et al. Mitochondrial DNA diversity of present-day Aboriginal Australians and implications for human evolution in Oceania. J. Hum. Genet. 62, 343–353 (2017).
pubmed: 27904152
doi: 10.1038/jhg.2016.147
Baumdicker, F. et al. Efficient ancestry and mutation simulation with msprime 1.0. Genetics 220, iyab229 (2022).
pubmed: 34897427
doi: 10.1093/genetics/iyab229
Raynal, L. et al. ABC random forests for Bayesian parameter inference. Bioinformatics 35, 1720–1728 (2019).
pubmed: 30321307
doi: 10.1093/bioinformatics/bty867
Nielsen, S. V. et al. Bayesian inference of admixture graphs on Native American and Arctic populations. PLoS Genet. 19, e1010410 (2023).
pubmed: 36780565
pmcid: 9956672
doi: 10.1371/journal.pgen.1010410
Browning, S. R. & Browning, B. L. Accurate non-parametric estimation of recent effective population size from segments of identity by descent. Am. J. Hum. Genet. 97, 404–418 (2015).
pubmed: 26299365
pmcid: 4564943
doi: 10.1016/j.ajhg.2015.07.012
Schiffels, S. & Wang, K. MSMC and MSMC2: the multiple sequentially Markovian coalescent. Methods Mol. Biol. 2090, 147–166 (2020).
pubmed: 31975167
doi: 10.1007/978-1-0716-0199-0_7
Yunusbaev, U. et al. Reconstructing recent population history while mapping rare variants using haplotypes. Sci. Rep. 9, 5849 (2019).
pubmed: 30971755
pmcid: 6458133
doi: 10.1038/s41598-019-42385-6
Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).
pubmed: 24952747
pmcid: 4116295
doi: 10.1038/ng.3015
Nagle, N. et al. Aboriginal Australian mitochondrial genome variation – an increased understanding of population antiquity and diversity. Sci. Rep. 7, 43041 (2017).
pubmed: 28287095
pmcid: 5347126
doi: 10.1038/srep43041
Hudjashov, G. et al. Revealing the prehistoric settlement of Australia by Y chromosome and mtDNA analysis. Proc. Natl Acad. Sci. USA 104, 8726–8730 (2007).
pubmed: 17496137
pmcid: 1885570
doi: 10.1073/pnas.0702928104
Pedro, N. et al. Papuan mitochondrial genomes and the settlement of Sahul. J. Hum. Genet. 65, 875–887 (2020).
pubmed: 32483274
pmcid: 7449881
doi: 10.1038/s10038-020-0781-3
Purnomo, G. A. et al. Mitogenomes reveal two major influxes of Papuan ancestry across Wallacea following the last glacial maximum and Austronesian contact. Genes (Basel) 12, 965 (2021).
pubmed: 34202821
doi: 10.3390/genes12070965
Nielsen, R. et al. Tracing the peopling of the world through genomics. Nature 541, 302–310 (2017).
pubmed: 28102248
pmcid: 5772775
doi: 10.1038/nature21347
Easteal, S. et al. Equitable expanded carrier screening needs indigenous clinical and population genomic data. Am. J. Hum. Genet. 107, 175–182 (2020).
pubmed: 32763188
pmcid: 7413856
doi: 10.1016/j.ajhg.2020.06.005
Baynam, G. et al. A germline MTOR mutation in Aboriginal Australian siblings with intellectual disability, dysmorphism, macrocephaly, and small thoraces. Am. J. Med. Genet. A 167, 1659–1667 (2015).
pubmed: 25851998
doi: 10.1002/ajmg.a.37070
Chen, S. et al. A genome-wide mutational constraint map quantified from variation in 76,156 human genomes. Preprint at bioRxiv https://doi.org/10.1101/2022.03.20.485034 (2022).
Deelen, P. et al. Improved imputation quality of low-frequency and rare variants in European samples using the ‘Genome of The Netherlands’. Eur. J. Hum. Genet. 22, 1321–1326 (2014).
pubmed: 24896149
pmcid: 4200431
doi: 10.1038/ejhg.2014.19
Gudbjartsson, D. F. et al. Large-scale whole-genome sequencing of the Icelandic population. Nat. Genet. 47, 435–444 (2015).
pubmed: 25807286
doi: 10.1038/ng.3247
Mitt, M. et al. Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel. Eur. J. Hum. Genet. 25, 869–876 (2017).
pubmed: 28401899
pmcid: 5520064
doi: 10.1038/ejhg.2017.51
Huebner, S., Hermes, A. & Easteal, S. in Indigenous Research Ethics: Claiming Research Sovereignty Beyond Deficit and the Colonial Legacy, Vol. 6 (eds George, L., Tauri, J. & MacDonald, L. T. A. o T.) Ch. 8 (Emerald, 2020).
Bergström, A. et al. A Neolithic expansion, but strong genetic structure, in the independent history of New Guinea. Science 357, 1160–1163 (2017).
pubmed: 28912245
pmcid: 5802383
doi: 10.1126/science.aan3842
Delaneau, O., Marchini, J. & Zagury, J. F. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2012).
doi: 10.1038/nmeth.1785
Delaneau, O., Howie, B., Cox, A. J., Zagury, J. F. & Marchini, J. Haplotype estimation using sequencing reads. Am. J. Hum. Genet. 93, 687–696 (2013).
pubmed: 24094745
pmcid: 3791270
doi: 10.1016/j.ajhg.2013.09.002
Choi, Y., Chan, A. P., Kirkness, E., Telenti, A. & Schork, N. J. Comparison of phasing strategies for whole human genomes. PLoS Genet. 14, e1007308 (2018).
pubmed: 29621242
pmcid: 5903673
doi: 10.1371/journal.pgen.1007308
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, 2074–2093 (2006).
doi: 10.1371/journal.pgen.0020190
Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93, 278–288 (2013).
pubmed: 23910464
pmcid: 3738819
doi: 10.1016/j.ajhg.2013.06.020
Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
pubmed: 20926424
pmcid: 3025716
doi: 10.1093/bioinformatics/btq559
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
pubmed: 17701901
pmcid: 1950838
doi: 10.1086/519795
McLaren, W. et al. The Ensembl variant effect predictor. Genome Biol. 17, 122 (2016).
pubmed: 27268795
pmcid: 4893825
doi: 10.1186/s13059-016-0974-4
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
pubmed: 33590861
pmcid: 7931819
doi: 10.1093/gigascience/giab008
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2009); https://www.R-project.org/ .
Browning, B. L. & Browning, S. R. Improving the accuracy and efficiency of identity-by-descent detection in population data. Genetics 194, 459–471 (2013).
pubmed: 23535385
pmcid: 3664855
doi: 10.1534/genetics.113.150029
Browning, S. R. et al. Local ancestry inference in a large US-based Hispanic/Latino study: Hispanic Community Health Study/Study of Latinos (HCHS/SOL). G3 6, 1525–1534 (2016).
pubmed: 27172203
pmcid: 4889649
doi: 10.1534/g3.116.028779
McInnes, L., Healy, J., Saul, N. & Großberger, L. UMAP: Uniform Manifold Approximation and Projection. J. Open Source Softw. 3, 861 (2018).
doi: 10.21105/joss.00861
Bhatia, G., Patterson, N., Sankararaman, S. & Price, A. L. Estimating and interpreting FST: the impact of rare variants. Genome Res. 23, 1514–1521 (2013).
pubmed: 23861382
pmcid: 3759727
doi: 10.1101/gr.154831.113
Peter, B. M. Admixture, population structure, and f-statistics. Genetics 202, 1485–1501 (2016).
pubmed: 26857625
pmcid: 4905545
doi: 10.1534/genetics.115.183913
Kelleher, J., Etheridge, A. M. & McVean, G. Efficient coalescent simulation and genealogical analysis for large sample sizes. PLoS Comput. Biol. 12, e1004842 (2016).
pubmed: 27145223
pmcid: 4856371
doi: 10.1371/journal.pcbi.1004842
Ralph, P., Thornton, K. & Kelleher, J. Efficiently summarizing relationships in large samples: a general duality between statistics of genealogies and genomes. Genetics 215, 779–797 (2020).
pubmed: 32357960
pmcid: 7337078
doi: 10.1534/genetics.120.303253
Pudlo, P. et al. Reliable ABC model choice via random forests. Bioinformatics 32, 859–866 (2016).
pubmed: 26589278
doi: 10.1093/bioinformatics/btv684
Browning, B. L. & Browning, S. R. Detecting identity by descent and estimating genotype error rates in sequence data. Am. J. Hum. Genet. 93, 840–851 (2013).
pubmed: 24207118
pmcid: 3824133
doi: 10.1016/j.ajhg.2013.09.014
Browning, S. R. et al. Ancestry-specific recent effective population size in the Americas. PLoS Genet. 14, e1007385 (2018).
pubmed: 29795556
pmcid: 5967706
doi: 10.1371/journal.pgen.1007385
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
pubmed: 20644199
pmcid: 2928508
doi: 10.1101/gr.107524.110
Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007).
pubmed: 17996036
pmcid: 2247476
doi: 10.1186/1471-2148-7-214
Kahle, D. & Wickham, H. ggmap: spatial visualization with ggplot2. R J. 5, 144–161 (2013).
doi: 10.32614/RJ-2013-014
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, 2016).