Indigenous Australian genomes show deep structure and rich novel variation.


Journal

Nature
ISSN: 1476-4687
Abbreviated title: Nature
Country: England
NLM ID: 0410462

Publication information

Publication date:
13 Dec 2023
History:
received: 29 Nov 2022
accepted: 03 Nov 2023
medline: 14 Dec 2023
pubmed: 14 Dec 2023
entrez: 13 Dec 2023
Status: aheadofprint

Abstract

The Indigenous peoples of Australia have a rich linguistic and cultural history. How this relates to genetic diversity remains largely unknown because of their limited engagement with genomic studies. Here we analyse the genomes of 159 individuals from four remote Indigenous communities, including people who speak a language (Tiwi) not from the most widespread family (Pama-Nyungan). This large collection of Indigenous Australian genomes was made possible by careful community engagement and consultation. We observe exceptionally strong population structure across Australia, driven by divergence times between communities of 26,000-35,000 years ago and long-term low but stable effective population sizes. This demographic history, including early divergence from Papua New Guinean (47,000 years ago) and Eurasian groups

Identifiers

pubmed: 38093005
doi: 10.1038/s41586-023-06831-w
pii: 10.1038/s41586-023-06831-w

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Investigators

Ashley Farlow (A)
Azure Hermes (A)
Hardip R Patel (HR)
Sharon Huebner (S)
Gareth Baynam (G)
Misty R Jenkins (MR)
Simon Easteal (S)
Stephen Leslie (S)

Copyright information

© 2023. The Author(s).

Authors

Matthew Silcocks (M)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.
University of Melbourne, School of Biosciences, Parkville, Victoria, Australia.

Ashley Farlow (A)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.
University of Melbourne, School of Mathematics and Statistics, Parkville, Victoria, Australia.

Azure Hermes (A)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.

Georgia Tsambos (G)

University of Melbourne, School of Mathematics and Statistics, Parkville, Victoria, Australia.

Hardip R Patel (HR)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.

Sharon Huebner (S)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.

Gareth Baynam (G)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.
Faculty of Health and Medical Sciences, Division of Paediatrics and Telethon Kids Institute, University of Western Australia, Perth, Western Australia, Australia.
Western Australian Register of Developmental Anomalies, King Edward Memorial Hospital and Rare Care Centre, Perth Children's Hospital, Perth, Western Australia, Australia.

Misty R Jenkins (MR)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.
Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia.
University of Melbourne, Department of Medical Biology, Parkville, Victoria, Australia.

Damjan Vukcevic (D)

University of Melbourne, School of Mathematics and Statistics, Parkville, Victoria, Australia.

Simon Easteal (S)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.

Stephen Leslie (S)

National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia. stephen.leslie@unimelb.edu.au.
University of Melbourne, School of Biosciences, Parkville, Victoria, Australia.
University of Melbourne, School of Mathematics and Statistics, Parkville, Victoria, Australia.

MeSH classifications