Complex trait susceptibilities and population diversity in a sample of 4,145 Russians.
Humans
Russia / epidemiology
Gene Frequency
Genome-Wide Association Study
Male
Polymorphism, Single Nucleotide
Female
Genetic Predisposition to Disease
Genetics, Population
Phenotype
White People / genetics
Finland
Asian People / genetics
Genetic Variation
Cohort Studies
Multifactorial Inheritance / genetics
Ethnicity / genetics
Eastern European People
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
23 Jul 2024
History:
received: 27 Mar 2023
accepted: 02 Jul 2024
medline: 24 Jul 2024
pubmed: 24 Jul 2024
entrez: 23 Jul 2024
Status:
epublish
Abstract
The population of Russia consists of more than 150 local ethnicities. The ethnic diversity and geographic origins, which extend from eastern Europe to Asia, make the population uniquely positioned to investigate the shared properties of inherited disease risks between European and Asian ancestries. We present the analysis of genetic and phenotypic data from a cohort of 4,145 individuals collected in three metro areas in western Russia. We show the presence of multiple admixed genetic ancestry clusters spanning from primarily European to Asian and high identity-by-descent sharing with the Finnish population. As a result, there was notable enrichment of Finnish-specific variants in Russia. We illustrate the utility of Russian-descent cohorts for discovery of novel population-specific genetic associations, as well as replication of previously identified associations that were thought to be population-specific in other cohorts. Finally, we provide access to a database of allele frequencies and GWAS results for 464 phenotypes.
Identifiers
pubmed: 39043636
doi: 10.1038/s41467-024-50304-1
pii: 10.1038/s41467-024-50304-1
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
6212
Copyright information
© 2024. The Author(s).