Dimensionality reduction reveals fine-scale structure in the Japanese population with consequences for polygenic risk prediction.


Journal

Nature Communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
26 03 2020
History:
received: 12 06 2019
accepted: 19 02 2020
entrez: 29 3 2020
pubmed: 29 3 2020
medline: 16 7 2020
Status: epublish

Abstract

The diversity in our genome is crucial to understanding the demographic history of worldwide populations. However, we have yet to know whether subtle genetic differences within a population can be disentangled, or whether they have an impact on complex traits. Here we apply dimensionality reduction methods (PCA, t-SNE, PCA-t-SNE, UMAP, and PCA-UMAP) to biobank-derived genomic data of a Japanese population (n = 169,719). Dimensionality reduction reveals fine-scale population structure, conspicuously differentiating adjacent insular subpopulations. We further elucidate the demographic landscape of these Japanese subpopulations using population genetics analyses. Finally, we perform phenome-wide polygenic risk score (PRS) analyses on 67 complex traits. Differences in PRS between the deconvoluted subpopulations are not always concordant with those in the observed phenotypes, suggesting that the PRS differences might reflect biases from the uncorrected structure, in a trait-dependent manner. This study suggests that such an uncorrected structure can be a potential pitfall in the clinical application of PRS.
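
The abstract's closing point, that PRS can differ between subpopulations without a matching phenotype difference, can be illustrated with a toy simulation. This is a minimal sketch and not the authors' pipeline: the variant count, weights, allele frequencies, the 0.03 frequency shift, and the helper names `polygenic_score` and `simulate_genotypes` are all assumptions made for illustration. It models a score as a weighted sum of allele dosages and shows that a small, systematic allele-frequency shift between two groups moves the mean score even though no phenotype is modelled at all.

```python
import random
import statistics


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2) across variants."""
    return sum(w * d for w, d in zip(weights, dosages))


def simulate_genotypes(allele_freqs, n_individuals, rng):
    """Draw each dosage as two independent allele draws (Hardy-Weinberg assumption)."""
    return [
        [(rng.random() < p) + (rng.random() < p) for p in allele_freqs]
        for _ in range(n_individuals)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_variants = 200

# Hypothetical per-variant weights, standing in for GWAS effect-size estimates.
weights = [rng.gauss(0.0, 0.1) for _ in range(n_variants)]

# Subpopulation A: arbitrary allele frequencies.
freqs_a = [rng.uniform(0.1, 0.9) for _ in range(n_variants)]

# Subpopulation B: frequencies nudged in the direction of each weight's sign.
# This mimics weights that are confounded with population structure (assumption).
shift = 0.03
freqs_b = [
    min(0.95, max(0.05, p + shift * (1 if w > 0 else -1)))
    for p, w in zip(freqs_a, weights)
]

scores_a = [polygenic_score(g, weights) for g in simulate_genotypes(freqs_a, 500, rng)]
scores_b = [polygenic_score(g, weights) for g in simulate_genotypes(freqs_b, 500, rng)]

mean_a = statistics.mean(scores_a)
mean_b = statistics.mean(scores_b)
print(f"mean PRS, subpopulation A: {mean_a:.3f}")
print(f"mean PRS, subpopulation B: {mean_b:.3f}")
```

In this toy setup the gap between the two means comes entirely from the frequency shift, which is why a between-group PRS difference alone cannot be read as a difference in underlying risk.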

Identifiers

pubmed: 32218440
doi: 10.1038/s41467-020-15194-z
pii: 10.1038/s41467-020-15194-z
pmc: PMC7099015

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1569

Authors

Saori Sakaue (S)

Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan.
Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan.
Department of Allergy and Rheumatology, Graduate School of Medicine, the University of Tokyo, Tokyo, 113-8655, Japan.

Jun Hirata (J)

Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan.
Pharmaceutical Discovery Research Laboratories, TEIJIN PHARMA LIMITED, Hino, 191-8512, Japan.

Masahiro Kanai (M)

Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan.
Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan.
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA.

Ken Suzuki (K)

Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan.

Masato Akiyama (M)

Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan.
Department of Ophthalmology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, 812-8582, Japan.

Chun Lai Too (C)

Allergy and Immunology Research Center, Institute for Medical Research, Ministry of Health Malaysia, 40170, Setia Alam, Malaysia.
Division of Rheumatology, Department of Medicine, Karolinska Institutet and Karolinska University Hospital, 17177, Stockholm, Sweden.

Thurayya Arayssi (T)

Department of Internal Medicine, Weill Cornell Medicine-Qatar, Education City, Doha, 24144, Qatar.

Mohammed Hammoudeh (M)

Department of Internal Medicine, Hamad Medical Corporation, Doha, 3050, Qatar.

Samar Al Emadi (S)

Department of Internal Medicine, Hamad Medical Corporation, Doha, 3050, Qatar.

Basel K Masri (BK)

Department of Internal Medicine, Jordan Hospital, Amman, 520248, Jordan.

Hussein Halabi (H)

Rheumatology Division, Department of Internal Medicine, King Faisal Specialist Hospital and Research Center, Jeddah, H45X+P6, Saudi Arabia.

Humeira Badsha (H)

Dr. Humeira Badsha Medical Center, Emirates Hospital, Dubai, 391203, United Arab Emirates.

Imad W Uthman (IW)

Department of Rheumatology, American University of Beirut, Beirut, 11-0236, Lebanon.

Richa Saxena (R)

Center for Genomic Medicine, Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02115, USA.
Program in Medical and Population Genetics Broad Institute, Cambridge, MA, 02142, USA.

Leonid Padyukov (L)

Division of Rheumatology, Department of Medicine, Karolinska Institutet and Karolinska University Hospital, 17177, Stockholm, Sweden.

Makoto Hirata (M)

Laboratory of Genome Technology, Institute of Medical Science, the University of Tokyo, Tokyo, 108-8639, Japan.

Koichi Matsuda (K)

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, the University of Tokyo, Tokyo, 108-8639, Japan.

Yoshinori Murakami (Y)

Division of Molecular Pathology, Institute of Medical Science, the University of Tokyo, Tokyo, 108-8639, Japan.

Yoichiro Kamatani (Y)

Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan.
Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, the University of Tokyo, Tokyo, 108-8639, Japan.

Yukinori Okada (Y)

Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan. yokada@sg.med.osaka-u.ac.jp.
Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, 565-0871, Japan.
Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, 565-0871, Japan.
