Partitioning and aggregating cross-tissue and tissue-specific genetic effects to identify gene-trait associations.
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
09 Jul 2024
History:
received: 27 Nov 2023
accepted: 25 Jun 2024
medline: 10 Jul 2024
pubmed: 10 Jul 2024
entrez: 09 Jul 2024
Status:
epublish
Abstract
TWAS have shown great promise in extending GWAS loci to a functional understanding of disease mechanisms. In an effort to fully unleash the TWAS and GWAS information, we propose MTWAS, a statistical framework that partitions and aggregates cross-tissue and tissue-specific genetic effects in identifying gene-trait associations. We introduce a non-parametric imputation strategy to augment the inaccessible tissues, accommodating complex interactions and non-linear expression data structures across various tissues. We further classify eQTLs into cross-tissue eQTLs and tissue-specific eQTLs via a stepwise procedure based on the extended Bayesian information criterion, which is consistent under high-dimensional settings. We show that MTWAS significantly improves the prediction accuracy across all 47 tissues of the GTEx dataset, compared with other single-tissue and multi-tissue methods, such as PrediXcan, TIGAR, and UTMOST. Applying MTWAS to the DICE and OneK1K datasets with bulk and single-cell RNA sequencing data on immune cell types showcases consistent improvements in prediction accuracy. MTWAS also identifies more predictable genes, and the improvement can be replicated with independent studies. We apply MTWAS to 84 UK Biobank GWAS studies, which provides insights into disease etiology.
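The abstract mentions a stepwise procedure based on the extended Bayesian information criterion (EBIC) without giving details. As a rough illustration only, the following is a minimal sketch of generic forward stepwise selection scored by the Chen and Chen (2008) EBIC for a Gaussian linear model. It is not the MTWAS implementation and does not reproduce how MTWAS separates cross-tissue from tissue-specific eQTLs; the function names `ebic`, `rss_of`, and `forward_ebic`, the default `gamma=1.0`, and the stopping rule are all assumptions made for this example.

```python
import math
import numpy as np

def ebic(rss, n, k, p, gamma=1.0):
    """EBIC of a Gaussian linear model using k of p candidate predictors.

    Uses n*log(RSS/n) + k*log(n) + 2*gamma*log(C(p, k)), i.e. BIC plus a
    penalty on the number of models of size k (Chen & Chen, 2008).
    """
    log_binom = math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)
    rss = max(rss, 1e-300)  # guard against log(0) on a perfect fit
    return n * math.log(rss / n) + k * math.log(n) + 2.0 * gamma * log_binom

def rss_of(X, y, cols):
    """Residual sum of squares from regressing y on an intercept and X[:, cols]."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def forward_ebic(X, y, gamma=1.0, max_vars=None):
    """Greedy forward selection: add the predictor that lowers EBIC the most,
    and stop as soon as no addition lowers it (an assumed stopping rule)."""
    n, p = X.shape
    max_vars = min(p, n - 2) if max_vars is None else max_vars
    selected = []
    best = ebic(rss_of(X, y, selected), n, 0, p, gamma)
    while len(selected) < max_vars:
        candidates = [j for j in range(p) if j not in selected]
        scores = [
            ebic(rss_of(X, y, selected + [j]), n, len(selected) + 1, p, gamma)
            for j in candidates
        ]
        i = int(np.argmin(scores))
        if scores[i] >= best:
            break
        selected.append(candidates[i])
        best = scores[i]
    return selected, best
```

In an eQTL setting, `X` would hold genotype dosages for candidate cis-SNPs and `y` the expression of one gene in one tissue; a larger `gamma` penalizes model size more strongly when the number of candidate SNPs is large relative to the sample size.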
Identifiers
pubmed: 38982044
doi: 10.1038/s41467-024-49924-4
pii: 10.1038/s41467-024-49924-4
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
5769
Grants
Agency: U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
ID: R01 GM152814-01
Copyright information
© 2024. The Author(s).
References
Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).
pubmed: 28686856
pmcid: 5501872
doi: 10.1016/j.ajhg.2017.06.005
Gamazon, E. R. et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47, 1091–1098 (2015).
pubmed: 26258848
pmcid: 4552594
doi: 10.1038/ng.3367
Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1–20 (2018).
doi: 10.1038/s41467-018-03621-1
Cloney, R. Integrating gene variation and expression to understand complex traits. Nat. Rev. Genet. 17, 194–194 (2016).
pubmed: 26900024
doi: 10.1038/nrg.2016.18
Smemo, S. et al. Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature 507, 371–375 (2014).
pubmed: 24646999
pmcid: 4113484
doi: 10.1038/nature13138
Lonsdale, J. et al. The genotype-tissue expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
doi: 10.1038/ng.2653
Yao, D. W., O’Connor, L. J., Price, A. L. & Gusev, A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. 52, 626–633 (2020).
pubmed: 32424349
pmcid: 7276299
doi: 10.1038/s41588-020-0625-2
Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48, 245–252 (2016).
pubmed: 26854917
pmcid: 4767558
doi: 10.1038/ng.3506
Nagpal, S. et al. TIGAR: an improved Bayesian tool for transcriptomic data imputation enhances gene mapping of complex traits. Am. J. Hum. Genet. 105, 258–266 (2019).
pubmed: 31230719
pmcid: 6698804
doi: 10.1016/j.ajhg.2019.05.018
Parrish, R. L., Gibson, G. C., Epstein, M. P. & Yang, J. TIGAR-V2: efficient TWAS tool with nonparametric Bayesian eQTL weights of 49 tissue types from GTEx V8. Hum. Genet. Genomics Adv. 3, 100068 (2022).
Barbeira, A. N. et al. Integrating predicted transcriptome from multiple tissues improves association detection. PLoS Genet. 15, e1007889 (2019).
pubmed: 30668570
pmcid: 6358100
doi: 10.1371/journal.pgen.1007889
Feng, H. et al. Leveraging expression from multiple tissues using sparse canonical correlation analysis and aggregate tests improves the power of transcriptome-wide association studies. PLoS Genet. 17, e1008973 (2021).
pubmed: 33831007
pmcid: 8057593
doi: 10.1371/journal.pgen.1008973
Hu, Y. et al. A statistical framework for cross-tissue transcriptome-wide association analysis. Nat. Genet. 51, 568–576 (2019).
pubmed: 30804563
pmcid: 6788740
doi: 10.1038/s41588-019-0345-7
Urbut, S. M., Wang, G., Carbonetto, P. & Stephens, M. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. Nat. Genet. 51, 187–195 (2019).
pubmed: 30478440
doi: 10.1038/s41588-018-0268-8
Brown, C. D., Mangravite, L. M. & Engelhardt, B. E. Integrative modeling of eQTLs and cis-regulatory elements suggests mechanisms underlying cell type specificity of eQTLs. PLoS Genet. 9, e1003649 (2013).
pubmed: 23935528
pmcid: 3731231
doi: 10.1371/journal.pgen.1003649
Chen, L. et al. TIVAN: tissue-specific cis-eQTL single nucleotide variant annotation and prediction. Bioinformatics 35, 1573–1575 (2019).
pubmed: 30304335
doi: 10.1093/bioinformatics/bty872
Chen, J. & Chen, Z. Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95, 759–771 (2008).
doi: 10.1093/biomet/asn034
Li, Y. & Liu, J. S. Robust variable and interaction selection for logistic regression and general index models. J. Am. Stat. Assoc. 114, 271–286 (2019).
pubmed: 32863479
doi: 10.1080/01621459.2017.1401541
Stekhoven, D. J. & Bühlmann, P. MissForest-non-parametric missing value imputation for mixed-type data. Bioinformatics 28, 112–118 (2012).
pubmed: 22039212
doi: 10.1093/bioinformatics/btr597
Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
pubmed: 24037378
pmcid: 3918453
doi: 10.1038/nature12531
Schmiedel, B. J. et al. Impact of genetic polymorphisms on human immune cell gene expression. Cell 175, 1701–1715 (2018).
pubmed: 30449622
pmcid: 6289654
doi: 10.1016/j.cell.2018.10.022
Yazar, S. et al. Single-cell eQTL mapping identifies cell type-specific genetic control of autoimmune disease. Science 376, eabf3041 (2022).
pubmed: 35389779
doi: 10.1126/science.abf3041
Arvanitis, M., Tayeb, K., Strober, B. J. & Battle, A. Redefining tissue specificity of genetic regulation of gene expression in the presence of allelic heterogeneity. Am. J. Hum. Genet. 109, 223–239 (2022).
pubmed: 35085493
pmcid: 8874223
doi: 10.1016/j.ajhg.2022.01.002
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
pubmed: 32461654
pmcid: 7334197
doi: 10.1038/s41586-020-2308-7
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
pubmed: 27535533
pmcid: 5018207
doi: 10.1038/nature19057
Volland, C. et al. Control of p21
pubmed: 31286143
doi: 10.1093/cvr/cvz177
Dong, W.-q. et al. Prohibitin overexpression improves myocardial function in diabetic cardiomyopathy. Oncotarget 7, 66 (2016).
pubmed: 26623724
doi: 10.18632/oncotarget.6384
Yu, Y.-d, Xue, Y.-t & Li, Y. Identification and verification of feature biomarkers associated in heart failure by bioinformatics analysis. Sci. Rep. 13, 3488 (2023).
pubmed: 36859608
pmcid: 9977868
doi: 10.1038/s41598-023-30666-0
Brecker, M., Khakhina, S., Schubert, T., Thompson, Z. & Rubenstein, R. The probable, possible, and novel functions of ERp29. Front. Physiol. 11, 574339 (2020).
Ugidos, N. et al. Interactome of the autoimmune risk protein ANKRD55. Front. Immunol. 10, 2067 (2019).
pubmed: 31620119
pmcid: 6759997
doi: 10.3389/fimmu.2019.02067
Tang, P. et al. NADPH oxidase NOX4 is a glycolytic regulator through mROS-HIF1α axis in thyroid carcinomas. Sci. Rep. 8, 15897 (2018).
pubmed: 30367082
pmcid: 6203707
doi: 10.1038/s41598-018-34154-8
Azouzi, N. et al. NADPH oxidase NOX4 is a critical mediator of BRAFV600E-induced downregulation of the sodium/iodide symporter in papillary thyroid carcinomas. Antioxid. Redox Signal. 26, 864–877 (2017).
pubmed: 27401113
pmcid: 5444494
doi: 10.1089/ars.2015.6616
Lazzara, D. R., Zarkhin, S. G., Rubenstein, S. N. & Glick, B. P. Melanoma and thyroid carcinoma: our current understanding. J. Clin. Aesthetic Dermatol. 12, 39 (2019).
Ulisse, S. et al. Is melanoma progression affected by thyroid diseases? Int. J. Mol. Sci. 23, 10036 (2022).
pubmed: 36077430
pmcid: 9456309
doi: 10.3390/ijms231710036
Ozgun, A. et al. Malignant melanoma and papillary thyroid carcinoma that were diagnosed concurrently and treated simultaneously: a case report. Oncol. Lett. 9, 468–470 (2015).
pubmed: 25436010
doi: 10.3892/ol.2014.2642
Beretti, F. et al. The interplay between HGF/c-met axis and NOX4 in BRAF mutated melanoma. Int. J. Mol. Sci. 22, 761 (2021).
pubmed: 33451139
pmcid: 7828605
doi: 10.3390/ijms22020761
Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R. Stat. Soc. Ser. B: Stat. Methodol. 82, 1273–1300 (2020).
doi: 10.1111/rssb.12388
Wen, X., Lee, Y., Luca, F. & Pique-Regi, R. Efficient integrative multi-SNP association analysis via deterministic approximation of posteriors. Am. J. Hum. Genet. 98, 1114–1129 (2016).
pubmed: 27236919
pmcid: 4908152
doi: 10.1016/j.ajhg.2016.03.029
Barbeira, A. N. et al. Fine-mapping and QTL tissue-sharing information improves the reliability of causal gene identification. Genet. Epidemiol. 44, 854–867 (2020).
pubmed: 32964524
pmcid: 7693040
doi: 10.1002/gepi.22346
Song, S. et al. Openness weighted association studies: leveraging personal genome information to prioritize non-coding variants. Bioinformatics 37, 4737–4743 (2021).
pubmed: 34260700
pmcid: 8665759
doi: 10.1093/bioinformatics/btab514
Dai, Q. et al. OTTERS: a powerful TWAS framework leveraging summary-level reference data. Nat. Commun. 14, 1271 (2023).
pubmed: 36882394
pmcid: 9992663
doi: 10.1038/s41467-023-36862-w
Stegle, O., Parts, L., Piipari, M., Winn, J. & Durbin, R. Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nat. Protoc. 7, 500–507 (2012).
pubmed: 22343431
pmcid: 3398141
doi: 10.1038/nprot.2011.457
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (Methodol.) 57, 289–300 (1995).
doi: 10.1111/j.2517-6161.1995.tb02031.x
Kanehisa, M. et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 36, D480–D484 (2007).
pubmed: 18077471
pmcid: 2238879
doi: 10.1093/nar/gkm882
Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics: A J. Integr. Biol. 16, 284–287 (2012).
doi: 10.1089/omi.2011.0118
Song, S., Wang, L., Hou, L. & Liu, J. S. MTWAS: Partitioning and aggregating cross-tissue and tissue-specific genetic effects to identify gene-trait associations. Zenodo https://doi.org/10.5281/zenodo.11647460 (2024).
Pan-UKB team. https://pan.ukbb.broadinstitute.org (2020).
Berisa, T. & Pickrell, J. K. Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics 32, 283 (2016).
pubmed: 26395773
doi: 10.1093/bioinformatics/btv546
Lalonde, S. et al. Integrative analysis of vascular endothelial cell genomic features identifies AIDA as a coronary artery disease candidate gene. Genome Biol. 20, 1–13 (2019).
doi: 10.1186/s13059-019-1749-5
Castillo-Avila, R. G. et al. Association between genetic variants of CELSR2-PSRC1-SORT1 and cardiovascular diseases: a systematic review and meta-analysis. J. Cardiovasc. Dev. Dis. 10, 91 (2023).
Joshua, J., Caswell, J., O’Sullivan, M. L., Wood, G. & Fonfara, S. Feline myocardial transcriptome in health and in hypertrophic cardiomyopathy-a translational animal model for human disease. PLoS ONE 18, e0283244 (2023).
pubmed: 36928240
pmcid: 10019628
doi: 10.1371/journal.pone.0283244
Li, X. et al. Meta-analysis identifies robust association between SNP rs17465637 in MIA3 on chromosome 1q41 and coronary artery disease. Atherosclerosis 231, 136–140 (2013).
pubmed: 24125424
doi: 10.1016/j.atherosclerosis.2013.08.031
Aggarwal, S., Narang, R., Saluja, D. & Srivastava, K. Diagnostic potential of SORT1 gene in coronary artery disease. Gene 909, 148308 (2024).
pubmed: 38395240
doi: 10.1016/j.gene.2024.148308
Nordestgaard, B. G. & Langsted, A. Lipoprotein (a) as a cause of cardiovascular disease: insights from epidemiology, genetics, and biology. J. Lipid Res. 57, 1953–1975 (2016).
pubmed: 27677946
pmcid: 5087876
doi: 10.1194/jlr.R071233
Enas, E. A., Varkey, B., Dharmarajan, T., Pare, G. & Bahl, V. K. Lipoprotein (a): An independent, genetic, and causal factor for cardiovascular disease and acute myocardial infarction. Indian Heart J. 71, 99–112 (2019).
pubmed: 31280836
pmcid: 6620428
doi: 10.1016/j.ihj.2019.03.004
Paquette, M., Dufour, R. & Baass, A. PHACTR1 genotype predicts coronary artery disease in patients with familial hypercholesterolemia. J. Clin. Lipidol. 12, 966–971 (2018).
pubmed: 29784573
doi: 10.1016/j.jacl.2018.04.012
Yuan, W. et al. New findings in the roles of Cyclin-dependent Kinase inhibitors 2B Antisense RNA 1 (CDKN2B-AS1) rs1333049 G/C and rs4977574 A/G variants on the risk to coronary heart disease. Bioengineered 11, 1084–1098 (2020).
pubmed: 33054494
pmcid: 8291866
doi: 10.1080/21655979.2020.1827892
Ozaki, K. et al. SNPs in BRAP associated with risk of myocardial infarction in Asian populations. Nat. Genet. 41, 329–333 (2009).
pubmed: 19198608
doi: 10.1038/ng.326
Hinohara, K. et al. Validation of eight genetic risk factors in East Asian populations replicated the association of BRAP with coronary artery disease. J. Hum. Genet. 54, 642–646 (2009).
pubmed: 19713974
doi: 10.1038/jhg.2009.87
Karamanavi, E. et al. The FES gene at the 15q26 coronary-artery-disease locus inhibits atherosclerosis. Circ. Res. 131, 1004–1017 (2022).
pubmed: 36321446
pmcid: 9770135
doi: 10.1161/CIRCRESAHA.122.321146
Ken-Dror, G., Talmud, P. J., Humphries, S. E. & Drenos, F. APOE/C1/C4/C2 gene cluster genotypes, haplotypes and lipid levels in prospective coronary heart disease risk among UK healthy men. Mol. Med. 16, 389–399 (2010).
pubmed: 20498921
pmcid: 2935949
doi: 10.2119/molmed.2010.00044