Partitioning and aggregating cross-tissue and tissue-specific genetic effects to identify gene-trait associations.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
09 Jul 2024
History:
received: 27 Nov 2023
accepted: 25 Jun 2024
medline: 10 Jul 2024
pubmed: 10 Jul 2024
entrez: 09 Jul 2024
Status: epublish

Abstract

Transcriptome-wide association studies (TWAS) have shown great promise in extending genome-wide association study (GWAS) loci to a functional understanding of disease mechanisms. To make fuller use of TWAS and GWAS information, we propose MTWAS, a statistical framework that partitions and aggregates cross-tissue and tissue-specific genetic effects to identify gene-trait associations. We introduce a non-parametric strategy for imputing expression in inaccessible tissues, which accommodates complex interactions and non-linear expression data structures across tissues. We further classify expression quantitative trait loci (eQTLs) into cross-tissue eQTLs and tissue-specific eQTLs via a stepwise procedure based on the extended Bayesian information criterion, which is consistent under high-dimensional settings. We show that MTWAS significantly improves prediction accuracy across all 47 tissues of the GTEx dataset compared with other single-tissue and multi-tissue methods, such as PrediXcan, TIGAR, and UTMOST. Applying MTWAS to the DICE and OneK1K datasets, which contain bulk and single-cell RNA sequencing data on immune cell types, yields consistent improvements in prediction accuracy. MTWAS also identifies more predictable genes, and the improvement replicates in independent studies. Applying MTWAS to 84 UK Biobank GWAS provides insights into disease etiology.
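
The stepwise selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the MTWAS implementation: it is a generic forward selection for a single linear model, scored with the extended Bayesian information criterion of Chen & Chen (2008), EBIC = n log(RSS/n) + k log(n) + 2γ log C(p, k). The function names (`ebic`, `forward_ebic`), the simulated genotype matrix, the effect sizes, and the choice γ = 1 are all assumptions made for illustration.

```python
import numpy as np
from math import lgamma, log


def log_binom(p, k):
    """Log of the binomial coefficient C(p, k), computed via log-gamma."""
    return lgamma(p + 1) - lgamma(k + 1) - lgamma(p - k + 1)


def ebic(y, X, support, gamma=1.0):
    """EBIC of a Gaussian linear model using the columns of X in `support`.

    EBIC = n*log(RSS/n) + k*log(n) + 2*gamma*log C(p, k), with k = len(support).
    An intercept is always included and is not counted in k.
    """
    n, p = X.shape
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in support])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = len(support)
    return n * log(rss / n) + k * log(n) + 2.0 * gamma * log_binom(p, k)


def forward_ebic(y, X, gamma=1.0, max_steps=10):
    """Greedy forward selection: add the variable that lowers EBIC the most,
    and stop as soon as no single addition lowers it."""
    p = X.shape[1]
    selected = []
    best = ebic(y, X, selected, gamma)
    for _ in range(max_steps):
        candidates = [j for j in range(p) if j not in selected]
        if not candidates:
            break
        scores = {j: ebic(y, X, selected + [j], gamma) for j in candidates}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break
        selected.append(j_best)
        best = scores[j_best]
    return selected


# Hypothetical toy data: 200 samples, 50 SNPs coded 0/1/2, two causal SNPs.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.3, size=(200, 50)).astype(float)
y = 1.0 * X[:, 3] - 0.8 * X[:, 17] + rng.normal(scale=0.5, size=200)

selected = forward_ebic(y, X)
print("selected SNP indices:", sorted(selected))
```

The extra term 2γ log C(p, k) is what separates EBIC from ordinary BIC: it penalizes the size of the model space, which matters when the number of candidate variants is large relative to the sample size.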

Identifiers

pubmed: 38982044
doi: 10.1038/s41467-024-49924-4
pii: 10.1038/s41467-024-49924-4

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

5769

Grants

Agency: U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
ID: R01 GM152814-01

Copyright information

© 2024. The Author(s).

Authors

Shuang Song (S)

Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, China.

Lijun Wang (L)

Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.

Lin Hou (L)

Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, China. houl@tsinghua.edu.cn.
MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China. houl@tsinghua.edu.cn.

Jun S Liu (JS)

Department of Statistics, Harvard University, Cambridge, MA, USA. jliu@stat.harvard.edu.
