Omnibus testing approach for gene-based gene-gene interaction.
correlated statistics
gene-gene interaction
genome-wide association studies
omnibus
replication studies
Wellcome Trust Case Control Consortium
Journal
Statistics in medicine
ISSN: 1097-0258
Abbreviated title: Stat Med
Country: England
NLM ID: 8215016
Publication information
Publication date:
10 07 2022
History:
revised: 03 03 2022
received: 12 07 2020
accepted: 04 03 2022
pubmed: 27 3 2022
medline: 22 6 2022
entrez: 26 3 2022
Status:
ppublish
Abstract
Genetic interaction is considered one of the main heritable components of complex traits. With the emergence of genome-wide association studies (GWAS), a collection of statistical methods dedicated to identifying interaction at the SNP level has been proposed. More recently, gene-based gene-gene interaction testing has emerged as an attractive alternative because it confers advantages in both statistical power and biological interpretation. Most gene-based interaction methods rely on multidimensional modeling of the interaction and therefore lack robustness against the huge space of possible interaction patterns. In this paper, we study global testing approaches to address the issue of gene-based gene-gene interaction. Based on a logistic regression modeling framework, all SNP-SNP interaction tests are combined to produce a gene-level test for interaction. We propose an omnibus test that takes advantage of (1) the heterogeneity between existing global tests and (2) the complementarity between allele-based and genotype-based coding of SNPs. An extensive simulation study demonstrates that the proposed omnibus test detects with high power the most common genetic interaction models with one causal pair, as well as more complex genetic models involving more than one causal pair. In addition, the flexibility of the proposed approach makes it robust and improves power compared with single global tests in replication studies. Furthermore, applying our procedure to real datasets confirms the adaptability of our approach in replicating various gene-gene interactions.
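The combination idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes the SNP-SNP interaction p-values for one gene pair are already available (for example from per-pair logistic regressions), and it uses two generic global tests (a Bonferroni-adjusted minimum p-value and an equal-weight Cauchy combination) as stand-ins for the global tests studied in the paper. The function names and the example p-values are hypothetical, and the allele-based versus genotype-based coding step is not covered.

```python
import math


def cauchy_combination(pvalues):
    """Equal-weight Cauchy combination of p-values (stand-in global test)."""
    # Clip so the tangent stays finite when a p-value is exactly 0 or 1.
    clipped = [min(max(p, 1e-15), 1.0 - 1e-15) for p in pvalues]
    stat = sum(math.tan((0.5 - p) * math.pi) for p in clipped) / len(clipped)
    # Upper tail of a standard Cauchy distribution evaluated at stat.
    return 0.5 - math.atan(stat) / math.pi


def bonferroni_min_p(pvalues):
    """Bonferroni-adjusted minimum p-value (stand-in global test)."""
    return min(1.0, len(pvalues) * min(pvalues))


def omnibus_gene_pair_test(snp_pair_pvalues):
    """Combine SNP-SNP p-values into global tests, then into one omnibus p-value.

    Assumption: the omnibus step here is a Cauchy combination of the global
    p-values; the paper's own omnibus construction may differ.
    """
    global_tests = {
        "min_p": bonferroni_min_p(snp_pair_pvalues),
        "cauchy": cauchy_combination(snp_pair_pvalues),
    }
    omnibus = cauchy_combination(list(global_tests.values()))
    return global_tests, omnibus


# Hypothetical p-values for all SNP-SNP pairs between two genes.
example_pvalues = [0.004, 0.21, 0.47, 0.03, 0.88, 0.65]
global_tests, omnibus_p = omnibus_gene_pair_test(example_pvalues)
print(global_tests, omnibus_p)
```

Both stand-in combinations are chosen because they can be computed without estimating the correlation between the SNP-SNP statistics. A method that models that correlation explicitly would need the genotype data rather than the p-values alone.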
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
2854-2878
Copyright information
© 2022 John Wiley & Sons Ltd.