Next-Gen GWAS: full 2D epistatic interaction maps retrieve part of missing heritability and improve phenotypic prediction.


Journal

Genome Biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
25 Mar 2024
History:
received: 13 Jul 2023
accepted: 19 Feb 2024
medline: 25 Mar 2024
pubmed: 25 Mar 2024
entrez: 25 Mar 2024
Status: epublish

Abstract

The problem of missing heritability requires the consideration of genetic interactions among different loci, called epistasis. Current GWAS statistical models require years to assess the entire combinatorial epistatic space for a single phenotype. We propose Next-Gen GWAS (NGG) that evaluates over 60 billion single nucleotide polymorphism combinatorial first-order interactions within hours. We apply NGG to Arabidopsis thaliana providing two-dimensional epistatic maps at gene resolution. We demonstrate on several phenotypes that a large proportion of the missing heritability can be retrieved, that it indeed lies in epistatic interactions, and that it can be used to improve phenotype prediction.
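
To give a sense of the scale mentioned in the abstract, the sketch below counts unordered SNP pairs and builds pairwise interaction features for a toy genotype matrix. It is an illustration under stated assumptions, not the NGG method: it assumes "first-order interactions" means unordered SNP pairs, models an interaction as the elementwise product of two genotype columns, and uses hypothetical function names. Under that reading, 60 billion pairs corresponds to roughly 3.5 × 10^5 SNPs; the actual SNP count is not given in the abstract.

```python
from itertools import combinations
from math import comb, isqrt

import numpy as np


def n_pairwise_interactions(n_snps: int) -> int:
    """Number of unordered SNP pairs, i.e. n choose 2."""
    return comb(n_snps, 2)


def min_snps_for_pairs(target_pairs: int) -> int:
    """Smallest n such that comb(n, 2) >= target_pairs."""
    # Start from the root of n(n-1)/2 = target, then correct upward.
    n = (1 + isqrt(1 + 8 * target_pairs)) // 2
    while comb(n, 2) < target_pairs:
        n += 1
    return n


def pairwise_interaction_matrix(genotypes: np.ndarray):
    """Build one feature column per SNP pair (assumed product encoding).

    genotypes has shape (n_individuals, n_snps). Column k of the result
    is genotypes[:, i] * genotypes[:, j] for the k-th pair (i, j), i < j.
    """
    n_snps = genotypes.shape[1]
    pairs = list(combinations(range(n_snps), 2))
    left = [i for i, _ in pairs]
    right = [j for _, j in pairs]
    features = genotypes[:, left] * genotypes[:, right]
    return pairs, features


if __name__ == "__main__":
    # Toy data: 2 individuals, 3 biallelic SNPs coded 0/1 (inbred lines).
    toy = np.array([[0, 1, 1],
                    [1, 1, 0]])
    pairs, features = pairwise_interaction_matrix(toy)
    print(pairs)
    print(features)
    print(min_snps_for_pairs(60_000_000_000))
```

Materializing every interaction column this way is only feasible for small inputs: the number of columns grows quadratically with the number of SNPs, which is why exhaustive pairwise scans are computationally demanding at genome scale.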

Identifiers

pubmed: 38523316
doi: 10.1186/s13059-024-03202-0
pii: 10.1186/s13059-024-03202-0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

76

Copyright information

© 2024. The Author(s).

Authors

Clément Carré (C)

BionomeeX, Montpellier, France. clement.carre@bionomeex.com.
IMAG, Univ. Montpellier, CNRS, Montpellier, France. clement.carre@bionomeex.com.
IPSiM, Univ. Montpellier, CNRS, INRAE, Montpellier, France. clement.carre@bionomeex.com.

Jean Baptiste Carluer (JB)

IMAG, Univ. Montpellier, CNRS, Montpellier, France.
IPSiM, Univ. Montpellier, CNRS, INRAE, Montpellier, France.

Christian Chaux (C)

BionomeeX, Montpellier, France.

Chad Estoup-Streiff (C)

BionomeeX, Montpellier, France.

Nicolas Roche (N)

BionomeeX, Montpellier, France.

Eric Hosy (E)

Interdisciplinary Institute for Neuroscience, University of Bordeaux, CNRS, Bordeaux, France.

André Mas (A)

BionomeeX, Montpellier, France. andre.mas@umontpellier.fr.
IMAG, Univ. Montpellier, CNRS, Montpellier, France. andre.mas@umontpellier.fr.

Gabriel Krouk (G)

BionomeeX, Montpellier, France. gkrouk@gmail.com.
IPSiM, Univ. Montpellier, CNRS, INRAE, Montpellier, France. gkrouk@gmail.com.

MeSH classifications