Semi-parametric empirical Bayes factor for genome-wide association studies.


Journal

European journal of human genetics : EJHG
ISSN: 1476-5438
Abbreviated title: Eur J Hum Genet
Country: England
NLM ID: 9302235

Publication information

Publication date:
05 2021
History:
received: 25 07 2020
accepted: 09 12 2020
revised: 02 11 2020
pubmed: 27 1 2021
medline: 18 1 2022
entrez: 26 1 2021
Status: ppublish

Abstract

Bayes factor analysis has the attractive property of accommodating the risks of both false negatives and false positives when identifying susceptibility gene variants in genome-wide association studies (GWASs). For a particular SNP, the critical aspect of this analysis is that it incorporates the probability of obtaining the observed value of a disease-association statistic under the alternative hypothesis of non-null association. An approximate Bayes factor (ABF) was proposed by Wakefield (Genetic Epidemiology 2009;33:79-86) based on a normal prior for the underlying effect-size distribution. However, misspecification of the prior can lead to failure to incorporate the probability under the alternative hypothesis correctly. In this paper, we propose a semi-parametric empirical Bayes factor (SP-EBF) based on a nonparametric effect-size distribution estimated from the data. Analysis of several GWAS datasets revealed the presence of substantial numbers of SNPs with small effect sizes, and the SP-EBF attributed much greater significance to such SNPs than the ABF. Overall, the SP-EBF incorporates an effect-size distribution that is estimated from the data, and it has the potential to improve the accuracy of Bayes factor analysis in GWASs.
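
To make the contrast in the abstract concrete, the sketch below computes a normal-prior approximate Bayes factor of the kind attributed to Wakefield, alongside a Bayes factor whose prior is a set of weighted point masses on a grid of effect sizes. It is an illustration under stated assumptions, not the authors' SP-EBF: the function names, the grid, the weights, and the example numbers are all hypothetical, and the record does not describe how the paper estimates its nonparametric effect-size distribution. Both functions return the ratio of the likelihood under the null to the likelihood under the alternative, so smaller values favour association.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def wakefield_abf(beta_hat, se, prior_var):
    """Normal-prior approximate Bayes factor, null over alternative.

    Assumes beta_hat ~ N(beta, se^2) and beta ~ N(0, prior_var) under the
    alternative, so the marginal of beta_hat is N(0, se^2 + prior_var).
    """
    v = se ** 2
    z2 = (beta_hat / se) ** 2
    shrink = prior_var / (v + prior_var)
    return math.sqrt((v + prior_var) / v) * math.exp(-0.5 * z2 * shrink)


def discrete_prior_bf(beta_hat, se, grid, weights):
    """Bayes factor (null over alternative) for a prior made of point masses.

    grid and weights are hypothetical inputs standing in for an effect-size
    distribution estimated from data; weights are normalised here.
    """
    v = se ** 2
    total = sum(weights)
    marginal_h1 = sum(
        (w / total) * normal_pdf(beta_hat, b, v) for b, w in zip(grid, weights)
    )
    return normal_pdf(beta_hat, 0.0, v) / marginal_h1


# Hypothetical example: one SNP with a small estimated log odds ratio.
beta_hat, se = 0.05, 0.02
abf = wakefield_abf(beta_hat, se, prior_var=0.04)
# A prior that puts most of its mass on small effects (made-up weights).
small_effect_bf = discrete_prior_bf(
    beta_hat, se, grid=[-0.05, 0.05, -0.2, 0.2], weights=[0.4, 0.4, 0.1, 0.1]
)
print(abf, small_effect_bf)
```

The second function reduces to the first when the grid and weights finely discretise the same normal prior, which is one way to check that the two are on the same scale before comparing them.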

Identifiers

pubmed: 33495595
doi: 10.1038/s41431-020-00800-x
pii: 10.1038/s41431-020-00800-x
pmc: PMC8110551

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

800-807

References

Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, et al. The NHGRI GWAS catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42:D1001–6.
doi: 10.1093/nar/gkt1229
Visscher PM, Brown MA, McCarthy MI, Yang J. Five years of GWAS discovery. Am J Hum Genet. 2012;90:7–24.
doi: 10.1016/j.ajhg.2011.11.029
Wasserstein RL, Lazar NA. The ASA’s statement on p-values: context, process, and purpose. Am Statistician. 2016;70:129–33.
doi: 10.1080/00031305.2016.1154108
Dudbridge F, Gusnanto A. Estimation of significance thresholds for genomewide association scans. Genet Epidemiol. 2008;32:227–34.
doi: 10.1002/gepi.20297
Pe’er I, Yelensky R, Altshuler D, Daly MJ. Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet Epidemiol. 2008;32:381–5.
doi: 10.1002/gepi.20303
Panagiotou OA, Ioannidis JPA, Genome-Wide Significance Project. What should the genome-wide significance threshold be? Empirical replication of borderline genetic associations. Int J Epidemiol. 2012;41:273–86.
doi: 10.1093/ije/dyr178
Sham PC, Purcell SM. Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet. 2014;15:335–46.
doi: 10.1038/nrg3706
Otani T, Noma H, Nishino J, Matsui S. Re-assessment of multiple testing strategies for more efficient genome-wide association studies. Eur J Hum Genet. 2018;26:1038–48.
doi: 10.1038/s41431-018-0125-3
Stahl E, Wegmann D, Trynka G, Gutierrez-Achury J, Do R, Voight BF, et al. Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis. Nat Genet. 2012;44:483–9.
doi: 10.1038/ng.2232
Ripke S, O’Dushlaine C, Chambert K, Moran JL, Kähler AK, Akterin S, et al. Genome-wide association analysis identifies 13 new risk loci for schizophrenia. Nat Genet. 2013;45:1150–9.
doi: 10.1038/ng.2742
Nishino J, Kochi Y, Shigemizu D, Kato M, Ikari K, Ochi H, et al. Empirical Bayes estimation of semi-parametric hierarchical mixture models for unbiased characterization of polygenic disease architectures. Front Genet. 2018;9:115.
doi: 10.3389/fgene.2018.00115
Stephens M, Balding DJ. Bayesian statistical methods for genetic association studies. Nat Rev Genet. 2009;10:681–90.
doi: 10.1038/nrg2615
Maller JB, McVean G, Byrnes J, Vukcevic D, Palin K, Su Z, et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat Genet. 2012;44:1294–301.
doi: 10.1038/ng.2435
Li Z, Chen J, Yu H, He L, Xu Y, Zhang D, et al. Genome-wide association analysis identifies 30 new susceptibility loci for schizophrenia. Nat Genet. 2017;49:1576–83.
doi: 10.1038/ng.3973
Robert CP. The Bayesian choice: from decision-theoretic foundations to computational implementation. New York: Springer-Verlag; 2007.
Wakefield J. Bayes factors for genome-wide association studies: comparison with P-values. Genet Epidemiol. 2009;33:79–86.
doi: 10.1002/gepi.20359
Spencer AV, Cox A, Lin WY, Easton DF, Michailidou K, Walters K. Novel Bayes factors that capture expert uncertainty in prior density specification in genetic association studies. Genet Epidemiol. 2015;39:239–48.
doi: 10.1002/gepi.21891
Spencer AV, Cox A, Lin WY, Easton DF, Michailidou K, Walters K. Incorporating functional genomic information in genetic association studies using an empirical Bayes approach. Genet Epidemiol. 2016;40:176–87.
doi: 10.1002/gepi.21956
Walters K, Cox A, Yaacob H. Using GWAS top hits to inform priors in Bayesian fine-mapping association studies. Genet Epidemiol. 2019;43:675–89.
Matsui S, Noma H. Estimating effect sizes of differentially expressed genes for power and sample-size assessments in microarray experiments. Biometrics. 2011;67:1225–35.
doi: 10.1111/j.1541-0420.2011.01618.x
Balding DJ. A tutorial on statistical methods for population association studies. Nat Rev Genet. 2006;7:781–91.
doi: 10.1038/nrg1916
Shen W, Louis TA. Empirical Bayes estimation via the smoothing by roughing approach. J Comput Graph Stat. 1999;8:800–23.
Johnson VE, Rossell D. On the use of non-local prior densities in Bayesian hypothesis tests. J R Stat Soc Ser B. 2010;72:143–70.
doi: 10.1111/j.1467-9868.2009.00730.x
Zhou X, Carbonetto P, Stephens M. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genet. 2013;9:e1003264.
doi: 10.1371/journal.pgen.1003264
Stephens M. False discovery rates: a new deal. Biostatistics. 2017;18:275–94.
Sklar P, Ripke S, Scott LJ, Andreassen OA, Cichon S, Craddock N, et al. Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nat Genet. 2011;43:977–83.
doi: 10.1038/ng.943
Charney AW, Ruderfer DM, Stahl EA, Moran JL, Chambert K, Belliveau RA, et al. Evidence for genetic heterogeneity between clinical subtypes of bipolar disorder. Transl Psychiatry. 2017;7:e993.
doi: 10.1038/tp.2016.242
Chen DT, Jiang X, Akula N, Shugart YY, Wendland JR, Steele CJM, et al. Genome-wide association study meta-analysis of European and Asian-ancestry samples identifies three novel loci associated with bipolar disorder. Mol Psychiatry. 2013;18:195–205.
doi: 10.1038/mp.2011.157
Mühleisen TW, Leber M, Schulze TG, Strohmaier J, Degenhardt F, et al. Genome-wide association study reveals two new risk loci for bipolar disorder. Nat Commun. 2014;5:3339.
doi: 10.1038/ncomms4339
Green EK, Grozeva D, Forty L, Gordon-Smith K, Russell E, et al. Association at SYNE1 in both bipolar disorder and recurrent major depression. Mol Psychiatry. 2013;18:614–7.
doi: 10.1038/mp.2012.48
Servin B, Stephens M. Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet. 2007;3:e114.
doi: 10.1371/journal.pgen.0030114
Legarra A, Ricard A, Varona L. GWAS by GBLUP: single and multimarker EMMAX and Bayes factors, with an example in detection of a major gene for horse gait. G3: Genes, Genomes, Genet. 2018;8:2301–8.
doi: 10.1534/g3.118.200336
Fernando R, Toosi A, Wolc A, Garrick D, Dekkers J. Application of whole-genome prediction methods for genome-wide association studies: a Bayesian approach. J Agric Biol Environ Stat. 2017;22:172–93.
doi: 10.1007/s13253-017-0277-6
Otani T, Noma H, Sugasawa S, Kuchiba A, Goto A, Yamaji T, et al. Exploring predictive biomarkers from clinical genome-wide association studies via multidimensional hierarchical mixture models. Eur J Hum Genet. 2019;27:140–9.
doi: 10.1038/s41431-018-0251-y

Authors

Junji Morisawa (J)

Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Japan. morisawa.junji@a.mbox.nagoya-u.ac.jp.

Takahiro Otani (T)

Department of Public Health, Graduate School of Medical Sciences, Nagoya City University, Nagoya, Japan.

Jo Nishino (J)

Division of Bioinformatics, National Cancer Center Research Institute, Tokyo, Japan.

Ryo Emoto (R)

Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Japan.

Kunihiko Takahashi (K)

Medical and Dental Data Science Center, Tokyo Medical and Dental University, Tokyo, Japan.

Shigeyuki Matsui (S)

Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Japan. smatsui@med.nagoya-u.ac.jp.
Department of Data Science, The Institute of Statistical Mathematics, Tokyo, Japan. smatsui@med.nagoya-u.ac.jp.
