BAGSE: a Bayesian hierarchical model approach for gene set enrichment analysis.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
01 03 2020
History:
received: 25 01 2019
revised: 14 10 2019
accepted: 06 11 2019
pubmed: 9 11 2019
medline: 17 9 2020
entrez: 9 11 2019
Status: ppublish

Abstract

Gene set enrichment analysis has been shown to be effective in identifying relevant biological pathways underlying complex diseases. Existing approaches lack the ability to quantify enrichment levels accurately, which prevents the enrichment information from being further utilized in both upstream and downstream analyses. A modernized and rigorous approach to gene set enrichment analysis that emphasizes both hypothesis testing and enrichment estimation is much needed. We propose a novel computational method, Bayesian Analysis of Gene Set Enrichment (BAGSE), for gene set enrichment analysis. BAGSE is built on a Bayesian hierarchical model and fully accounts for the uncertainty embedded in the association evidence of individual genes. We adopt an empirical Bayes inference framework and fit the proposed hierarchical model with an efficient EM algorithm. Through simulation studies, we illustrate that BAGSE yields accurate enrichment quantification while achieving power similar to that of state-of-the-art methods. Further simulation studies show that BAGSE can effectively utilize the enrichment information to improve the power of gene discovery. Finally, we demonstrate the application of BAGSE to real data from a differential expression experiment and a transcriptome-wide association study. Our results indicate that the proposed statistical framework is effective in aiding the discovery of potentially causal pathways and gene networks. BAGSE is implemented in the C++ programming language and is freely available from https://github.com/xqwen/bagse/. Simulated and real data used in this paper are also available at the GitHub repository for reproducibility purposes. Supplementary data are available at Bioinformatics online.
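
To make the idea of fitting an enrichment model by EM concrete, the following is a minimal illustrative sketch in Python. It is not the BAGSE implementation and does not reproduce the model in the paper. It assumes a simplified two-group setting that the abstract does not specify: each gene has a z-score, a binary gene-set annotation, and a latent association status whose prior log-odds are `alpha0 + alpha1 * annotation`, with a fixed, hypothetical effect variance `effect_var`. All function and parameter names are invented for illustration.

```python
import math
import random


def normal_pdf(x, var):
    """Density of a zero-mean normal with variance `var` at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def logit(p):
    return math.log(p / (1.0 - p))


def simulate_scores(n_genes=4000, pi_out=0.05, pi_in=0.30, effect_var=4.0, seed=1):
    """Simulate z-scores; genes inside the set are more often associated."""
    rng = random.Random(seed)
    z, d = [], []
    for i in range(n_genes):
        in_set = i % 2  # half of the genes are annotated (assumption)
        is_signal = rng.random() < (pi_in if in_set else pi_out)
        sd = math.sqrt(1.0 + effect_var) if is_signal else 1.0
        z.append(rng.gauss(0.0, sd))
        d.append(in_set)
    return z, d


def fit_enrichment_em(z, d, effect_var=4.0, n_iter=300, tol=1e-8):
    """EM for a two-group mixture with an annotation-dependent prior.

    Assumes both annotation groups (0 and 1) are non-empty.
    Returns (alpha0, alpha1, pip): baseline log-odds, log odds ratio of
    association for annotated genes, and per-gene posterior probabilities.
    """
    null_lik = [normal_pdf(zi, 1.0) for zi in z]
    alt_lik = [normal_pdf(zi, 1.0 + effect_var) for zi in z]
    pi = [0.1, 0.1]  # prior association probability for d = 0 and d = 1
    pip = [0.0] * len(z)
    for _ in range(n_iter):
        # E-step: posterior probability that each gene is associated.
        pip = [
            pi[di] * a / (pi[di] * a + (1.0 - pi[di]) * n)
            for a, n, di in zip(alt_lik, null_lik, d)
        ]
        # M-step: with a single binary annotation the logistic prior is
        # saturated, so each group's prior is the mean posterior in it.
        new_pi = []
        for group in (0, 1):
            vals = [p for p, di in zip(pip, d) if di == group]
            new_pi.append(sum(vals) / len(vals))
        converged = max(abs(a - b) for a, b in zip(pi, new_pi)) < tol
        pi = new_pi
        if converged:
            break
    alpha0 = logit(pi[0])
    alpha1 = logit(pi[1]) - alpha0
    return alpha0, alpha1, pip
```

Under these assumptions, calling `fit_enrichment_em(*simulate_scores())` returns an estimate of the enrichment parameter `alpha1` together with per-gene posterior probabilities. This mirrors, in a simplified form, the two uses of enrichment information named in the abstract: enrichment estimation and improved gene discovery.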

Identifiers

pubmed: 31702789
pii: 5614816
doi: 10.1093/bioinformatics/btz831
pmc: PMC7523653

Publication types

Journal Article; Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

1689-1695

Grants

Agency: NIGMS NIH HHS
ID: R01 GM109215
Country: United States
Agency: NHGRI NIH HHS
ID: T32 HG000040
Country: United States
Agency: NIAMS NIH HHS
ID: R01 AR042742
Country: United States

Copyright information

© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

Abhay Hukku (A)

Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.

Corbin Quick (C)

Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.

Francesca Luca (F)

Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI 48201, USA.

Roger Pique-Regi (R)

Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI 48201, USA.

Xiaoquan Wen (X)

Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
