GSEApy: a comprehensive package for performing gene set enrichment analysis in Python.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date:
01 01 2023
History:
received: 17 08 2022
revised: 04 11 2022
accepted: 22 11 2022
pubmed: 26 11 2022
medline: 4 1 2023
entrez: 25 11 2022
Status:
ppublish
Abstract
Gene set enrichment analysis (GSEA) is a commonly used algorithm for characterizing gene expression changes. However, the currently available tools used to perform GSEA have a limited ability to analyze large datasets, which is particularly problematic for the analysis of single-cell data. To overcome this limitation, we developed a GSEA package in Python (GSEApy), which can efficiently analyze large single-cell datasets.
We present a package (GSEApy) that performs GSEA in either the command line or a Python environment. GSEApy uses a Rust implementation to calculate the same enrichment statistic as GSEA for a collection of pathways. The Rust implementation of GSEApy is 3-fold faster than the NumPy version of GSEApy (v0.10.8) and uses >4-fold less memory. GSEApy also provides an interface between Python and the Enrichr web services, as well as BioMart. The Enrichr application programming interface enables GSEApy to perform over-representation analysis for an input gene list. Furthermore, GSEApy consists of several tools, each designed to facilitate a particular type of enrichment analysis.
The new GSEApy with Rust extension is deposited in PyPI: https://pypi.org/project/gseapy/. The GSEApy source code is freely available at https://github.com/zqfang/GSEApy. The documentation website is available at https://gseapy.rtfd.io/.
Supplementary data are available at Bioinformatics online.
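To make the "enrichment statistic" mentioned in the abstract concrete, the following is a minimal pure-Python sketch of the standard weighted running-sum enrichment score used in GSEA. It is an illustration under stated assumptions, not GSEApy's Rust or NumPy implementation: the function name `enrichment_score`, the `weight` parameter, and the toy gene names and scores are all hypothetical, and permutation-based significance testing and normalization are omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return the running-sum enrichment score for one gene set.

    ranked_genes: gene names sorted by descending ranking metric
    scores:       ranking metric of each gene, in the same order
    gene_set:     collection of gene names to test
    weight:       exponent applied to |score| when a gene is a hit
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    # Hits step the running sum up in proportion to |score|**weight.
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    total = sum(hit_weights)
    if total == 0:
        raise ValueError("all hit weights are zero")

    running = 0.0
    best = 0.0
    for w, is_hit in zip(hit_weights, hits):
        # Misses step the running sum down by a constant amount.
        running += w / total if is_hit else -1.0 / n_miss
        # The score is the maximum deviation of the running sum from zero.
        if abs(running) > abs(best):
            best = running
    return best


# Toy data (hypothetical): six genes ranked by a made-up metric.
ranked = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_top = enrichment_score(ranked, metric, {"A", "B"})     # set at the top of the ranking
es_bottom = enrichment_score(ranked, metric, {"E", "F"})  # set at the bottom of the ranking
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom yields a negative score. In practice the score would be compared against a null distribution, which this sketch does not attempt.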
Identifiers
pubmed: 36426870
pii: 6847088
doi: 10.1093/bioinformatics/btac757
pmc: PMC9805564
Publication types
Journal Article
Research Support, N.I.H., Extramural
Languages
eng
Citation subsets
IM
Grants
Agency: National Institute of Health
Agency: National Institute for Drug Addiction
ID: 5U01DA04439902
Copyright information
© The Author(s) 2022. Published by Oxford University Press.