Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples.
Journal
Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660
Publication information
Publication date:
30 Oct 2024
History:
received: 25 Apr 2022
accepted: 28 Mar 2024
medline: 31 Oct 2024
pubmed: 31 Oct 2024
entrez: 31 Oct 2024
Status:
epublish
Abstract
A recent study found severely inflated type I error rates for DESeq2 and edgeR, two dominant tools for differential expression analysis of RNA-seq data. Here, we show that by properly addressing the outliers in the RNA-seq data using winsorization, the type I error rates of DESeq2 and edgeR can be substantially reduced, and their power is comparable to that of the Wilcoxon rank-sum test for large datasets. Therefore, DESeq2 and edgeR may still be applied, as alternatives to the Wilcoxon rank-sum test, for differential expression analysis of large RNA-seq datasets.
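To make the idea of winsorization concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state the percentile or the exact procedure, so the per-gene upper cap, the default `upper=0.95`, and the helper names `quantile` and `winsorize_upper` are all assumptions chosen for illustration.

```python
import math

def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, 0 <= q <= 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def winsorize_upper(counts, upper=0.95):
    """Cap each gene's counts at that gene's `upper` quantile.

    `counts` is a list of rows, one row per gene and one value per sample.
    The one-sided, per-gene cap is an assumption made for this sketch.
    """
    out = []
    for gene in counts:
        cap = quantile(gene, upper)
        # Values above the cap are pulled down to it; others are unchanged.
        out.append([min(v, cap) for v in gene])
    return out

# Toy matrix: 2 genes x 5 samples; gene 0 has one extreme outlier.
toy = [[10, 12, 11, 13, 500],
       [5, 6, 7, 8, 9]]
clipped = winsorize_upper(toy, upper=0.75)
```

In this toy example the outlier of 500 is replaced by the gene's 75th-percentile value rather than being discarded, so the sample size is preserved. Capped values may be non-integer in general, so rounding may be needed before passing the matrix to a tool that expects integer counts.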
Identifiers
pubmed: 39478636
doi: 10.1186/s13059-024-03230-w
pii: 10.1186/s13059-024-03230-w
Publication types
Journal Article
Comment
Languages
eng
Citation subsets
IM
Pagination
282
Grants
Agency: NHGRI NIH HHS
ID: R21 HG011662
Country: United States
Agency: NIGMS NIH HHS
ID: R01GM144351
Country: United States
Comments and corrections
Type: CommentOn
Copyright information
© 2024. The Author(s).
References
Li YM, Ge XZ, Peng F, Li W, Li JJ. Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biol. 2022;23:1–13.
doi: 10.1186/s13059-022-02648-4
Chen J, King E, Deek R, Wei Z, Yu Y, Grill D, Ballman K. An omnibus test for differential distribution analysis of microbiome sequencing data. Bioinformatics. 2018;34:643–51.
doi: 10.1093/bioinformatics/btx650
pubmed: 29040451
Yang L, Zhang X, Chen J. Correspondence-manuscript-sourcecode. GitHub. 2024. https://github.com/chloelulu/winsorization_GB2024
Yang L, Zhang X, Chen J. Correspondence on Li Yumei et al.: Exaggerated false positives by popular differential expression methods when analyzing human population sample. Zenodo. 2024. https://doi.org/10.5281/zenodo.10871250