Response to "Neglecting normalization impact in semi-synthetic RNA-seq data simulation generates artificial false positives" and "Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples".
Journal
Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660
Publication information
Publication date:
30 Oct 2024
History:
received: 18 Jan 2024
accepted: 28 Mar 2024
medline: 31 Oct 2024
pubmed: 31 Oct 2024
entrez: 31 Oct 2024
Status:
epublish
Abstract
Two correspondences raised concerns or comments about our analyses regarding exaggerated false positives found by differential expression (DE) methods. Here, we discuss the points they raise and explain why we agree or disagree with these points. We add new analysis to confirm that the Wilcoxon rank-sum test remains the most robust method compared to the other five DE methods (DESeq2, edgeR, limma-voom, dearseq, and NOISeq) in two-condition DE analyses after considering normalization and winsorization, the data preprocessing steps discussed in the two correspondences.
Identifiers
pubmed: 39478544
doi: 10.1186/s13059-024-03232-8
pii: 10.1186/s13059-024-03232-8
Publication types
Journal Article
Letter
Languages
eng
Citation subsets
IM
Pagination
283
Copyright information
© 2024. The Author(s).
References
Li YM, Ge XZ, Peng F, Li W, Li JJ: Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biology 2022, 23.
Hejblum BP, Ba K, Thiebaut RT, Agniel D: Neglecting normalization impact in semi-synthetic RNA-seq data simulation generates artificial false positives. 2023.
Yang L, Zhang X, Chen J: Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples. 2023.
Song DY, Wang QY, Yan GA, Liu TY, Sun TY, Li JJ: scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics. Nature Biotechnology 2023.
Gauthier M, Agniel D, Thiebaut R, Hejblum BP: dearseq: a variance component score test for RNA-seq differential analysis that effectively controls the false discovery rate. Nar Genomics and Bioinformatics 2020, 2.
Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11:R25. https://doi.org/10.1186/gb-2010-11-3-r25.
doi: 10.1186/gb-2010-11-3-r25
pubmed: 20196867
Bullard JH, Purdom E, Hansen KD, Dudoit S. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics. 2010;11:94.
doi: 10.1186/1471-2105-11-94
pubmed: 20167110
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M. Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: a matter of relative size of studied transcriptomes. Commun Integr Biol. 2013;6:e25849.
doi: 10.4161/cib.25849
pubmed: 26442135
Li X, Brock GN, Rouchka EC, Cooper NGF, Wu D, O’Toole TE, Gill RS, Eteleeb AM, O’Brien L, Rai SN. A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data. PLoS ONE. 2017;12:e0176185.
doi: 10.1371/journal.pone.0176185
pubmed: 28459823
Love MI, Huber W, Anders S: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 2014, 15.
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–40.
doi: 10.1093/bioinformatics/btp616
pubmed: 19910308
Law CW, Chen YS, Shi W, Smyth GK: voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology 2014, 15.
De Schryver M, De Neve J. A tutorial on probabilistic index models: regression models for the effect size P(Y1 < Y2). Psychol Methods. 2019;24:403–18.
doi: 10.1037/met0000194
pubmed: 30265047