Response to "Neglecting normalization impact in semi-synthetic RNA-seq data simulation generates artificial false positives" and "Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples".


Journal

Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
30 Oct 2024
History:
received: 18 01 2024
accepted: 28 03 2024
medline: 31 10 2024
pubmed: 31 10 2024
entrez: 31 10 2024
Status: epublish

Abstract

Two correspondences raised concerns or comments about our analyses regarding exaggerated false positives found by differential expression (DE) methods. Here, we discuss the points they raise and explain why we agree or disagree with these points. We add new analysis to confirm that the Wilcoxon rank-sum test remains the most robust method compared to the other five DE methods (DESeq2, edgeR, limma-voom, dearseq, and NOISeq) in two-condition DE analyses after considering normalization and winsorization, the data preprocessing steps discussed in the two correspondences.
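As a rough illustration of the two steps named in the abstract, the sketch below caps the largest values of one gene's expression (one-sided winsorization) and then applies a two-sided Wilcoxon rank-sum test between two conditions. It is not the authors' pipeline: the function names, the nearest-rank quantile rule, the default cap at the 95th percentile, and the normal approximation with tie correction are all assumptions made for this example.

```python
import math


def winsorize(values, upper_quantile=0.95):
    """Cap values above an empirical quantile (one-sided winsorization).

    The nearest-rank quantile rule and the 0.95 default are assumptions.
    """
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, math.ceil(upper_quantile * len(ordered)) - 1))
    cap = ordered[idx]
    return [min(v, cap) for v in values]


def midranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic of x, p-value). No continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of equal values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # all values identical: no evidence of a shift
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical normalized expression values for one gene in two conditions.
condition_a = [12.0, 15.0, 11.0, 14.0, 13.0, 900.0]  # one extreme outlier
condition_b = [21.0, 25.0, 19.0, 23.0, 22.0, 24.0]
u_stat, p = rank_sum_test(winsorize(condition_a), winsorize(condition_b))
```

Because the rank-sum test uses only ranks, the outlier in `condition_a` has limited influence even before capping. With samples this small, an exact test would be preferable to the normal approximation used here.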

Identifiers

pubmed: 39478544
doi: 10.1186/s13059-024-03232-8
pii: 10.1186/s13059-024-03232-8

Publication types

Journal Article; Letter

Languages

eng

Citation subsets

IM

Pagination

283

Copyright information

© 2024. The Author(s).

References

Li YM, Ge XZ, Peng F, Li W, Li JJ: Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biology 2022, 23.
Hejblum BP, Ba K, Thiebaut RT, Agniel D: Neglecting normalization impact in semi-synthetic RNA-seq data simulation generates artificial false positives. 2023.
Yang L, Zhang X, Chen J: Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples. 2023.
Song DY, Wang QY, Yan GA, Liu TY, Sun TY, Li JJ: scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics. Nature Biotechnology 2023.
Gauthier M, Agniel D, Thiebaut R, Hejblum BP: dearseq: a variance component score test for RNA-seq differential analysis that effectively controls the false discovery rate. NAR Genomics and Bioinformatics 2020, 2.
Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11:R25. https://doi.org/10.1186/gb-2010-11-3-r25 .
doi: 10.1186/gb-2010-11-3-r25 pubmed: 20196867
Bullard JH, Purdom E, Hansen KD, Dudoit S. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics. 2010;11:94.
doi: 10.1186/1471-2105-11-94 pubmed: 20167110
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M. Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: a matter of relative size of studied transcriptomes. Commun Integr Biol. 2013;6:e25849.
doi: 10.4161/cib.25849 pubmed: 26442135
Li X, Brock GN, Rouchka EC, Cooper NGF, Wu D, O’Toole TE, Gill RS, Eteleeb AM, O’Brien L, Rai SN. A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data. PLoS ONE. 2017;12:e0176185.
doi: 10.1371/journal.pone.0176185 pubmed: 28459823
Love MI, Huber W, Anders S: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 2014, 15.
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–40.
doi: 10.1093/bioinformatics/btp616 pubmed: 19910308
Law CW, Chen YS, Shi W, Smyth GK: voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology 2014, 15.
De Schryver M, De Neve J. A tutorial on probabilistic index models: regression models for the effect size P(Y1 < Y2). Psychol Methods. 2019;24:403–18.
doi: 10.1037/met0000194 pubmed: 30265047

Authors

Xinzhou Ge (X)

Department of Statistics and Data Science, University of California, Los Angeles, CA, 90095, USA.
Department of Statistics, Oregon State University, Corvallis, OR, 97331, USA.

Yumei Li (Y)

Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, CA, 92697, USA.
School of Biology and Basic Medical Sciences, Soochow University, Suzhou, 215123, China.

Wei Li (W)

Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, CA, 92697, USA. wei.li@uci.edu.

Jingyi Jessica Li (JJ)

Department of Statistics and Data Science, University of California, Los Angeles, CA, 90095, USA. lijy03@g.ucla.edu.
Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA, 90095, USA. lijy03@g.ucla.edu.
Department of Human Genetics, University of California, Los Angeles, CA, 90095, USA. lijy03@g.ucla.edu.
Department of Computational Medicine, University of California, Los Angeles, CA, 90095, USA. lijy03@g.ucla.edu.
Department of Biostatistics, University of California, Los Angeles, CA, 90095, USA. lijy03@g.ucla.edu.
