Benchmarking integration of single-cell differential expression.


Journal

Nature Communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
21 March 2023
History:
received: 3 June 2022
accepted: 3 March 2023
entrez: 22 March 2023
pubmed: 23 March 2023
medline: 24 March 2023
Status: epublish

Abstract

Integrating single-cell RNA sequencing data across samples has been a major challenge in analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially affect their performance. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis when batch effects are substantial. We show that for low-depth data, single-cell techniques based on zero-inflation models degrade performance, whereas the analysis of uncorrected data using limmatrend, the Wilcoxon test and the fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes.
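
To make the idea of batch covariate modeling concrete, the following is a minimal sketch and not one of the benchmarked workflows. It fits an ordinary least-squares model to one gene's log-expression, with and without a batch indicator, on toy data where condition and batch are partly confounded. The function name `condition_effect`, the simulated effect sizes, and the unbalanced design are all assumptions made for illustration.

```python
import numpy as np


def condition_effect(y, condition, batch, adjust_for_batch=True):
    """Least-squares estimate of the condition coefficient for one gene.

    Hypothetical helper for illustration: y is log-expression per cell,
    condition is a 0/1 label, batch is an integer batch label.
    """
    columns = [np.ones_like(y), condition.astype(float)]
    if adjust_for_batch:
        # One indicator column per batch, using the first batch as reference.
        for b in np.unique(batch)[1:]:
            columns.append((batch == b).astype(float))
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]  # coefficient of the condition column


# Toy data (assumed values): two batches of 10 cells with an unbalanced
# design, so condition is partly confounded with batch.
batch = np.array([0] * 10 + [1] * 10)
condition = np.array([0] * 8 + [1] * 2 + [0] * 2 + [1] * 8)

rng = np.random.default_rng(0)
true_effect = 0.5   # assumed condition effect on log-expression
batch_shift = 2.0   # assumed batch effect, larger than the condition effect
y = 1.0 + batch_shift * batch + true_effect * condition + rng.normal(0, 0.1, 20)

naive = condition_effect(y, condition, batch, adjust_for_batch=False)
adjusted = condition_effect(y, condition, batch, adjust_for_batch=True)

print(f"without batch covariate: {naive:.2f}")
print(f"with batch covariate:    {adjusted:.2f}")
```

In this toy setting the estimate without the batch covariate absorbs part of the batch shift, while the estimate with the covariate stays close to the simulated effect. Real workflows add normalization, variance modeling and multiple-testing correction, none of which is shown here.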

Identifiers

pubmed: 36944632
doi: 10.1038/s41467-023-37126-3
pii: 10.1038/s41467-023-37126-3
pmc: PMC10030080

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1570

Copyright information

© 2023. The Author(s).

Authors

Hai C T Nguyen (HCT)

Department of Biological Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea.

Bukyung Baik (B)

Department of Biological Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea.

Sora Yoon (S)

Department of Biological Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea.
Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA.

Taesung Park (T)

Department of Statistics, Seoul National University, Seoul, 08826, Republic of Korea.
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea.

Dougu Nam (D)

Department of Biological Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea. dougnam@unist.ac.kr.
Department of Mathematical Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea. dougnam@unist.ac.kr.
