clrDV: a differential variability test for RNA-Seq data based on the skew-normal distribution.
Alzheimer’s disease
Compositional data
Differential variability
RNA-Seq data
Skew-normal distribution
Journal
PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425
Publication information
Publication date: 2023
History:
received: 2023-02-24
accepted: 2023-08-27
medline: 2023-11-01
pubmed: 2023-10-04
entrez: 2023-10-04
Status: epublish
Abstract
Background
Pathological conditions may result in certain genes having expression variance that differs markedly from that of the control. Finding such genes from gene expression data can provide invaluable candidates for therapeutic intervention. Under the dominant paradigm of modeling RNA-Seq gene counts with the negative binomial model, tests of differential variability are challenging to develop, owing to the dependence of the variance on the mean.
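To make the mean-variance dependence concrete: under the mean-dispersion parameterization commonly used for RNA-Seq counts (an assumption here, since the abstract does not state a parameterization), a negative binomial count Y with mean \mu and dispersion \phi has variance

```latex
\operatorname{Var}(Y) = \mu + \phi\,\mu^{2}, \qquad \mu > 0,\ \phi > 0
```

A shift in the mean alone therefore changes the variance, so a test of variability on the count scale has to separate the two effects.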
Methods
Here, we describe clrDV, a statistical method for detecting genes that show differential variability between two populations. We present the skew-normal distribution for modeling the gene-wise null distribution of centered log-ratio-transformed compositional RNA-Seq data.
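The two ingredients named above, the centered log-ratio (CLR) transformation and a gene-wise skew-normal fit, can be illustrated with a minimal Python sketch. This is not the clrDV implementation: the function name `clr_transform`, the pseudocount of 0.5 used to avoid taking the log of zero, and the simulated counts are all assumptions made for illustration, and the abstract does not specify how the test statistic is built from the fitted parameters.

```python
import numpy as np
from scipy import stats


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of a genes x samples count matrix.

    Each sample (column) is treated as a composition: the log counts are
    centered by the mean log count of that sample, which equals the log of
    the geometric mean. The pseudocount is an illustrative assumption.
    """
    log_counts = np.log(np.asarray(counts, dtype=float) + pseudocount)
    return log_counts - log_counts.mean(axis=0, keepdims=True)


# Simulated counts (hypothetical data): 200 genes, 60 samples.
rng = np.random.default_rng(0)
counts = rng.negative_binomial(n=5, p=0.05, size=(200, 60))

clr = clr_transform(counts)

# Fit a skew-normal distribution to the CLR values of one gene.
# scipy returns (shape, location, scale) by maximum likelihood.
shape, loc, scale = stats.skewnorm.fit(clr[0])
```

By construction the CLR values of each sample sum to zero. In a two-group comparison, one would fit each gene separately within each group and compare the fitted scale parameters; how clrDV does this is described in the paper itself, not in this sketch.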
Results
Simulation results show that clrDV has a false discovery rate and a probability of Type II error that are on par with or superior to those of existing methodologies. In addition, its run time is shorter than that of its closest competitors and remains relatively constant as the sample size per group increases. Analysis of a large neurodegenerative disease RNA-Seq dataset using clrDV successfully recovers multiple gene candidates that have been reported to be associated with Alzheimer's disease.
Identifiers
pubmed: 37790621
doi: 10.7717/peerj.16126
pii: 16126
pmc: PMC10544356
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural
Languages
eng
Citation subsets
IM
Pagination
e16126
Grants
Agency: NIA NIH HHS
ID: P50 AG016574
Country: United States
Agency: NIA NIH HHS
ID: R01 AG032990
Country: United States
Agency: NIA NIH HHS
ID: U01 AG046139
Country: United States
Agency: NIA NIH HHS
ID: R01 AG018023
Country: United States
Agency: NIA NIH HHS
ID: U01 AG006786
Country: United States
Agency: NIA NIH HHS
ID: R01 AG023571
Country: United States
Agency: NIA NIH HHS
ID: P01 AG017216
Country: United States
Agency: NINDS NIH HHS
ID: R01 NS080820
Country: United States
Agency: NINDS NIH HHS
ID: U24 NS072026
Country: United States
Agency: NIA NIH HHS
ID: P30 AG019610
Country: United States
Copyright information
©2023 Li and Khang.
Conflict of interest statement
The authors declare there are no competing interests.