Impact of adaptive filtering on power and false discovery rate in RNA-seq experiments.

Computer Simulation; RNA / genetics; RNA-Seq; Sequence Analysis, RNA / methods; Exome Sequencing

Gene expression; Gene filter; Multiple testing; Next generation sequencing

Journal

BMC bioinformatics

ISSN: 1471-2105

Abbreviated title: BMC Bioinformatics

Country: England

NLM ID: 100965194

Publication information

Publication date:
24 Sep 2022

History:

received: 28 09 2020

accepted: 13 09 2022

entrez: 24 9 2022

pubmed: 25 9 2022

medline: 28 9 2022

Status: epublish

Abstract

In RNA-sequencing studies, a large number of hypothesis tests are performed to assess the differential expression of genes between several conditions. Filtering has been proposed to remove candidate genes with a low expression level, which may not be relevant and have little or no chance of showing a difference between conditions. This step may reduce the multiple testing burden and increase power. We show in a simulation study that filtering can lead to some increase in power for RNA-sequencing data; overly aggressive filtering, however, can lead to a decline. No uniformly optimal filter in terms of power exists: depending on the scenario, different filters may be optimal. We propose an adaptive filtering strategy which selects one of several filters to maximise the number of rejections. No additional adjustment for multiplicity has to be included, but a rule has to be considered if the number of rejections is too small. For a large range of simulation scenarios, the adaptive filter maximises the power while the simulated False Discovery Rate is bounded by the pre-defined significance level. Using the adaptive filter, it is not necessary to pre-specify a single individual filtering method optimised for a specific scenario.
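The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the candidate filters are thresholds on a per-gene mean count, that the Benjamini-Hochberg procedure is the multiple testing method, and that the rule for too few rejections is a fallback to the first (pre-specified) threshold. The names `bh_reject`, `adaptive_filter` and `min_rejections` are hypothetical.

```python
import numpy as np


def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    reject = np.zeros(m, dtype=bool)
    if m == 0:
        return reject
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank that meets its bound
        reject[order[: k + 1]] = True
    return reject


def adaptive_filter(pvals, mean_counts, thresholds, alpha=0.05, min_rejections=0):
    """Choose the threshold whose filtered gene set yields the most rejections.

    Assumption: each filter keeps genes with mean_counts >= threshold.
    Assumption: if the best count is below min_rejections, the first
    threshold is used instead (a placeholder for the rule the abstract
    mentions without specifying).
    """
    pvals = np.asarray(pvals, dtype=float)
    mean_counts = np.asarray(mean_counts, dtype=float)
    masks = []
    for t in thresholds:
        keep = mean_counts >= t
        mask = np.zeros(pvals.size, dtype=bool)
        mask[keep] = bh_reject(pvals[keep], alpha)  # test only the kept genes
        masks.append(mask)
    counts = [int(mask.sum()) for mask in masks]
    best = int(np.argmax(counts))  # ties resolve to the earliest threshold
    if counts[best] < min_rejections:
        best = 0  # hypothetical fallback rule
    return thresholds[best], masks[best], dict(zip(thresholds, counts))


# Toy data: 10 highly expressed genes (5 with small p-values), 90 low-count genes.
pvals = np.array([0.001, 0.004, 0.008, 0.012, 0.02,
                  0.5, 0.6, 0.7, 0.8, 0.9] + [0.9] * 90)
mean_counts = np.array([100.0] * 10 + [1.0] * 90)

chosen, rejected, counts = adaptive_filter(pvals, mean_counts, [0, 10, 1000])
print(chosen, int(rejected.sum()), counts)
```

In this toy example, the unfiltered analysis and the overly strict threshold both yield no rejections, while the intermediate threshold yields five, which mirrors the abstract's point that filtering can raise power but overly aggressive filtering can lower it.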


Identifiers

DOI: 10.1186/s12859-022-04928-z

PMID: 36153479

PMC: PMC9509565

PII: 10.1186/s12859-022-04928-z

Chemical substances

RNA 63231-63-0

Publication types

Journal Article

Languages

eng

Pagination

388



Authors

Sonja Zehetmayer

Martin Posch

Alexandra Graf
