A guide to creating design matrices for gene expression experiments.
Design matrix
contrast matrix
gene expression analysis
model matrix
statistical models
Journal
F1000Research
ISSN: 2046-1402
Abbreviated title: F1000Res
Country: England
NLM ID: 101594320
Publication information
Publication date:
2020
History:
accepted: 26 November 2020
entrez: 19 February 2021
pubmed: 20 February 2021
medline: 18 May 2021
Status:
epublish
Abstract
Differential expression analysis of genomic data types, such as RNA-sequencing experiments, uses linear models to determine the size and direction of the changes in gene expression. For RNA-sequencing, there are several established software packages for this purpose, accompanied by well-described analysis pipelines. However, two crucial steps in the analysis process can be a stumbling block for many: setting up an appropriate model via design matrices and setting up comparisons of interest via contrast matrices. These steps are particularly troublesome because an extensive catalogue of design and contrast matrices does not currently exist. One would usually search for example case studies across different platforms and mix and match the advice from those sources to suit the dataset at hand. This article guides the reader through the basics of how to set up design and contrast matrices. We take a practical approach by providing code and a graphical representation of each case study, starting with simpler examples (e.g. models with a single explanatory variable) and moving on to more complex ones (e.g. interaction models, mixed effects models, higher order time series and cyclical models). Although our work has been written specifically with a
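As a rough illustration of the two steps the abstract names, the sketch below builds a design matrix for a single explanatory variable and a contrast comparing two groups. It is written in plain Python and is not the code provided in the article; the group labels, the function names `design_matrix` and `contrast_vector`, and the group means are all hypothetical.

```python
# Hypothetical experiment: six samples, one explanatory variable (group).
groups = ["control", "control", "control", "treated", "treated", "treated"]


def design_matrix(labels):
    """Means-model design: one 0/1 indicator column per group level."""
    levels = sorted(set(labels))
    rows = [[1 if label == level else 0 for level in levels] for label in labels]
    return levels, rows


def contrast_vector(levels, plus, minus):
    """Contrast for the comparison `plus` minus `minus`."""
    return [1 if lv == plus else -1 if lv == minus else 0 for lv in levels]


levels, design = design_matrix(groups)
contrast = contrast_vector(levels, "treated", "control")

# Hypothetical fitted group means for one gene on a log-expression scale,
# in the same order as `levels`. Applying the contrast to these coefficients
# gives the estimated log-fold change between the two groups.
group_means = [5.0, 7.5]
log_fold_change = sum(c * m for c, m in zip(contrast, group_means))

print(levels)
for row in design:
    print(row)
print(contrast, log_fold_change)
```

Each column of the design matrix corresponds to one model coefficient (here, a group mean), and the contrast is a set of weights on those coefficients that expresses the comparison of interest.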
Identifiers
pubmed: 33604029
doi: 10.12688/f1000research.27893.1
pmc: PMC7873980
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1444
Copyright information
Copyright: © 2020 Law CW et al.
Conflict of interest statement
No competing interests were disclosed.