Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data.

Keywords: RNA-Seq; batch effect; data preprocessing; differential expression; microarray; normalization; quality check; scRNA-Seq; toxicogenomics; transcriptomics

Journal

Nanomaterials (Basel, Switzerland)
ISSN: 2079-4991
Abbreviated title: Nanomaterials (Basel)
Country: Switzerland
NLM ID: 101610216

Publication information

Publication date:
08 May 2020
History:
received: 10 Mar 2020
revised: 29 Apr 2020
accepted: 04 May 2020
entrez: 14 May 2020
pubmed: 14 May 2020
medline: 14 May 2020
Status: epublish

Abstract

Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. Generating and exploiting large volumes of molecular profiles under an appropriate experimental design enables toxicogenomics (TGx) approaches to thoroughly characterise the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been suggested, yet in most cases building the optimal analytical workflow is not straightforward. The tools must be selected carefully, since the choice affects downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps, such as quality check, filtering, normalization, and batch effect detection and correction. Currently, the TGx field lacks standard guidelines for data preprocessing. Defining the optimal tools and procedures for transcriptomics data preprocessing will lead to homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for preprocessing data from three main transcriptomic technologies: microarray, bulk RNA-Sequencing (RNA-Seq), and single cell RNA-Sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and for performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics.
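
To make the filtering, normalization and differential expression steps named in the abstract more concrete, the following is a minimal sketch in plain Python. It is not the workflow described in the review: the function names, the toy count table, the counts-per-million scaling and the thresholds (`min_cpm`, `min_samples`, `pseudocount`) are all illustrative assumptions, and a real analysis would rely on dedicated, statistically grounded packages.

```python
import math

def counts_per_million(counts):
    """Scale each sample (column) so that its counts sum to one million.
    Assumes every sample has a non-zero library size."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
            for gene, row in counts.items()}

def filter_low_expression(cpm, min_cpm=1.0, min_samples=2):
    """Keep genes reaching min_cpm in at least min_samples samples
    (thresholds are arbitrary, chosen for illustration only)."""
    return {gene: row for gene, row in cpm.items()
            if sum(v >= min_cpm for v in row) >= min_samples}

def log2_transform(cpm, pseudocount=1.0):
    """Log-transform with a pseudocount to avoid log2(0)."""
    return {gene: [math.log2(v + pseudocount) for v in row]
            for gene, row in cpm.items()}

def log2_fold_change(log_expr, control_idx, treated_idx):
    """Difference of mean log2 expression, treated minus control."""
    def mean(values):
        return sum(values) / len(values)
    return {gene: mean([row[i] for i in treated_idx]) - mean([row[i] for i in control_idx])
            for gene, row in log_expr.items()}

# Hypothetical toy data: rows are genes, columns are samples
# (two controls followed by two exposed samples).
counts = {
    "geneA": [100, 120, 400, 380],
    "geneB": [0, 1, 0, 0],
    "geneC": [900, 879, 600, 620],
}

cpm = counts_per_million(counts)
kept = filter_low_expression(cpm)
logged = log2_transform(kept)
lfc = log2_fold_change(logged, control_idx=[0, 1], treated_idx=[2, 3])

for gene, value in lfc.items():
    print(gene, round(value, 2))
```

In this toy table, `geneB` is detected in only one sample and is removed by the filter, while the sign of each remaining log2 fold change indicates the direction of change after exposure. The sketch does not cover quality checks, batch effect correction or statistical testing.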

Identifiers

pubmed: 32397130
pii: nano10050903
doi: 10.3390/nano10050903
pmc: PMC7279140

Publication types

Journal Article Review

Languages

eng

Grants

Agency: Academy of Finland
ID: 322761
Agency: H2020 NanosolveIT
ID: 814572

Authors

Antonio Federico (A)

Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland.
BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland.

Angela Serra (A)

Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland.
BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland.

My Kieu Ha (MK)

Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea.
Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, Korea.
Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Korea.

Pekka Kohonen (P)

Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden.
Division of Toxicology, Misvik Biology, 20520 Turku, Finland.

Jang-Sik Choi (JS)

Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea.
Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, Korea.
Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Korea.

Irene Liampa (I)

School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece.

Penny Nymark (P)

Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden.
Division of Toxicology, Misvik Biology, 20520 Turku, Finland.

Natasha Sanabria (N)

National Institute for Occupational Health, 30333 Johannesburg, South Africa.

Luca Cattelani (L)

Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland.
BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland.

Michele Fratello (M)

Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland.
BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland.

Pia Anneli Sofia Kinaret (PAS)

Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland.
BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland.
Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland.

Karolina Jagiello (K)

QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland.
Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland karolina.jagiello@ug.edu.pl (K.J.).

Tomasz Puzyn (T)

QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland.
Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland karolina.jagiello@ug.edu.pl (K.J.).

Georgia Melagraki (G)

Nanoinformatics Department, NovaMechanics Ltd., 1065 Nicosia, Cyprus.

Mary Gulumian (M)

National Institute for Occupational Health, 30333 Johannesburg, South Africa.
Haematology and Molecular Medicine Department, School of Pathology, University of the Witwatersrand, 2050 Johannesburg, South Africa.

Antreas Afantitis (A)

Nanoinformatics Department, NovaMechanics Ltd., 1065 Nicosia, Cyprus.

Haralambos Sarimveis (H)

School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece.

Tae-Hyun Yoon (TH)

Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea.
Department of Chemistry, College of Natural Sciences, Hanyang University, Seoul 04763, Korea.
Institute of Next Generation Material Design, Hanyang University, Seoul 04763, Korea.

Roland Grafström (R)

Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden.
Division of Toxicology, Misvik Biology, 20520 Turku, Finland.

Dario Greco (D)

Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland.
BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland.
Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland.

MeSH classifications