A differential k-mer analysis pipeline for comparing RNA-Seq transcriptome and meta-transcriptome datasets without a reference.


Journal

Functional & integrative genomics
ISSN: 1438-7948
Abbreviated title: Funct Integr Genomics
Country: Germany
NLM ID: 100939343

Publication information

Publication date:
Mar 2019
History:
received: 02 08 2018
accepted: 09 11 2018
revised: 08 11 2018
pubmed: 30 11 2018
medline: 30 5 2019
entrez: 29 11 2018
Status: ppublish

Abstract

Next-generation DNA sequencing technologies, such as RNA-Seq, currently dominate genome-wide gene expression studies. A standard approach to analysing these data requires mapping sequence reads to a reference and counting the number of reads that map to each gene. However, for many transcriptome studies, a suitable reference genome is unavailable, especially for meta-transcriptome studies, which assay gene expression from mixed populations of organisms. Where a reference is unavailable, it is possible to generate one by de novo assembly of the sequence reads. However, the high cost of generating high-coverage data for de novo assembly hinders this approach. More importantly, accurate assembly of such data is challenging, especially for meta-transcriptome data, and the resulting assemblies frequently suffer from collapsed regions or chimeric sequences. As an alternative to the standard reference mapping approach, we have developed a k-mer-based analysis pipeline (DiffKAP) to identify differentially expressed reads between RNA-Seq datasets without the requirement for a reference. We compared the DiffKAP approach with the traditional Tophat/Cuffdiff method using RNA-Seq data from soybean, which has a suitable reference genome. We subsequently examined differential gene expression for a coral meta-transcriptome, for which no reference is available, and validated the results using qRT-PCR. We conclude that DiffKAP is an accurate method to study differential gene expression in complex meta-transcriptomes without the requirement of a reference genome.
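
The general idea of comparing two read sets by k-mer abundance, without a reference, can be illustrated with a minimal Python sketch. This is not the DiffKAP implementation: the function names, the pseudocount, the per-read median summary, the k-mer size and the threshold are all assumptions chosen for illustration, and a real analysis would also need quality filtering, reverse-complement handling and statistical testing.

```python
from collections import Counter
from math import log2
from statistics import median


def count_kmers(reads, k):
    """Count every overlapping k-mer across a list of read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def read_log2_ratio(read, counts_a, counts_b, total_a, total_b, k, pseudo=1.0):
    """Median log2(B/A) of normalised k-mer abundance along one read.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers that are absent from one dataset.
    """
    ratios = []
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        freq_a = (counts_a[kmer] + pseudo) / total_a
        freq_b = (counts_b[kmer] + pseudo) / total_b
        ratios.append(log2(freq_b / freq_a))
    return median(ratios) if ratios else 0.0


def differential_reads(reads_a, reads_b, k=5, min_abs_log2=1.0):
    """Return (read, log2 ratio) pairs whose k-mers differ between datasets.

    Hypothetical helper: k and min_abs_log2 are illustrative defaults only.
    """
    counts_a = count_kmers(reads_a, k)
    counts_b = count_kmers(reads_b, k)
    # Fall back to 1 so that an empty dataset does not divide by zero.
    total_a = sum(counts_a.values()) or 1
    total_b = sum(counts_b.values()) or 1
    hits = []
    for read in sorted(set(reads_a) | set(reads_b)):
        ratio = read_log2_ratio(read, counts_a, counts_b, total_a, total_b, k)
        if abs(ratio) >= min_abs_log2:
            hits.append((read, ratio))
    return hits


if __name__ == "__main__":
    # Toy data: one sequence dominates dataset A, the other dominates B.
    dataset_a = ["ACGTACGTAC"] * 8 + ["TTGCATTGCA"] * 2
    dataset_b = ["ACGTACGTAC"] * 2 + ["TTGCATTGCA"] * 8
    for read, ratio in differential_reads(dataset_a, dataset_b):
        print(read, round(ratio, 2))
```

In this toy input, the read that is more abundant in dataset A receives a negative log2 ratio and the read that is more abundant in dataset B receives a positive one. Normalising by the total k-mer count in each dataset is one simple way to account for differing sequencing depth.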

Identifiers

pubmed: 30483906
doi: 10.1007/s10142-018-0647-3
pii: 10.1007/s10142-018-0647-3

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

363-371

Authors

Chon-Kit Kenneth Chan (CK)

School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, 6009, Australia.
Australian Genome Research Facility, Melbourne, VIC, Australia.

Nedeljka Rosic (N)

School of Health and Human Sciences, Southern Cross University, Gold Coast, QLD, 4225, Australia.
Marine Ecology Research Centre, Southern Cross University, Lismore, NSW, 2480, Australia.

Michał T Lorenc (MT)

Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD, 4000, Australia.

Paul Visendi (P)

Natural Resources Institute, University of Greenwich, Chatham Maritime, Kent, ME4 4TB, UK.

Meng Lin (M)

Centre for Integrative Legume Research, School of Agriculture and Food Sciences, The University of Queensland, St. Lucia, QLD, 4067, Australia.

Paulina Kaniewska (P)

Global Change Institute, The University of Queensland, St. Lucia, QLD, 4072, Australia.

Brett J Ferguson (BJ)

Centre for Integrative Legume Research, School of Agriculture and Food Sciences, The University of Queensland, St. Lucia, QLD, 4067, Australia.

Peter M Gresshoff (PM)

Centre for Integrative Legume Research, School of Agriculture and Food Sciences, The University of Queensland, St. Lucia, QLD, 4067, Australia.

Jacqueline Batley (J)

School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, 6009, Australia.

David Edwards (D)

School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, 6009, Australia. Dave.Edwards@uwa.edu.au.
