A hybrid pipeline for reconstruction and analysis of viral genomes at multi-organ level.

JC polyomavirus; efficient pipeline; genome analysis; mitochondrial DNA; multi-organ sequencing; parvovirus B19; viral genomes

Journal

GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872

Publication information

Publication date:
2020-08-01
History:
received: 2020-01-17
revised: 2020-05-25
accepted: 2020-07-23
entrez: 2020-08-21
pubmed: 2020-08-21
medline: 2021-10-26
Status: ppublish

Abstract

BACKGROUND
Advances in sequencing technologies have enabled the characterization of multiple microbial and host genomes, opening new frontiers of knowledge while kindling novel applications and research perspectives. Among these is the investigation of the viral communities residing in the human body and their impact on health and disease. To this end, the study of samples from multiple tissues is critical, yet the complexity of such analysis calls for a dedicated pipeline. We provide an automatic and efficient pipeline for identification, assembly, and analysis of viral genomes that combines the DNA sequence data from multiple organs. TRACESPipe relies on cooperation among 3 modalities: compression-based prediction, sequence alignment, and de novo assembly. The pipeline is ultra-fast and additionally provides secure transmission and storage of sensitive data.
FINDINGS
TRACESPipe performed outstandingly when tested on synthetic and ex vivo datasets, identifying and reconstructing all the viral genomes, including those with high levels of single-nucleotide polymorphisms. It also detected minimal levels of genomic variation between different organs.
CONCLUSIONS
TRACESPipe's unique ability to simultaneously process and analyze samples from different sources enables the evaluation of within-host variability. This opens up the possibility to investigate viral tissue tropism, evolution, fitness, and disease associations. Moreover, additional features such as DNA damage estimation and mitochondrial DNA reconstruction and analysis, as well as exogenous-source controls, expand the utility of this pipeline to other fields such as forensics and ancient DNA studies. TRACESPipe is released under GPLv3 and is available for free download at https://github.com/viromelab/tracespipe.
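
The abstract names compression-based prediction as one of the three modalities but does not describe how it works. As a rough illustration of the general idea only, the sketch below ranks candidate reference genomes by normalized compression distance (NCD) to a sample. It assumes zlib as a generic stand-in compressor and is not TRACESPipe's actual method; the names `ncd`, `rank_references`, and `random_dna` are hypothetical and exist only for this example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (a crude stand-in for a DNA compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: close to 0 for similar inputs, close to 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    return (compressed_size(x + y) - min(cx, cy)) / max(cx, cy)


def rank_references(sample: bytes, references: dict[str, bytes]) -> list[tuple[str, float]]:
    """Return (name, distance) pairs sorted from most to least similar to the sample."""
    scored = [(name, ncd(sample, seq)) for name, seq in references.items()]
    return sorted(scored, key=lambda pair: pair[1])


def random_dna(length: int, seed: int) -> bytes:
    """Hypothetical helper: reproducible random DNA, used only for the demo below."""
    rng = random.Random(seed)
    return bytes(rng.choice(b"ACGT") for _ in range(length))


if __name__ == "__main__":
    sample = random_dna(2000, seed=1)
    references = {
        # Shares its first 1900 bases with the sample.
        "related": sample[:1900] + random_dna(100, seed=2),
        # Independent random sequence.
        "unrelated": random_dna(2000, seed=3),
    }
    for name, distance in rank_references(sample, references):
        print(f"{name}\t{distance:.3f}")
```

In a hybrid design of this kind, the top-ranked references would presumably be handed to the alignment step. A general-purpose compressor such as zlib only sees repeats within a 32 KiB window, so this sketch is suitable for short toy sequences rather than real sequencing data.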

Identifiers

pubmed: 32815536
pii: 5894824
doi: 10.1093/gigascience/giaa086
pmc: PMC7439602

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Copyright information

© The Author(s) 2020. Published by Oxford University Press.

Authors

Diogo Pratas (D)

Department of Virology, University of Helsinki, Haartmaninkatu 3, Helsinki, 00290, Finland.
Department of Electronics, Telecommunications and Informatics, University of Aveiro, Campus Universitario de Santiago, 3810-193 Aveiro, Portugal.
Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitario de Santiago, 3810-193 Aveiro, Portugal.

Mari Toppinen (M)

Department of Virology, University of Helsinki, Haartmaninkatu 3, Helsinki, 00290, Finland.

Lari Pyöriä (L)

Department of Virology, University of Helsinki, Haartmaninkatu 3, Helsinki, 00290, Finland.

Klaus Hedman (K)

Department of Virology, University of Helsinki, Haartmaninkatu 3, Helsinki, 00290, Finland.
HUSLAB, Helsinki University Hospital, Topeliuksenkatu 32, 00290 Helsinki, Finland.

Antti Sajantila (A)

Department of Forensic Medicine, University of Helsinki, Kytösuontie 11, 00300, Helsinki, Finland.
Forensic Medicine Unit, Finnish Institute of Health and Welfare, PO Box 30 FI-00271 Helsinki, Finland.

Maria F Perdomo (MF)

Department of Virology, University of Helsinki, Haartmaninkatu 3, Helsinki, 00290, Finland.
