WGA-LP: a pipeline for whole genome assembly of contaminated reads.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
12 01 2022
History:
received: 27 07 2021
revised: 22 09 2021
accepted: 15 10 2021
pubmed: 21 10 2021
medline: 3 2 2023
entrez: 20 10 2021
Status: ppublish

Abstract

Whole genome assembly (WGA) of bacterial genomes from short reads has become a common task as advances in sequencing technology have made DNA sequencing cheaper. There is no absolute gold standard for assembling a genome: the process requires a sequence of steps, each of which can involve combinations of many different tools. However, the quality of the final assembly is always strongly related to the quality of the input data. With this in mind, we built WGA-LP, a package that connects state-of-the-art programs for microbial analysis with novel scripts to check and improve the quality of both samples and resulting assemblies. With its conservative decontamination approach, WGA-LP has been shown to be capable of creating high-quality assemblies even from contaminated reads. WGA-LP is available on GitHub (https://github.com/redsnic/WGA-LP) and Docker Hub (https://hub.docker.com/r/redsnic/wgalp). The web app for node visualization is hosted on shinyapps.io (https://redsnic.shinyapps.io/ContigCoverageVisualizer/). Supplementary data are available at Bioinformatics online.

Identifiers

pubmed: 34668528
pii: 6404579
doi: 10.1093/bioinformatics/btab719

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

846-848

Grants

Agency: Ministero dell'Università e della Ricerca
ID: RBFR107VML

Copyright information

© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

N Rossi (N)

Department of Mathematics, Computer Science, and Physics, University of Udine, 33100 Udine, Italy.

A Colautti (A)

Dipartimento di Scienze Agroalimentari, Ambientali e Animali, University of Udine, 33100 Udine, Italy.

L Iacumin (L)

Dipartimento di Scienze Agroalimentari, Ambientali e Animali, University of Udine, 33100 Udine, Italy.

C Piazza (C)

Department of Mathematics, Computer Science, and Physics, University of Udine, 33100 Udine, Italy.
