PaCBAM: fast and scalable processing of whole exome and targeted sequencing data.


Journal

BMC Genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
26 Dec 2019
History:
received: 27 Mar 2019
accepted: 11 Dec 2019
entrez: 28 Dec 2019
pubmed: 28 Dec 2019
medline: 21 Apr 2020
Status: epublish

Abstract

Interrogation of whole exome and targeted sequencing NGS data is rapidly becoming a preferred approach for exploring large cohorts, both in the research setting and, importantly, in the context of precision medicine. Data retrieval and processing at the single-base and genomic-region level still constitute major bottlenecks in NGS data analysis, so fast and scalable tools are needed. PaCBAM is a command line tool written in C and designed for the characterization of genomic regions and single nucleotide positions from whole exome and targeted sequencing data. PaCBAM computes depth of coverage and allele-specific pileup statistics, implements a fast and scalable multi-core computational engine, introduces an efficient on-the-fly strategy for filtering duplicate reads, and provides comprehensive text output files and visual reports. We demonstrate that PaCBAM exploits parallel computation resources better than existing tools, resulting in substantial reductions in processing time and memory usage and enabling fast, efficient exploration of large datasets. PaCBAM is a fast and scalable tool designed to process genomic regions from NGS data files and generate comprehensive coverage and pileup statistics for downstream analysis. The tool can be easily integrated into NGS processing pipelines and is available from Bitbucket and Docker/Singularity hubs.
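
To make the terms in the abstract concrete, here is a minimal Python sketch of depth of coverage, allele-specific pileup counts, and duplicate filtering on toy data. It is not PaCBAM's implementation (the tool is written in C and reads NGS data files); the read representation, the function names, and the duplicate key (same start, strand, and length) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy reads: (reference start, strand, bases).
# Assumption: gapless alignments, so base i of a read sits at start + i.
READS = [
    (100, "+", "ACGT"),
    (100, "+", "ACGT"),  # same start/strand/length as the read above
    (101, "-", "CGTA"),
    (102, "+", "TTAC"),
]


def drop_duplicates(reads):
    """Yield reads one by one, skipping any whose (start, strand, length)
    key has already been seen. The key is an illustrative assumption."""
    seen = set()
    for start, strand, bases in reads:
        key = (start, strand, len(bases))
        if key in seen:
            continue
        seen.add(key)
        yield start, strand, bases


def pileup(reads, region_start, region_end):
    """Count each base at every position in [region_start, region_end)."""
    counts = defaultdict(Counter)
    for start, _strand, bases in reads:
        for offset, base in enumerate(bases):
            pos = start + offset
            if region_start <= pos < region_end:
                counts[pos][base] += 1
    return counts


def depth(counts, pos):
    """Depth of coverage at pos: total number of bases observed there."""
    return sum(counts.get(pos, Counter()).values())


if __name__ == "__main__":
    counts = pileup(drop_duplicates(READS), 100, 106)
    for pos in sorted(counts):
        print(pos, depth(counts, pos), dict(counts[pos]))
```

Because `drop_duplicates` is a generator, duplicates are discarded while reads stream into `pileup` rather than in a separate pass, which is one plausible reading of "on-the-fly" filtering.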

Identifiers

pubmed: 31878881
doi: 10.1186/s12864-019-6386-6
pii: 10.1186/s12864-019-6386-6
pmc: PMC6933905

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

1018

Authors

Samuel Valentini (S)

Laboratory of Bioinformatics and Computational Genomics, Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy.

Tarcisio Fedrizzi (T)

Laboratory of Computational and Functional Oncology, Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy.

Francesca Demichelis (F)

Laboratory of Computational and Functional Oncology, Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy.

Alessandro Romanel (A)

Laboratory of Bioinformatics and Computational Genomics, Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy. alessandro.romanel@unitn.it.
