PaCBAM: fast and scalable processing of whole exome and targeted sequencing data.

MeSH terms: Software; Time Factors; Exome Sequencing / methods

Abstract

BACKGROUND: Interrogation of whole-exome and targeted next-generation sequencing (NGS) data is rapidly becoming a preferred approach for exploring large cohorts in research and, importantly, in precision medicine. Retrieving and processing data at the single-base and genomic-region level remain major bottlenecks in NGS data analysis, so fast and scalable tools are needed.

RESULTS: PaCBAM is a command-line tool written in C for characterizing genomic regions and single-nucleotide positions from whole-exome and targeted sequencing data. It computes depth of coverage and allele-specific pileup statistics, implements a fast and scalable multi-core computational engine, introduces an efficient on-the-fly strategy for filtering duplicate reads, and produces comprehensive text output files and visual reports. We demonstrate that PaCBAM exploits parallel computation resources better than existing tools, substantially reducing processing time and memory usage and thereby enabling fast and efficient exploration of large datasets.

CONCLUSIONS: PaCBAM is a fast and scalable tool that processes genomic regions from NGS data files and generates comprehensive coverage and pileup statistics for downstream analysis. It can be easily integrated into NGS processing pipelines and is available from Bitbucket and from Docker/Singularity hubs.
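To make the per-position statistics named in the abstract concrete, here is a minimal Python sketch of depth of coverage and allele-specific pileup counting with duplicates skipped while reads are streamed. It is not PaCBAM's implementation (the tool is written in C and works on NGS alignment files). The read layout `(start, strand, sequence)`, the function names `pileup` and `depth`, the restriction to ungapped alignments, and the duplicate criterion (same start, strand and read length) are all assumptions chosen for illustration; PaCBAM's actual criterion may differ.

```python
from collections import Counter


def pileup(reads, region_start, region_end, dedup=True):
    """Count bases per position in the half-open region [region_start, region_end).

    `reads` is an iterable of (start, strand, sequence) tuples with 0-based
    start coordinates. Alignments are assumed to be ungapped (illustrative
    simplification: no CIGAR handling).
    """
    counts = {pos: Counter() for pos in range(region_start, region_end)}
    seen = set()  # keys of reads already counted, used for on-the-fly filtering
    for start, strand, seq in reads:
        # Assumed duplicate criterion: same start, strand and read length.
        key = (start, strand, len(seq))
        if dedup:
            if key in seen:
                continue  # skip the duplicate without a separate pre-pass
            seen.add(key)
        for offset, base in enumerate(seq):
            pos = start + offset
            if region_start <= pos < region_end:
                counts[pos][base] += 1
    return counts


def depth(counts):
    """Depth of coverage per position: the sum of all allele counts."""
    return {pos: sum(alleles.values()) for pos, alleles in counts.items()}


# Hypothetical toy reads: the second read duplicates the first.
reads = [
    (100, "+", "ACGT"),
    (100, "+", "ACGT"),
    (102, "-", "GTAA"),
    (101, "+", "CATT"),
]
counts = pileup(reads, 100, 104)
print(depth(counts))        # {100: 1, 101: 2, 102: 3, 103: 3}
print(dict(counts[102]))    # {'G': 2, 'A': 1}
```

At position 102 the sketch reports two reads supporting G and one supporting A, the kind of allele-specific count from which an allelic fraction (here 1/3 for A) can be derived downstream. Filtering inside the counting loop avoids a separate duplicate-marking pass over the data, which is the general idea behind an on-the-fly strategy.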

Identifiers

DOI: 10.1186/s12864-019-6386-6

PMID: 31878881

PMC: PMC6933905

PII: 10.1186/s12864-019-6386-6

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

1018

Authors

Samuel Valentini

Tarcisio Fedrizzi

Francesca Demichelis

Alessandro Romanel

MeSH classifications