SRA Down Under: Cache and Analysis Platform for Infectious Disease.

Cromwell MeDiCI QRIScloud Queensland Genomics SRA

Journal

Studies in health technology and informatics
ISSN: 1879-8365
Abbreviated title: Stud Health Technol Inform
Country: Netherlands
NLM ID: 9214582

Publication information

Publication date:
08 Aug 2019
History:
entrez: 10 8 2019
pubmed: 10 8 2019
medline: 11 9 2019
Status: ppublish

Abstract

The Sequence Read Archive (SRA), maintained by NCBI, is a valuable resource holding a near-definitive collection of the world's sequenced reads for academic purposes. Increasingly, these reads are being used for both basic research and clinical investigations. When time is a critical factor in analysis, such as during bacterial outbreaks, the geographical separation between Australia and the offshore NCBI SRA servers can result in significant delays that may have adverse clinical outcomes. To address this, Queensland Genomics commissioned a pilot program to establish a local Australian SRA cache. The pilot used the hosting capabilities of the NeCTAR Research Cloud, QRIScloud's HTC infrastructure, and the MeDiCI data fabric for storage, together with a software stack comprising Cromwell for workflow management, a PostgreSQL database for sample and job metadata, and a coordinating Python Flask application. With these components, a local cache of seventeen bacterial species was established. Furthermore, the workflow capabilities of Cromwell were leveraged to provide analysis solutions for cached sample data, including quality control and taxonomic profiling, as well as individual and multiple-sample analysis. In moving toward a broader rollout covering more bacterial species, it was found that the initial storage estimate did not keep pace with the exponential increase in sequencing reads uploaded to NCBI SRA. While this growth highlights the increasing availability and importance of such data in modern research, the storage shortfall will need to be addressed.

Identifiers

pubmed: 31397305
pii: SHTI190776
doi: 10.3233/SHTI190776

Publication types

Journal Article

Languages

eng

Pagination

76-82

Authors

Thom Cuddihy (T)

QFAB Bioinformatics - Research Computing Centre, University of Queensland, Australia.

Brian Forde (B)

School of Chemistry and Molecular Biosciences, University of Queensland, Australia.

Nicholas Rhodes (N)

QFAB Bioinformatics - Institute for Molecular Bioscience, University of Queensland, Australia.

David Paterson (D)

UQ Centre for Clinical Research, University of Queensland, Australia.

Dominique Gorse (D)

QFAB Bioinformatics - Institute for Molecular Bioscience, University of Queensland, Australia.

Scott Beatson (S)

School of Chemistry and Molecular Biosciences, University of Queensland, Australia.

Patrick Harris (P)

UQ Centre for Clinical Research, University of Queensland, Australia.
