HaTSPiL: A modular pipeline for high-throughput sequencing data analysis.


Journal

PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081

Publication information

Publication date:
2019
History:
received: 25 03 2019
accepted: 30 08 2019
entrez: 16 10 2019
pubmed: 16 10 2019
medline: 11 03 2020
Status: epublish

Abstract

Next-generation sequencing (NGS) methods are widely adopted for a broad range of scientific purposes, from pure research to health-related studies. Decreasing costs per analysis have led to large amounts of generated data and to the subsequent improvement of software for the respective analyses. As a consequence, many approaches have been developed to chain different software tools into reliable and reproducible workflows. However, the wide range of applications for NGS approaches makes it challenging to manage many different workflows without losing reliability. We here present a high-throughput sequencing pipeline (HaTSPiL), a Python-powered CLI tool designed to handle different approaches to data analysis with a high level of reliability. The software relies on the barcoding of filenames using a human-readable naming convention that contains all the information about the sample that the software needs to automatically choose different workflows and parameters. HaTSPiL is highly modular and customisable, allowing users to extend its features for any specific need. HaTSPiL is licensed as Free Software under the MIT license and is available at https://github.com/dodomorandi/hatspil.
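
The filename-barcoding idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the naming convention, so everything here is an assumption made for illustration: the `<project>-<sample>-<analyte>.<extension>` layout, the `parse_barcode` and `choose_workflow` helpers, and the workflow names are hypothetical and are not HaTSPiL's actual format or API.

```python
import re
from typing import NamedTuple

# Hypothetical convention, NOT HaTSPiL's actual barcode format:
#   <project>-<sample>-<analyte>.<extension>
# where <analyte> is either "DNA" or "RNA".
BARCODE_RE = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)-(?P<sample>[A-Za-z0-9]+)"
    r"-(?P<analyte>DNA|RNA)\.(?P<ext>.+)$"
)


class Barcode(NamedTuple):
    """Sample information recovered from a filename."""
    project: str
    sample: str
    analyte: str


# Hypothetical mapping from analyte type to a workflow name.
WORKFLOWS = {
    "DNA": "variant_calling",
    "RNA": "expression_quantification",
}


def parse_barcode(filename: str) -> Barcode:
    """Extract the sample fields encoded in the filename."""
    match = BARCODE_RE.match(filename)
    if match is None:
        raise ValueError(f"filename does not follow the convention: {filename!r}")
    return Barcode(match["project"], match["sample"], match["analyte"])


def choose_workflow(filename: str) -> str:
    """Pick a workflow based only on what the filename encodes."""
    return WORKFLOWS[parse_barcode(filename).analyte]


if __name__ == "__main__":
    for name in ("PROJ1-S001-DNA.fastq.gz", "PROJ1-S002-RNA.fastq.gz"):
        print(name, "->", choose_workflow(name))
```

Because the sample metadata travels with the file itself, a pipeline built this way can select a workflow without a separate sample sheet, and a malformed filename fails early with an explicit error instead of being processed with the wrong parameters.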

Identifiers

pubmed: 31613890
doi: 10.1371/journal.pone.0222512
pii: PONE-D-19-08543
pmc: PMC6793853

Chemical substances

DNA 9007-49-2

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e0222512

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Edoardo Morandi (E)

Department of Life Sciences and System Biology, University of Turin, Turin, Italy.
Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Matteo Cereda (M)

Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Danny Incarnato (D)

Department of Life Sciences and System Biology, University of Turin, Turin, Italy.
Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Caterina Parlato (C)

Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Giulia Basile (G)

Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Francesca Anselmi (F)

Department of Life Sciences and System Biology, University of Turin, Turin, Italy.
Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Andrea Lauria (A)

Department of Life Sciences and System Biology, University of Turin, Turin, Italy.
Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Lisa Marie Simon (LM)

Department of Life Sciences and System Biology, University of Turin, Turin, Italy.
Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Isabelle Laurence Polignano (I)

Department of Life Sciences and System Biology, University of Turin, Turin, Italy.

Francesca Arruga (F)

Italian Institute for Genomic Medicine (IIGM), Turin, Italy.

Silvia Deaglio (S)

Italian Institute for Genomic Medicine (IIGM), Turin, Italy.
Department of Medical Sciences, University of Turin, Turin, Italy.

Elisa Tirtei (E)

Paediatric Onco-Haematology, Stem Cell Transplantation and Cellular Therapy Division, City of Science and Health of Turin, Regina Margherita Children's Hospital, Turin, Italy.

Franca Fagioli (F)

Paediatric Onco-Haematology, Stem Cell Transplantation and Cellular Therapy Division, City of Science and Health of Turin, Regina Margherita Children's Hospital, Turin, Italy.

Salvatore Oliviero (S)

Department of Life Sciences and System Biology, University of Turin, Turin, Italy.
Italian Institute for Genomic Medicine (IIGM), Turin, Italy.
