SLIM: a flexible web application for the reproducible processing of environmental DNA metabarcoding data.
Amplicon sequencing
High-throughput sequencing
Molecular ecology
Pipeline
Reproducibility
eDNA metabarcoding
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date:
19 Feb 2019
History:
received: 20 Sep 2018
accepted: 30 Jan 2019
entrez: 21 Feb 2019
pubmed: 21 Feb 2019
medline: 19 Mar 2019
Status:
epublish
Abstract
High-throughput amplicon sequencing of environmental DNA (eDNA metabarcoding) has become a routine tool for biodiversity surveys and ecological studies. By including sample-specific tags in the primers prior to PCR amplification, it is possible to multiplex hundreds of samples in a single sequencing run. Analysing millions of sequences spread across hundreds to thousands of samples calls for efficient, automated yet flexible analysis pipelines. Various algorithms and software tools have been developed to perform one or more processing steps, such as paired-end read assembly, chimera filtering, Operational Taxonomic Unit (OTU) clustering and taxonomic assignment. Some of these tools are now well established and widely used by scientists as part of their workflows. Wrappers capable of processing metabarcoding data from raw sequencing data to an annotated OTU-to-sample matrix have also been developed to facilitate the analysis for non-specialist users. Yet most of them require basic bioinformatics or command-line knowledge, which can limit the accessibility of such integrative toolkits. Furthermore, for the sake of flexibility, these tools have adopted a step-by-step approach, which can make the workflow hard to automate and hence hamper the reproducibility of the analysis.

We introduce SLIM, an open-source web application that simplifies the creation and execution of metabarcoding data processing pipelines through an intuitive graphical user interface (GUI). The GUI interacts with well-established software and their associated parameters, so that the processing steps are performed seamlessly from the raw sequencing data to an annotated OTU-to-sample matrix. Thanks to a module-centered organization, SLIM can be used for a wide range of metabarcoding cases, and can also be extended by developers for custom needs or for the integration of new software. The pipeline configuration (i.e. the chaining of modules and all their parameters) is stored in a file that can be used to reproduce the same analysis.

This web application has been designed to be user-friendly for non-specialists, yet flexible, with advanced settings and extensibility for advanced users and bioinformaticians. The source code, along with full documentation, is available on the GitHub repository ( https://github.com/yoann-dufresne/SLIM ) and a demonstration server is accessible through the application website ( https://trtcrd.github.io/SLIM/ ).
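As an illustration of why storing the module chain and its parameters in a file supports reproducibility, here is a minimal Python sketch. It does not reflect SLIM's actual configuration format or code: the module names (`length_filter`, `dereplicate`), the `steps`/`params` keys and the `run_pipeline` helper are all hypothetical, chosen only to show the idea of replaying a saved configuration.

```python
import json

# Hypothetical toy modules standing in for real processing steps.
# Names and parameters are illustrative assumptions, not SLIM's.
def length_filter(seqs, min_length):
    """Keep only sequences at least min_length bases long."""
    return [s for s in seqs if len(s) >= min_length]

def dereplicate(seqs):
    """Drop repeated sequences, keeping the first occurrence in order."""
    return list(dict.fromkeys(seqs))

# Registry mapping module names (as written in the config) to functions.
MODULES = {"length_filter": length_filter, "dereplicate": dereplicate}

def run_pipeline(config, seqs):
    """Apply each configured module in order, passing its parameters."""
    for step in config["steps"]:
        module = MODULES[step["module"]]
        seqs = module(seqs, **step.get("params", {}))
    return seqs

# The configuration records both the chaining and every parameter.
config = {
    "steps": [
        {"module": "length_filter", "params": {"min_length": 4}},
        {"module": "dereplicate"},
    ]
}

reads = ["ACGTA", "ACG", "ACGTA", "TTGCA"]
first = run_pipeline(config, reads)

# Serialising the configuration and loading it back replays the same analysis.
saved = json.dumps(config, indent=2)
replayed = run_pipeline(json.loads(saved), reads)
```

Because the saved text fully determines which modules run and with which parameters, rerunning it on the same input gives the same result, which is the property the abstract attributes to the stored pipeline configuration.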
Identifiers
pubmed: 30782112
doi: 10.1186/s12859-019-2663-2
pii: 10.1186/s12859-019-2663-2
pmc: PMC6381720
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
88
Grants
Agency: Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung
ID: 31003A_159709
Agency: Swiss Network of International Studies
ID: Monitoring marine biodiversity in the genomic era