ATLAS: a Snakemake workflow for assembly, annotation, and genomic binning of metagenome sequence data.
Analysis workflow
Annotation
Metagenome-assembled genomes
Metagenomics
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date:
22 Jun 2020
History:
received: 23 May 2019
accepted: 8 Jun 2020
entrez: 24 Jun 2020
pubmed: 24 Jun 2020
medline: 11 Aug 2020
Status: epublish
Abstract
BACKGROUND
Metagenomics studies provide valuable insight into the composition and function of microbial populations from diverse environments; however, the data processing pipelines that rely on mapping reads to gene catalogs or genome databases for cultured strains yield results that underrepresent the genes and functional potential of uncultured microbes. Recent improvements in sequence assembly methods have eased the reliance on genome databases, thereby allowing the recovery of genomes from uncultured microbes. However, configuring these tools, linking them with advanced binning and annotation tools, and maintaining provenance of the processing continue to be challenging for researchers.
RESULTS
Here we present ATLAS, a software package for customizable data processing from raw sequence reads to functional and taxonomic annotations using state-of-the-art tools to assemble, annotate, quantify, and bin metagenome data. Abundance estimates at genome resolution are provided for each sample in a dataset. ATLAS is written in Python and the workflow is implemented in Snakemake; it operates in a Linux environment and is compatible with Python 3.5+ and Anaconda 3+ versions. The source code for ATLAS is freely available, distributed under a BSD-3 license.
CONCLUSIONS
ATLAS provides a user-friendly, modular, and customizable Snakemake workflow for metagenome data processing; it is easily installable with conda and maintained as open source on GitHub at https://github.com/metagenome-atlas/atlas.
Identifiers
pubmed: 32571209
doi: 10.1186/s12859-020-03585-4
pii: 10.1186/s12859-020-03585-4
pmc: PMC7310028
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
257
Grants
Agency: Pacific Northwest National Laboratory LDRD program
ID: Microbiomes in Transition Initiative
Agency: European Research Council
ID: ERC-COG-2018
Country: International