Targeted domain assembly for fast functional profiling of metagenomic datasets with S3A.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date: 2020-07-01
History:
received: 2019-10-16
revised: 2020-04-11
accepted: 2020-04-17
pubmed: 2020-04-25
medline: 2020-12-29
entrez: 2020-04-25
Status: ppublish
Abstract
Understanding the ever-increasing number of metagenomic sequences accumulating in our databases demands approaches that rapidly 'explore' the content of multiple and/or large metagenomic datasets with respect to specific domain targets, avoiding full domain annotation and full assembly. S3A is a fast and accurate domain-targeted assembler designed for rapid functional profiling. It is based on a novel construction and a fast traversal of the Overlap-Layout-Consensus graph, designed to reconstruct coding regions from domain-annotated metagenomic sequence reads. S3A relies on high-quality domain annotation to efficiently assemble metagenomic sequences and on a new confidence measure for the fast evaluation of overlapping reads. Its implementation is highly generic and can be applied to any type of annotation. On simulated data, S3A achieves a level of accuracy similar to that of classical metagenomic assembly tools while enabling faster and sensitive profiling of domains of interest. When studying a few dozen functional domains (a typical scenario), S3A is up to an order of magnitude faster than general-purpose metagenomic assemblers, thus enabling the analysis of a larger number of datasets in the same amount of time. S3A opens new avenues for the fast exploration of the rapidly increasing number of ever-larger metagenomic datasets. S3A is available at http://www.lcqb.upmc.fr/S3A_ASSEMBLER/. Supplementary data are available at Bioinformatics online.
Identifiers
pubmed: 32330240
pii: 5824791
doi: 10.1093/bioinformatics/btaa272
pmc: PMC7332565
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
3975-3981
Copyright information
© The Author(s) 2020. Published by Oxford University Press.