binny: an automated binning algorithm to recover high-quality genomes from complex metagenomic datasets.

Keywords: MAGs; dimensionality reduction; embedding; iterative clustering; marker gene sets; metagenome-assembled genome; t-SNE

Journal

Briefings in Bioinformatics
ISSN: 1477-4054
Abbreviated title: Brief Bioinform
Country: England
NLM ID: 100912837

Publication information

Publication date:
2022-11-19
History:
received: 2022-06-09
revised: 2022-09-03
accepted: 2022-09-06
pubmed: 2022-10-15
medline: 2022-11-24
entrez: 2022-10-14
Status: ppublish

Abstract

The reconstruction of genomes is a critical step in genome-resolved metagenomics and in multi-omic data integration from microbial communities. Here, we present binny, a binning tool that produces high-quality metagenome-assembled genomes (MAGs) from both contiguous and highly fragmented genomes. Based on established metrics, binny outperforms or is highly competitive with commonly used and state-of-the-art binning methods, and it finds unique genomes that other methods could not detect. binny uses k-mer composition and coverage by metagenomic reads for iterative, nonlinear dimension reduction of genomic signatures, followed by automated contig clustering with cluster assessment using lineage-specific marker gene sets. When compared with seven widely used binning algorithms, binny provides a substantial number of uniquely identified MAGs and almost always recovers the most near-complete ($>95\%$ pure, $>90\%$ complete) and high-quality ($>90\%$ pure, $>70\%$ complete) genomes from simulated datasets from the Critical Assessment of Metagenome Interpretation initiative. It also recovers substantially more high-quality draft genomes, as defined by the Minimum Information about a Metagenome-Assembled Genome standard, from a real-world benchmark comprising metagenomes from various environments than any other tested method.
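
The two quality tiers named in the abstract can be made concrete with a short sketch. This is not binny's code: the names `BinQuality` and `quality_tier` are hypothetical, the purity and completeness values are invented for illustration (they are not results from the paper), and both metrics are assumed to be expressed as fractions between 0 and 1. Only the thresholds come from the abstract. Because every near-complete genome also meets the high-quality thresholds, the function reports the strictest tier that applies.

```python
from typing import NamedTuple


class BinQuality(NamedTuple):
    """Hypothetical container for the two metrics used in the abstract."""
    purity: float        # fraction in [0, 1]
    completeness: float  # fraction in [0, 1]


def quality_tier(q: BinQuality) -> str:
    """Return the strictest tier from the abstract that the bin satisfies."""
    # Near-complete: > 95% pure and > 90% complete.
    if q.purity > 0.95 and q.completeness > 0.90:
        return "near-complete"
    # High-quality: > 90% pure and > 70% complete.
    if q.purity > 0.90 and q.completeness > 0.70:
        return "high-quality"
    return "other"


# Made-up example bins, for illustration only.
bins = {
    "bin_a": BinQuality(purity=0.98, completeness=0.93),
    "bin_b": BinQuality(purity=0.92, completeness=0.75),
    "bin_c": BinQuality(purity=0.85, completeness=0.95),
}

tiers = {name: quality_tier(q) for name, q in bins.items()}
print(tiers)
```

The comparisons are strict, matching the "greater than" thresholds in the abstract, so a bin with exactly 95% purity would not count as near-complete.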

Identifiers

pubmed: 36239393
pii: 6760137
doi: 10.1093/bib/bbac431
pmc: PMC9677464

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Grants

Agency: National Research Fund
ID: PRIDE/11823097
Agency: European Research Council
ID: ERC-CoG 863664
Country: International

Copyright information

© The Author(s) 2022. Published by Oxford University Press.
