binny: an automated binning algorithm to recover high-quality genomes from complex metagenomic datasets.
MAGs
dimensionality reduction
embedding
iterative clustering
marker gene sets
metagenome-assembled genome
t-SNE
Journal
Briefings in bioinformatics
ISSN: 1477-4054
Abbreviated title: Brief Bioinform
Country: England
NLM ID: 100912837
Publication information
Publication date:
19 11 2022
History:
received: 09 06 2022
revised: 03 09 2022
accepted: 06 09 2022
pubmed: 15 10 2022
medline: 24 11 2022
entrez: 14 10 2022
Status:
ppublish
Abstract
The reconstruction of genomes is a critical step in genome-resolved metagenomics and for multi-omic data integration from microbial communities. Here, we present binny, a binning tool that produces high-quality metagenome-assembled genomes (MAGs) from both contiguous and highly fragmented genomes. Based on established metrics, binny outperforms or is highly competitive with commonly used and state-of-the-art binning methods and finds unique genomes that could not be detected by other methods. binny uses k-mer composition and coverage by metagenomic reads for iterative, nonlinear dimension reduction of genomic signatures, followed by automated contig clustering in which clusters are assessed using lineage-specific marker gene sets. When compared with seven widely used binning algorithms, binny provides substantial numbers of uniquely identified MAGs and almost always recovers the most near-complete ($\gt 95\%$ pure, $\gt 90\%$ complete) and high-quality ($\gt 90\%$ pure, $\gt 70\%$ complete) genomes from simulated datasets from the Critical Assessment of Metagenome Interpretation initiative. On a real-world benchmark composed of metagenomes from various environments, binny also recovers substantially more high-quality draft genomes, as defined by the Minimum Information about a Metagenome-Assembled Genome standard, than any other tested method.
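Two ideas from the abstract can be made concrete with a minimal Python sketch: the k-mer composition of a contig, and the purity/completeness thresholds quoted for near-complete and high-quality genomes. This is an illustration only, not binny's implementation. The function names `kmer_profile` and `quality_tier`, the default `k = 4`, and the choice to skip windows containing ambiguous bases are all assumptions made for the example; the abstract does not specify them.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence: str, k: int = 4) -> dict[str, float]:
    """Relative frequency of every k-mer over A/C/G/T in one contig.

    Hypothetical helper: k = 4 is an assumed default, not a value
    taken from the abstract.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Only k-mers made of unambiguous bases are kept; windows that
    # contain other characters (for example N) are ignored.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


def quality_tier(purity: float, completeness: float) -> str:
    """Tier a bin using the thresholds quoted in the abstract.

    Both arguments are percentages. The tier labels follow the
    abstract; "other" is an assumed label for everything else.
    """
    if purity > 95 and completeness > 90:
        return "near-complete"
    if purity > 90 and completeness > 70:
        return "high-quality"
    return "other"
```

A profile like this, combined with read coverage, is the kind of genomic signature that a dimension reduction step would take as input. Note that `quality_tier` checks the stricter tier first, so a bin meeting both sets of thresholds is reported as near-complete.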
Identifiers
pubmed: 36239393
pii: 6760137
doi: 10.1093/bib/bbac431
pmc: PMC9677464
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Grants
Agency: National Research Fund
ID: PRIDE/11823097
Agency: European Research Council
ID: ERC-CoG 863664
Country: International
Copyright information
© The Author(s) 2022. Published by Oxford University Press.