metaFlye: scalable long-read metagenome assembly using repeat graphs.


Journal

Nature methods
ISSN: 1548-7105
Abbreviated title: Nat Methods
Country: United States
NLM ID: 101215604

Publication information

Publication date:
11 2020
History:
received: 16 07 2020
accepted: 07 09 2020
revised: 22 08 2020
pubmed: 7 10 2020
medline: 26 1 2021
entrez: 6 10 2020
Status: ppublish

Abstract

Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes as compared to fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present metaFlye, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. First, we benchmarked metaFlye using simulated and mock bacterial communities and show that it consistently produces assemblies with better completeness and contiguity than state-of-the-art long-read assemblers. Second, we performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly complete bacterial genomes within single contigs. Finally, we show that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.

Identifiers

pubmed: 33020656
doi: 10.1038/s41592-020-00971-x
pii: 10.1038/s41592-020-00971-x
pmc: PMC10699202
mid: NIHMS1627039

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

1103-1110

Grants

Agency: NIGMS NIH HHS
ID: P41 GM103484
Country: United States
Agency: NHGRI NIH HHS
ID: R25 HG011022
Country: United States

Authors

Mikhail Kolmogorov (M)

Department of Computer Science and Engineering, University of California, San Diego, CA, USA.

Derek M Bickhart (DM)

Cell Wall Biology and Utilization Laboratory, Dairy Forage Research Center, USDA, Madison, WI, USA.

Bahar Behsaz (B)

Graduate Program in Bioinformatics and System Biology, University of California, San Diego, CA, USA.

Alexey Gurevich (A)

Center for Algorithmic Biotechnology, St. Petersburg State University, St. Petersburg, Russia.

Mikhail Rayko (M)

Center for Algorithmic Biotechnology, St. Petersburg State University, St. Petersburg, Russia.

Sung Bong Shin (SB)

USDA-ARS US Meat Animal Research Center, Clay Center, NE, USA.

Kristen Kuhn (K)

USDA-ARS US Meat Animal Research Center, Clay Center, NE, USA.

Jeffrey Yuan (J)

Graduate Program in Bioinformatics and System Biology, University of California, San Diego, CA, USA.

Evgeny Polevikov (E)

Center for Algorithmic Biotechnology, St. Petersburg State University, St. Petersburg, Russia.
Bioinformatics Institute, St. Petersburg, Russia.

Timothy P L Smith (TPL)

USDA-ARS US Meat Animal Research Center, Clay Center, NE, USA.

Pavel A Pevzner (PA)

Department of Computer Science and Engineering, University of California, San Diego, CA, USA. ppevzner@ucsd.edu.
Center for Microbiome Innovation, University of California, San Diego, CA, USA. ppevzner@ucsd.edu.
