Inclusion of Oxford Nanopore long reads improves all microbial and viral metagenome-assembled genomes from a complex aquifer system.
Journal
Environmental microbiology
ISSN: 1462-2920
Abbreviated title: Environ Microbiol
Country: England
NLM ID: 100883692
Publication information
Publication date:
September 2020
History:
received: 2019-12-19
revised: 2020-07-31
accepted: 2020-08-02
pubmed: 2020-08-08
medline: 2021-04-07
entrez: 2020-08-08
Status:
ppublish
Abstract
Assembling microbial and viral genomes from metagenomes is a powerful and appealing method to understand structure-function relationships in complex environments. To compare the recovery of genomes from microorganisms and their viruses from groundwater, we generated shotgun metagenomes with Illumina sequencing accompanied by long reads derived from the Oxford Nanopore Technologies (ONT) sequencing platform. Assembly and metagenome-assembled genome (MAG) metrics for both microbes and viruses were determined from an Illumina-only assembly, ONT-only assembly, and a hybrid assembly approach. The hybrid approach recovered 2× more mid to high-quality MAGs compared to the Illumina-only approach and 4× more than the ONT-only approach. A similar number of viral genomes were reconstructed using the hybrid and ONT methods, and both recovered nearly fourfold more viral genomes than the Illumina-only approach. While yielding fewer MAGs, the ONT-only approach generated MAGs with a high probability of containing rRNA genes, 3× higher than either of the other methods. Of the shared MAGs recovered from each method, the ONT-only approach generated the longest and least fragmented MAGs, while the hybrid approach yielded the most complete. This work provides quantitative data to inform a cost-benefit analysis of the decision to supplement shotgun metagenomic projects with long reads towards the goal of recovering genomes from environmentally abundant groups.
Identifiers
pubmed: 32761733
doi: 10.1111/1462-2920.15186
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
4000-4013
Grants
Agency: Deutsche Forschungsgemeinschaft
ID: CRC 1076 AquaDiva
Country: International
Agency: Joachim Herz Stiftung
Country: International
Copyright information
© 2020 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.