Coassembly and binning of a twenty-year metagenomic time-series from Lake Mendota.


Journal

Scientific Data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192

Publication information

Publication date:
04 Sep 2024
History:
received: 28 Dec 2023
accepted: 27 Aug 2024
medline: 05 Sep 2024
pubmed: 05 Sep 2024
entrez: 04 Sep 2024
Status: epublish

Abstract

The North Temperate Lakes Long-Term Ecological Research (NTL-LTER) program has been extensively used to improve understanding of how aquatic ecosystems respond to environmental stressors, climate fluctuations, and human activities. Here, we report on the metagenomes of samples collected between 2000 and 2019 from Lake Mendota, a freshwater eutrophic lake within the NTL-LTER site. We utilized the distributed metagenome assembler MetaHipMer to coassemble over 10 terabases (Tbp) of data from 471 individual Illumina-sequenced metagenomes. A total of 95,523,664 contigs were assembled and binned to generate 1,894 non-redundant metagenome-assembled genomes (MAGs) with ≥50% completeness and ≤10% contamination. Phylogenomic analysis revealed that the MAGs were almost exclusively bacterial, dominated by Pseudomonadota (Proteobacteria, N = 623) and Bacteroidota (N = 321). Nine eukaryotic MAGs were identified by EukCC, six of which were assigned to the phylum Chlorophyta. Additionally, 6,350 high-quality viral sequences were identified by geNomad, with the majority classified in the phylum Uroviricota. This expansive coassembled metagenomic dataset provides an unprecedented foundation to advance understanding of microbial communities in freshwater ecosystems and explore temporal ecosystem dynamics.
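
The quality cutoff stated in the abstract (≥50% completeness, ≤10% contamination) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Bin` record, the function names, and the example values below are hypothetical and serve only to show how such a threshold filter behaves, including at the boundary values.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract; everything else here is illustrative.
MIN_COMPLETENESS = 50.0   # percent, inclusive lower bound
MAX_CONTAMINATION = 10.0  # percent, inclusive upper bound


@dataclass(frozen=True)
class Bin:
    """Hypothetical record for one genome bin and its quality estimates."""
    name: str
    completeness: float   # estimated completeness, in percent
    contamination: float  # estimated contamination, in percent


def passes_quality(b: Bin) -> bool:
    """Return True if the bin meets both thresholds stated in the abstract."""
    return (b.completeness >= MIN_COMPLETENESS
            and b.contamination <= MAX_CONTAMINATION)


def filter_bins(bins):
    """Keep only bins that satisfy the completeness/contamination cutoff."""
    return [b for b in bins if passes_quality(b)]


# Made-up example values, not data from the study.
example_bins = [
    Bin("bin_001", 92.3, 1.8),   # passes comfortably
    Bin("bin_002", 50.0, 10.0),  # passes: both bounds are inclusive
    Bin("bin_003", 49.9, 0.5),   # fails on completeness
    Bin("bin_004", 75.0, 12.4),  # fails on contamination
]

kept = filter_bins(example_bins)
print([b.name for b in kept])
```

In practice the completeness and contamination estimates would come from a quality-assessment tool's output table rather than hand-entered values.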

Identifiers

pubmed: 39231974
doi: 10.1038/s41597-024-03826-8
pii: 10.1038/s41597-024-03826-8

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

966

Grants

Agency: NSF | BIO | Division of Environmental Biology (DEB)
ID: DEB-2025982
Agency: NSF | BIO | Division of Environmental Biology (DEB)
ID: DEB-1344254
Agency: NSF | BIO | Division of Molecular and Cellular Biosciences (MCB)
ID: MCB-0702395
Agency: United States Department of Agriculture | Agricultural Research Service (USDA Agricultural Research Service)
ID: WIS01516, WIS01789, WIS03004
Agency: NSF | BIO | Division of Biological Infrastructure (DBI)
ID: DBI-2011002
Agency: U.S. Department of Energy (DOE)
ID: 89233218CNA000001
Agency: U.S. Department of Energy (DOE)
ID: DE-AC05-76RL01830
Agency: U.S. Department of Energy (DOE)
ID: DE-AC05-00OR22725
Agency: U.S. Department of Energy (DOE)
ID: DE-AC02-05CH11231
Agency: U.S. Department of Energy (DOE)
ID: 17-SC-20-SC

Copyright information

© 2024. The Author(s).


Authors

Tiffany Oliver (T)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA. toliver4@spelman.edu.
Department of Biology, Spelman College, Atlanta, GA, 30314, USA. toliver4@spelman.edu.

Neha Varghese (N)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Simon Roux (S)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Frederik Schulz (F)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Marcel Huntemann (M)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Alicia Clum (A)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Brian Foster (B)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Bryce Foster (B)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Robert Riley (R)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Kurt LaButti (K)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Robert Egan (R)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Patrick Hajek (P)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Supratim Mukherjee (S)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Galina Ovchinnikova (G)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

T B K Reddy (TBK)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Sara Calhoun (S)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Richard D Hayes (RD)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Robin R Rohwer (RR)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
Department of Integrative Biology, The University of Texas at Austin, Austin, TX, 78712, USA.

Zhichao Zhou (Z)

Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, 53706, USA.

Chris Daum (C)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Alex Copeland (A)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

I-Min A Chen (IA)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Natalia N Ivanova (NN)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Nikos C Kyrpides (NC)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Nigel J Mouncey (NJ)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Tijana Glavina Del Rio (TG)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Igor V Grigoriev (IV)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, 94720, USA.

Steven Hofmeyr (S)

Applied Math and Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Leonid Oliker (L)

Applied Math and Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Katherine Yelick (K)

Applied Math and Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
Electrical Engineering and Computer Sciences Department, University of California Berkeley, Berkeley, CA, 94720, USA.

Karthik Anantharaman (K)

Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, 53706, USA.

Katherine D McMahon (KD)

Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, 53706, USA.
Department of Civil and Environmental Engineering, University of Wisconsin-Madison, Madison, WI, 53706, USA.

Tanja Woyke (T)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
Life and Environmental Sciences, University of California Merced, Merced, CA, 95343, USA.

Emiley A Eloe-Fadrosh (EA)

DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA. eaeloefadrosh@lbl.gov.
Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA. eaeloefadrosh@lbl.gov.
