Ecosystem-wide metagenomic binning enables prediction of ecological niches from genomes.


Journal

Communications biology
ISSN: 2399-3642
Abbreviated title: Commun Biol
Country: England
NLM ID: 101719179

Publication information

Publication date:
2020-03-13
History:
received: 2019-07-03
accepted: 2020-02-25
entrez: 2020-03-15
pubmed: 2020-03-15
medline: 2021-02-10
Status: epublish

Abstract

The genome encodes the metabolic and functional capabilities of an organism and should be a major determinant of its ecological niche. Yet it is unknown whether the niche can be predicted directly from the genome. Here, we conduct metagenomic binning on 123 water samples spanning major environmental gradients of the Baltic Sea. The resulting 1961 metagenome-assembled genomes represent 352 species-level clusters that correspond to one third of the metagenome sequences of the prokaryotic size-fraction. Using machine learning, the placement of a genome cluster along various niche gradients (salinity level, depth, size-fraction) could be predicted based solely on its functional genes. The same approach predicted the genomes' placement in a virtual niche-space that captures the highest variation in distribution patterns. The predictions generally outperformed those inferred from phylogenetic information. Our study demonstrates a strong link between genome and ecological niche and provides a conceptual framework for predictive ecology based on genomic data.
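
As an illustration of the idea summarized in the abstract (predicting where a genome sits on a niche gradient from its functional gene content), the following is a minimal sketch only. It is not the authors' pipeline: the k-nearest-neighbour regressor with Jaccard similarity is a stand-in chosen for brevity, the gene-family names (`marine_*`, `fresh_*`, `noise_*`) are hypothetical, and the "salinity" values are synthetic numbers on an arbitrary 0 to 35 scale generated inside the script.

```python
import random


def jaccard(a, b):
    """Jaccard similarity between two sets of gene-family IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(train, query_genes, k=3):
    """Similarity-weighted mean niche value of the k most similar genomes.

    train is a list of (gene_set, niche_value) pairs.
    """
    scored = sorted(
        ((jaccard(genes, query_genes), value) for genes, value in train),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        # No shared genes at all: fall back to an unweighted mean.
        return sum(value for _, value in scored) / len(scored)
    return sum(sim * value for sim, value in scored) / total


def leave_one_out(data, k=3):
    """Predict each genome's niche value from all the other genomes."""
    return [
        knn_predict(data[:i] + data[i + 1:], genes, k)
        for i, (genes, _) in enumerate(data)
    ]


def make_toy_data(n=40, seed=0):
    """Synthetic genomes (ASSUMPTION: toy data, not from the study).

    Hypothetical 'marine_*' families become more likely at high values,
    'fresh_*' families at low values, and 'noise_*' families are random.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        salinity = rng.uniform(0.0, 35.0)
        frac = salinity / 35.0
        genes = set()
        for g in range(10):
            if rng.random() < frac:
                genes.add(f"marine_{g}")
            if rng.random() < 1.0 - frac:
                genes.add(f"fresh_{g}")
        for g in range(20):
            if rng.random() < 0.5:
                genes.add(f"noise_{g}")
        data.append((genes, salinity))
    return data


if __name__ == "__main__":
    toy = make_toy_data()
    predictions = leave_one_out(toy, k=3)
    for (_, observed), predicted in list(zip(toy, predictions))[:5]:
        print(f"observed={observed:5.1f}  predicted={predicted:5.1f}")
```

Leave-one-out evaluation is used here so that each toy genome is predicted only from the others; any real analysis would need the study's own features, models, and validation scheme.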

Identifiers

pubmed: 32170201
doi: 10.1038/s42003-020-0856-x
pii: 10.1038/s42003-020-0856-x
pmc: PMC7070063

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

119

Authors

Johannes Alneberg (J)

Department of Gene Technology, Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden.

Christin Bennke (C)

Leibniz Institute for Baltic Sea Research, Warnemünde, Germany.

Sara Beier (S)

Leibniz Institute for Baltic Sea Research, Warnemünde, Germany.
CNRS, Laboratoire d'Océanographie Microbienne, LOMIC, Sorbonne Université, Banyuls/mer, France.

Carina Bunse (C)

Centre for Ecology and Evolution in Microbial Model Systems, Linnaeus University, Kalmar, Sweden.
Helmholtz Institute for Functional Marine Biodiversity at the University of Oldenburg (HIFMB), Oldenburg, Germany.
Alfred-Wegener-Institut Helmholtz-Zentrum für Polar- und Meeresforschung, Bremerhaven, Germany.

Christopher Quince (C)

Warwick Medical School, University of Warwick, Coventry, UK.

Karolina Ininbergs (K)

Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden.
Department of Laboratory Medicine, Karolinska Institute, Stockholm, Sweden.

Lasse Riemann (L)

Department of Biology, Marine Biological Section, University of Copenhagen, Helsingør, Denmark.

Martin Ekman (M)

Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden.

Klaus Jürgens (K)

Leibniz Institute for Baltic Sea Research, Warnemünde, Germany.

Matthias Labrenz (M)

Leibniz Institute for Baltic Sea Research, Warnemünde, Germany.

Jarone Pinhassi (J)

Centre for Ecology and Evolution in Microbial Model Systems, Linnaeus University, Kalmar, Sweden.

Anders F Andersson (AF)

Department of Gene Technology, Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden.
Email: anders.andersson@scilifelab.se
