Ecosystem-wide metagenomic binning enables prediction of ecological niches from genomes.
Journal
Communications biology
ISSN: 2399-3642
Abbreviated title: Commun Biol
Country: England
NLM ID: 101719179
Publication information
Publication date: 2020-03-13
History:
received: 2019-07-03
accepted: 2020-02-25
entrez: 2020-03-15
pubmed: 2020-03-15
medline: 2021-02-10
Status: epublish
Abstract
The genome encodes the metabolic and functional capabilities of an organism and should be a major determinant of its ecological niche. Yet it is unknown whether the niche can be predicted directly from the genome. Here, we conduct metagenomic binning on 123 water samples spanning major environmental gradients of the Baltic Sea. The resulting 1961 metagenome-assembled genomes represent 352 species-level clusters that correspond to one third of the metagenome sequences of the prokaryotic size-fraction. Using machine learning, the placement of a genome cluster along various niche gradients (salinity level, depth, size-fraction) could be predicted based solely on its functional genes. The same approach predicted the genomes' placement in a virtual niche-space that captures the highest variation in distribution patterns. The predictions generally outperformed those inferred from phylogenetic information. Our study demonstrates a strong link between genome and ecological niche and provides a conceptual framework for predictive ecology based on genomic data.
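The idea of predicting a genome cluster's position along a niche gradient from its functional genes can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the cluster names, gene-family labels, and salinity values are invented, and a simple k-nearest-neighbour predictor on Jaccard similarity stands in for whichever machine-learning method the study actually used, which the abstract does not name. Leave-one-out evaluation is likewise only one plausible way to check such predictions.

```python
from math import sqrt

# Hypothetical toy data: each genome cluster maps to a set of functional
# gene families (invented labels) and an invented salinity preference.
CLUSTERS = {
    "bin_A": ({"g1", "g2", "g3"}, 3.0),
    "bin_B": ({"g1", "g2", "g4"}, 4.0),
    "bin_C": ({"g1", "g3", "g4"}, 5.0),
    "bin_D": ({"g5", "g6", "g7"}, 30.0),
    "bin_E": ({"g5", "g6", "g8"}, 32.0),
    "bin_F": ({"g5", "g7", "g8"}, 34.0),
}


def jaccard(a, b):
    """Shared gene families divided by all gene families in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict(genes, training, k=2):
    """Average the niche value of the k training clusters whose gene
    content is most similar to `genes`."""
    ranked = sorted(training, key=lambda item: jaccard(genes, item[0]),
                    reverse=True)
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


def leave_one_out(clusters, k=2):
    """Predict each cluster from all others.
    Returns {name: (observed, predicted)}."""
    results = {}
    for name, (genes, observed) in clusters.items():
        training = [v for other, v in clusters.items() if other != name]
        results[name] = (observed, predict(genes, training, k))
    return results


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    results = leave_one_out(CLUSTERS)
    observed = [obs for obs, _ in results.values()]
    predicted = [pred for _, pred in results.values()]
    for name, (obs, pred) in results.items():
        print(f"{name}: observed={obs:.1f} predicted={pred:.1f}")
    print(f"Pearson r = {pearson(observed, predicted):.3f}")
```

In this toy setting the low-salinity and high-salinity clusters share no gene families, so the nearest neighbours always come from the right group. Real data would be far noisier and would call for a proper learner and cross-validation scheme.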
Identifiers
pubmed: 32170201
doi: 10.1038/s42003-020-0856-x
pii: 10.1038/s42003-020-0856-x
pmc: PMC7070063
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
119