Exploring environmental intra-species diversity through non-redundant pangenome assemblies.
accessory genome
bioinformatics
core genome
metagenome-assembled genomes (MAGs)
metagenomics
microbial ecology
Journal
Molecular ecology resources
ISSN: 1755-0998
Abbreviated title: Mol Ecol Resour
Country: England
NLM ID: 101465604
Publication information
Publication date:
Oct 2023
History:
revised: 2023-05-24
received: 2022-12-07
accepted: 2023-06-15
medline: 2023-09-05
pubmed: 2023-06-29
entrez: 2023-06-29
Status:
ppublish
Abstract
At the genome level, microorganisms are highly adaptable in terms of both allele and gene composition. Such heritable traits emerge in response to different environmental niches and can have a profound influence on microbial community dynamics. As a consequence, any individual genome or population will contain merely a fraction of the total genetic diversity of any operationally defined "species", whose ecological potential can thus only be fully understood by studying all of their genomes and the genes therein. This concept, known as the pangenome, is valuable for studying microbial ecology and evolution, as it partitions genomes into core regions (present in all the genomes from a species, and responsible for housekeeping and species-level niche adaptation, among others) and accessory regions (present only in some, and responsible for intra-species differentiation). Here we present SuperPang, an algorithm that produces pangenome assemblies from a set of input genomes of varying quality, including metagenome-assembled genomes (MAGs). SuperPang runs in linear time, and its results are complete and non-redundant, preserve gene ordering, and contain both coding and non-coding regions. Our approach provides a modular view of the pangenome, identifying operons and genomic islands and allowing their prevalence to be tracked across different populations. We illustrate this by analysing intra-species diversity in Polynucleobacter, a bacterial genus that is ubiquitous in freshwater ecosystems and characterized by streamlined genomes and ecological versatility. We show how SuperPang facilitates the simultaneous analysis of allelic and gene content variation under different environmental pressures, allowing us to study the drivers of microbial diversification at unprecedented resolution.
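The core/accessory partition described in the abstract can be illustrated with a minimal Python sketch. This is not SuperPang and does not reflect its implementation: it only applies the strict definition given above (core = present in every genome, accessory = present in only some) to gene-family presence data. The function name `partition_pangenome`, the input format, and the genome and gene names are all hypothetical.

```python
from collections import Counter


def partition_pangenome(genomes):
    """Split gene families into core and accessory sets.

    genomes: dict mapping a genome name to the set of gene-family
    identifiers found in it (a hypothetical input format, chosen
    for illustration only).
    """
    if not genomes:
        return set(), set()
    # Count in how many genomes each gene family occurs.
    counts = Counter(gene for genes in genomes.values() for gene in set(genes))
    n_genomes = len(genomes)
    # Strict definition: core families occur in every genome.
    core = {gene for gene, count in counts.items() if count == n_genomes}
    # Everything else occurs in only some genomes.
    accessory = set(counts) - core
    return core, accessory


# Toy example with made-up genome contents.
genomes = {
    "genome_A": {"dnaA", "gyrB", "recA", "nifH"},
    "genome_B": {"dnaA", "gyrB", "recA"},
    "genome_C": {"dnaA", "gyrB", "recA", "pufM"},
}
core, accessory = partition_pangenome(genomes)
print(sorted(core))       # ['dnaA', 'gyrB', 'recA']
print(sorted(accessory))  # ['nifH', 'pufM']
```

With incomplete inputs such as MAGs, a strict "present in all genomes" threshold would probably misclassify some core families as accessory, so a real analysis would likely need a more tolerant criterion than this sketch uses.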
Identifiers
pubmed: 37382302
doi: 10.1111/1755-0998.13826
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1724-1736
Grants
Agency: H2020 Marie Skłodowska-Curie Actions
ID: 892961
Agency: Svenska Forskningsrådet Formas
ID: 2019-02336
Agency: Vetenskapsrådet
ID: 2017-04422
Agency: Vetenskapsrådet
ID: 2018-05973
Copyright information
© 2023 The Authors. Molecular Ecology Resources published by John Wiley & Sons Ltd.