Prediction of strain level phage-host interactions across the Escherichia genus using only genomic information.
Journal
Nature microbiology
ISSN: 2058-5276
Abbreviated title: Nat Microbiol
Country: England
NLM ID: 101674869
Publication information
Publication date:
Nov 2024
History:
received: 2023-12-06
accepted: 2024-09-13
medline: 2024-11-01
pubmed: 2024-11-01
entrez: 2024-11-01
Status:
ppublish
Abstract
Predicting bacteriophage infection of specific bacterial strains promises advancements in phage therapy and microbial ecology. Whether the dynamics of well-established phage-host model systems generalize to the wide diversity of microbes is currently unknown. Here we show that we could accurately predict the outcomes of phage-bacteria interactions at the strain level in natural isolates from the genus Escherichia using only genomic data (area under the receiver operating characteristic curve (AUROC) of 86%). We experimentally established a dataset of interactions between 403 diverse Escherichia strains and 96 phages. Most interactions are explained by adsorption factors as opposed to antiphage systems, which play a marginal role. We trained predictive algorithms and pinpointed poorly predicted interactions to direct future research efforts. Finally, we established a pipeline to recommend tailored phage cocktails, demonstrating efficiency on 100 pathogenic E. coli isolates. This work provides quantitative insights into phage-host specificity and supports the use of predictive algorithms in phage therapy.
Identifiers
pubmed: 39482383
doi: 10.1038/s41564-024-01832-5
pii: 10.1038/s41564-024-01832-5
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
2847-2861
Grants
Agency: Institut National de la Santé et de la Recherche Médicale (National Institute of Health and Medical Research)
ID: R21042KS/RSE22002KSA
Agency: EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)
ID: PECAN 101040529
Agency: Agence Nationale de la Recherche (French National Research Agency)
ID: ANR-19-AMRB-0002
Agency: Agence Nationale de la Recherche (French National Research Agency)
ID: ANR-20-CE92-0048
Agency: Fondation pour la Recherche Médicale (Foundation for Medical Research in France)
ID: DEQ20161136698
Copyright information
© 2024. The Author(s), under exclusive licence to Springer Nature Limited.