Machine learning-aided design and screening of an emergent protein function in synthetic cells.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
05 Mar 2024
History:
received: 27 06 2023
accepted: 16 02 2024
medline: 6 3 2024
pubmed: 6 3 2024
entrez: 5 3 2024
Status: epublish

Abstract

Recently, utilization of Machine Learning (ML) has led to astonishing progress in computational protein design, bringing into reach the targeted engineering of proteins for industrial and biomedical applications. However, the design of proteins for emergent functions of core relevance to cells, such as the ability to spatiotemporally self-organize and thereby structure the cellular space, is still extremely challenging. While on the generative side conditional generative models and multi-state design are on the rise, for emergent functions there is a lack of tailored screening methods as typically needed in a protein design project, both computational and experimental. Here we describe a proof-of-principle of how such screening, in silico and in vitro, can be achieved for ML-generated variants of a protein that forms intracellular spatiotemporal patterns. For computational screening we use a structure-based divide-and-conquer approach to find the most promising candidates, while for the subsequent in vitro screening we use synthetic cell-mimics as established by Bottom-Up Synthetic Biology. We then show that the best screened candidate can indeed completely substitute the wildtype gene in Escherichia coli. These results raise great hopes for the next level of synthetic biology, where ML-designed synthetic proteins will be used to engineer cellular functions.

Identifiers

pubmed: 38443351
doi: 10.1038/s41467-024-46203-0
pii: 10.1038/s41467-024-46203-0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

2010

Copyright information

© 2024. The Author(s).

Authors

Shunshi Kohyama (S)

Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany.

Béla P Frohn (BP)

Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany.

Leon Babl (L)

Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany.

Petra Schwille (P)

Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany. schwille@biochem.mpg.de.

MeSH classifications