Inverse design of 3d molecular structures with conditional generative neural networks.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
21 02 2022
History:
received: 10 09 2021
accepted: 28 01 2022
entrez: 22 2 2022
pubmed: 23 2 2022
medline: 23 2 2022
Status: epublish

Abstract

The rational design of molecules with desired properties is a long-standing challenge in chemistry. Generative neural networks have emerged as a powerful approach to sample novel molecules from a learned distribution. Here, we propose a conditional generative neural network for 3d molecular structures with specified chemical and structural properties. This approach is agnostic to chemical bonding and enables targeted sampling of novel molecules from conditional distributions, even in domains where reference calculations are sparse. We demonstrate the utility of our method for inverse design by generating molecules with specified motifs or composition, discovering particularly stable molecules, and jointly targeting multiple electronic properties beyond the training regime.

Identifiers

pubmed: 35190542
doi: 10.1038/s41467-022-28526-y
pii: 10.1038/s41467-022-28526-y
pmc: PMC8861047

Databases

figshare
10.6084/m9.figshare.978904

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

973

Copyright information

© 2022. The Author(s).

Authors

Niklas W A Gebauer (NWA)

Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany. n.gebauer@tu-berlin.de.
Berlin Institute for the Foundations of Learning and Data, 10587, Berlin, Germany. n.gebauer@tu-berlin.de.
BASLEARN-TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587, Berlin, Germany. n.gebauer@tu-berlin.de.

Michael Gastegger (M)

Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany.
BASLEARN-TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587, Berlin, Germany.

Stefaan S P Hessmann (SSP)

Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany.
Berlin Institute for the Foundations of Learning and Data, 10587, Berlin, Germany.

Klaus-Robert Müller (KR)

Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany.
Berlin Institute for the Foundations of Learning and Data, 10587, Berlin, Germany.
Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea.
Max-Planck-Institut für Informatik, 66123, Saarbrücken, Germany.

Kristof T Schütt (KT)

Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany. kristof.schuett@tu-berlin.de.
Berlin Institute for the Foundations of Learning and Data, 10587, Berlin, Germany. kristof.schuett@tu-berlin.de.

MeSH classifications