Extending machine learning beyond interatomic potentials for predicting molecular properties.


Journal

Nature reviews. Chemistry
ISSN: 2397-3358
Abbreviated title: Nat Rev Chem
Country: England
NLM ID: 101703631

Publication information

Publication date:
Sep 2022
History:
accepted: 2022-07-15
medline: 2023-04-29
pubmed: 2023-04-29
entrez: 2023-04-28
Status: ppublish

Abstract

Machine learning (ML) is becoming a method of choice for modelling complex chemical processes and materials. ML provides a surrogate model trained on a reference dataset that can be used to establish a relationship between a molecular structure and its chemical properties. This Review highlights developments in the use of ML to evaluate chemical properties such as partial atomic charges, dipole moments, spin and electron densities, and chemical bonding, as well as to obtain a reduced quantum-mechanical description. We overview several modern neural network architectures, their predictive capabilities, generality and transferability, and illustrate their applicability to various chemical properties. We emphasize that learned molecular representations resemble quantum-mechanical analogues, demonstrating the ability of the models to capture the underlying physics. We also discuss how ML models can describe non-local quantum effects. Finally, we conclude by compiling a list of available ML toolboxes, summarizing the unresolved challenges and presenting an outlook for future development. The observed trends demonstrate that this field is evolving towards physics-based models augmented by ML, which is accompanied by the development of new methods and the rapid growth of user-friendly ML frameworks for chemistry.
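The surrogate-model idea in the abstract can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not a method from the Review: kernel ridge regression stands in for the neural network architectures the Review surveys, a single scalar (a bond length) stands in for a molecular descriptor, and `reference_property` is a hypothetical toy function standing in for an expensive quantum-mechanical reference calculation. All function and variable names are illustrative.

```python
import math


def gaussian_kernel(a, b, width):
    """Gaussian similarity between two scalar descriptors."""
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_surrogate(descriptors, targets, width=0.25, ridge=1e-8):
    """Fit kernel ridge regression and return a prediction function."""
    n = len(descriptors)
    kernel = [
        [gaussian_kernel(descriptors[i], descriptors[j], width)
         + (ridge if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    weights = solve_linear(kernel, targets)

    def predict(x):
        # Prediction is a weighted sum of similarities to the training points.
        return sum(w * gaussian_kernel(x, d, width)
                   for w, d in zip(weights, descriptors))

    return predict


def reference_property(r):
    """Hypothetical stand-in for an expensive reference calculation."""
    return r * math.exp(-r)


# Reference dataset: 11 toy "structures" labelled by the reference method.
train_r = [0.5 + 0.25 * i for i in range(11)]
train_y = [reference_property(r) for r in train_r]

model = fit_surrogate(train_r, train_y)

# Query a structure that is not in the training set.
query = 1.6
print(f"surrogate: {model(query):.4f}  reference: {reference_property(query):.4f}")
```

Once fitted, `model` only evaluates kernels against the stored training points, so each query avoids a new reference calculation. That trade is the one the abstract describes. The kernel width and ridge strength here are arbitrary choices for the toy data.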

Identifiers

pubmed: 37117713
doi: 10.1038/s41570-022-00416-3
pii: 10.1038/s41570-022-00416-3

Publication types

Journal Article; Review

Languages

eng

Citation subsets

IM

Pagination

653-672

Copyright information

© 2022. Springer Nature Limited.

Références

Purvis, G. D. & Bartlett, R. J. A full coupled-cluster singles and doubles model: the inclusion of disconnected triples. J. Chem. Phys. 76, 1910–1918 (1982).
doi: 10.1063/1.443164
Burke, K. Perspective on density functional theory. J. Chem. Phys. 136, 150901 (2012).
pubmed: 22519306 doi: 10.1063/1.4704546
Mardirossian, N. & Head-Gordon, M. Thirty years of density functional theory in computational chemistry: an overview and extensive assessment of 200 density functionals. Mol. Phys. 115, 2315–2372 (2017).
doi: 10.1080/00268976.2017.1333644
Thiel, W. Semiempirical quantum–chemical methods. WIREs Comput. Mol. Sci. 4, 145–157 (2014).
doi: 10.1002/wcms.1161
Ratcliff, L. E. et al. Challenges in large scale quantum mechanical calculations. WIREs Comput. Mol. Sci. 7, e1290 (2017).
doi: 10.1002/wcms.1290
von Lilienfeld, O. A., Müller, K.-R. & Tkatchenko, A. Exploring chemical compound space with quantum-based machine learning. Nat. Rev. Chem. 4, 347–358 (2020).
doi: 10.1038/s41570-020-0189-9
Keith, J. A. et al. Combining machine learning and computational chemistry for predictive insights into chemical systems. Chem. Rev. 121, 9816–9872 (2021).
pubmed: 34232033 pmcid: 8391798 doi: 10.1021/acs.chemrev.1c00107
Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O. & Walsh, A. Machine learning for molecular and materials science. Nature 559, 547–555 (2018).
pubmed: 30046072 doi: 10.1038/s41586-018-0337-2
Dral, P. O. Quantum chemistry in the age of machine learning. J. Phys. Chem. Lett. 11, 2336–2347 (2020).
pubmed: 32125858 doi: 10.1021/acs.jpclett.9b03664
Musil, F. et al. Physics-inspired structural representations for molecules and materials. Chem. Rev. 121, 9759–9815 (2021).
pubmed: 34310133 doi: 10.1021/acs.chemrev.1c00021
Vamathevan, J. et al. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 18, 463–477 (2019).
pubmed: 30976107 pmcid: 6552674 doi: 10.1038/s41573-019-0024-5
David, L., Thakkar, A., Mercado, R. & Engkvist, O. Molecular representations in AI-driven drug discovery: a review and practical guide. J. Cheminform. 12, 56 (2020).
pubmed: 33431035 pmcid: 7495975 doi: 10.1186/s13321-020-00460-5
Popova, M., Isayev, O. & Tropsha, A. Deep reinforcement learning for de novo drug design. Sci. Adv. 4, eaap7885 (2018).
pubmed: 30050984 pmcid: 6059760 doi: 10.1126/sciadv.aap7885
Pollice, R. et al. Data-driven strategies for accelerated materials design. Acc. Chem. Res. 54, 849–860 (2021).
pubmed: 33528245 pmcid: 7893702 doi: 10.1021/acs.accounts.0c00785
Guo, H., Wang, Q., Stuke, A., Urban, A. & Artrith, N. Accelerated atomistic modeling of solid-state battery materials with machine learning. Front. Energy Res. 9, 265 (2021).
doi: 10.3389/fenrg.2021.695902
Kulichenko, M. et al. The rise of neural networks for materials and chemical dynamics. J. Phys. Chem. Lett. 12, 6227–6243 (2021).
pubmed: 34196559 doi: 10.1021/acs.jpclett.1c01357
Behler, J. Four generations of high-dimensional neural network potentials. Chem. Rev. 121, 10037–10072 (2021).
pubmed: 33779150 doi: 10.1021/acs.chemrev.0c00868
Gokcan, H. & Isayev, O. Learning molecular potentials with neural networks. WIREs Comput. Mol. Sci. 12, e1564.
Dral, P. O. & Barbatti, M. Molecular excited states through a machine learning lens. Nat. Rev. Chem. 5, 388–405 (2021).
doi: 10.1038/s41570-021-00278-1
Westermayr, J. & Marquetand, P. Machine learning for electronically excited states of molecules. Chem. Rev. 121, 9873–9926 (2021).
pubmed: 33211478 doi: 10.1021/acs.chemrev.0c00749
Jorner, K., Tomberg, A., Bauer, C., Sköld, C. & Norrby, P.-O. Organic reactivity from mechanism to machine learning. Nat. Rev. Chem. 5, 240–255 (2021).
doi: 10.1038/s41570-021-00260-x
Gallegos, L. C., Luchini, G., St. John, P. C., Kim, S. & Paton, R. S. Importance of engineered and learned molecular representations in predicting organic reactivity, selectivity, and chemical properties. Acc. Chem. Res. 54, 827–836 (2021).
pubmed: 33534534 doi: 10.1021/acs.accounts.0c00745
Toyao, T. et al. Machine learning for catalysis informatics: recent applications and prospects. ACS Catal. 10, 2260–2297 (2020).
doi: 10.1021/acscatal.9b04186
Yang, X., Wang, Y., Byrne, R., Schneider, G. & Yang, S. Concepts of artificial intelligence for computer-assisted drug discovery. Chem. Rev. 119, 10520–10594 (2019).
pubmed: 31294972 doi: 10.1021/acs.chemrev.8b00728
Granda, J. M., Donina, L., Dragone, V., Long, D.-L. & Cronin, L. Controlling an organic synthesis robot with machine learning to search for new reactivity. Nature 559, 377–381 (2018).
pubmed: 30022133 pmcid: 6223543 doi: 10.1038/s41586-018-0307-8
Gao, H. et al. Using machine learning to predict suitable conditions for organic reactions. ACS Cent. Sci. 4, 1465–1476 (2018).
pubmed: 30555898 pmcid: 6276053 doi: 10.1021/acscentsci.8b00357
Bartók, A. P. & Csányi, G. Gaussian approximation potentials: a brief tutorial introduction. Int. J. Quantum Chem. 115, 1051–1057 (2015).
doi: 10.1002/qua.24927
Thompson, A. P., Swiler, L. P., Trott, C. R., Foiles, S. M. & Tucker, G. J. Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials. J. Comput. Phys. 285, 316–330 (2015).
doi: 10.1016/j.jcp.2014.12.018
Novikov, I. S., Gubaev, K., Podryabinkin, E. V. & Shapeev, A. V. The MLIP package: moment tensor potentials with MPI and active learning. Mach. Learn. Sci. Technol. 2, 025002 (2021).
doi: 10.1088/2632-2153/abc9fe
Chmiela, S., Sauceda, H. E., Poltavsky, I., Müller, K.-R. & Tkatchenko, A. sGDML: Constructing accurate and data efficient molecular force fields using machine learning. Comput. Phys. Commun. 240, 38–45 (2019).
doi: 10.1016/j.cpc.2019.02.007
Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 3887 (2018).
pubmed: 30250077 pmcid: 6155327 doi: 10.1038/s41467-018-06169-2
Gubaev, K., Podryabinkin, E. V. & Shapeev, A. V. Machine learning of molecular properties: locality and active learning. J. Chem. Phys. 148, 241727 (2018).
pubmed: 29960350 doi: 10.1063/1.5005095
Behler, J. Perspective: machine learning potentials for atomistic simulations. J. Chem. Phys. 145, 170901 (2016).
pubmed: 27825224 doi: 10.1063/1.4966192
Behler, J. & Csányi, G. Machine learning potentials for extended systems: a perspective. Eur. Phys. J. B 94, 142 (2021).
doi: 10.1140/epjb/s10051-021-00156-1
Daw, M. S., Foiles, S. M. & Baskes, M. I. The embedded-atom method: a review of theory and applications. Mater. Sci. Rep. 9, 251–310 (1993).
doi: 10.1016/0920-2307(93)90001-U
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
pubmed: 17501293 doi: 10.1103/PhysRevLett.98.146401
Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 134, 074106 (2011).
pubmed: 21341827 doi: 10.1063/1.3553717
Behler, J. Constructing high-dimensional neural network potentials: a tutorial review. Int. J. Quantum Chem. 115, 1032–1050 (2015).
doi: 10.1002/qua.24890
Ko, T. W., Finkler, J. A., Goedecker, S. & Behler, J. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Nat. Commun. 12, 398 (2021).
pubmed: 33452239 pmcid: 7811002 doi: 10.1038/s41467-020-20427-2
Ruddigkeit, L., van Deursen, R., Blum, L. C. & Reymond, J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52, 2864–2875 (2012).
pubmed: 23088335 doi: 10.1021/ci300415d
Devereux, C. et al. Extending the applicability of the ANI deep learning molecular potential to sulfur and halogens. J. Chem. Theory Comput. 16, 4192–4202 (2020).
pubmed: 32543858 doi: 10.1021/acs.jctc.0c00121
Zubatyuk, R., Smith, J. S., Leszczynski, J. & Isayev, O. Accurate and transferable multitask prediction of chemical properties with an atoms-in-molecules neural network. Sci. Adv. 5, eaav6490 (2019).
pubmed: 31448325 pmcid: 6688864 doi: 10.1126/sciadv.aav6490
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet — a deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
pubmed: 29960322 doi: 10.1063/1.5019779
Schütt, K. T. et al. SchNetPack: a deep learning toolbox for atomistic systems. J. Chem. Theory Comput. 15, 448–455 (2019).
pubmed: 30481453 doi: 10.1021/acs.jctc.8b00908
Unke, O. T. & Meuwly, M. PhysNet: a neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15, 3678–3693 (2019).
pubmed: 31042390 doi: 10.1021/acs.jctc.9b00181
Gasteiger, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. Preprint at arXiv https://doi.org/10.48550/arXiv.2003.03123 (2020).
doi: 10.48550/arXiv.2003.03123
Gasteiger, J., Giri, S., Margraf, J. T. & Günnemann, S. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. Preprint at arXiv https://doi.org/10.48550/arXiv.2011.14115 (2020).
doi: 10.48550/arXiv.2011.14115
Mueller, T., Hernandez, A. & Wang, C. Machine learning for interatomic potential models. J. Chem. Phys. 152, 050902 (2020).
pubmed: 32035452 doi: 10.1063/1.5126336
Glick, Z. L., Koutsoukas, A., Cheney, D. L. & Sherrill, C. D. Cartesian message passing neural networks for directional properties: fast and transferable atomic multipoles. J. Chem. Phys. 154, 224103 (2021).
pubmed: 34241239 doi: 10.1063/5.0050444
Lubbers, N., Smith, J. S. & Barros, K. Hierarchical modeling of molecular energies using a deep neural network. J. Chem. Phys. 148, 241715 (2018).
pubmed: 29960311 doi: 10.1063/1.5011181
Nebgen, B. et al. Transferable dynamic molecular charge assignment using deep neural networks. J. Chem. Theory Comput. 14, 4687–4698 (2018).
pubmed: 30064217 doi: 10.1021/acs.jctc.8b00524
Sifain, A. E. et al. Discovering a transferable charge assignment model using machine learning. J. Phys. Chem. Lett. 9, 4495–4501 (2018).
pubmed: 30039707 doi: 10.1021/acs.jpclett.8b01939
Magedov, S., Koh, C., Malone, W., Lubbers, N. & Nebgen, B. Bond order predictions using deep neural networks. J. Appl. Phys. 129, 064701 (2021).
doi: 10.1063/5.0016011
Zubatiuk, T. et al. Machine learned Hückel theory: interfacing physics and deep neural networks. J. Chem. Phys. 154, 244108 (2021).
pubmed: 34241371 doi: 10.1063/5.0052857
Caruana, R. Multitask learning. Mach. Learn. 28, 41–75 (1997).
doi: 10.1023/A:1007379606734
Sifain, A. E. et al. Predicting phosphorescence energies and inferring wavefunction localization with machine learning. Chem. Sci. 12, 10207–10217 (2021).
pubmed: 34447529 pmcid: 8336587 doi: 10.1039/D1SC02136B
Tretiak, S. & Mukamel, S. Density matrix analysis and simulation of electronic excitations in conjugated and aggregated molecules. Chem. Rev. 102, 3171–3212 (2002).
pubmed: 12222985 doi: 10.1021/cr0101252
Bader, R. F. W. Atoms in Molecules: a Quantum Theory (Clarendon Press, 1994).
Zubatyuk, R., Smith, J. S., Nebgen, B. T., Tretiak, S. & Isayev, O. Teaching a neural network to attach and detach electrons from molecules. Nat. Commun. 12, 4870 (2021).
pubmed: 34381051 pmcid: 8357920 doi: 10.1038/s41467-021-24904-0
Smith, J. S., Nebgen, B., Lubbers, N., Isayev, O. & Roitberg, A. E. Less is more: sampling chemical space with active learning. J. Chem. Phys. 148, 241733 (2018).
pubmed: 29960353 doi: 10.1063/1.5023802
Miksch, A. M., Morawietz, T., Kästner, J., Urban, A. & Artrith, N. Strategies for the construction of machine-learning potentials for accurate and efficient atomic-scale simulations. Mach. Learn. Sci. Technol. 2, 031001 (2021).
doi: 10.1088/2632-2153/abfd96
Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1, a data set of 20 million calculated off-equilibrium conformations for organic molecules. Sci. Data 4, 170193 (2017).
pubmed: 29257127 pmcid: 5735918 doi: 10.1038/sdata.2017.193
Smith, J. S. et al. The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules. Sci. Data 7, 134 (2020).
pubmed: 32358545 pmcid: 7195467 doi: 10.1038/s41597-020-0473-z
Chambers, J. et al. UniChem: a unified chemical structure cross-referencing and identifier tracking system. J. Cheminform. 5, 3 (2013).
pubmed: 23317286 pmcid: 3616875 doi: 10.1186/1758-2946-5-3
Wu, Z. et al. MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 9, 513–530 (2018).
pubmed: 29629118 doi: 10.1039/C7SC02664A
Ramakrishnan, R., Dral, P. O., Rupp, M. & von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1, 140022 (2014).
pubmed: 25977779 pmcid: 4322582 doi: 10.1038/sdata.2014.22
Nakata, M. & Shimazaki, T. PubChemQC project: a large-scale first-principles electronic structure database for data-driven chemistry. J. Chem. Inf. Model. 57, 1300–1308 (2017).
pubmed: 28481528 doi: 10.1021/acs.jcim.7b00083
Curtarolo, S. et al. AFLOW: an automatic framework for high-throughput materials discovery. Comput. Mater. Sci. 58, 218–226 (2012).
doi: 10.1016/j.commatsci.2012.02.005
Pinheiro, G. A. et al. Machine learning prediction of nine molecular properties based on the SMILES representation of the QM9. J. Phys. Chem. A 124, 9854–9866 (2020).
pubmed: 33174750 doi: 10.1021/acs.jpca.0c05969
Wießner, M. et al. Complete determination of molecular orbitals by measurement of phase symmetry and electron density. Nat. Commun. 5, 4156 (2014).
pubmed: 24910256 doi: 10.1038/ncomms5156
Gao, W. et al. Real-space charge-density imaging with sub-ångström resolution by four-dimensional electron microscopy. Nature 575, 480–484 (2019).
pubmed: 31610544 doi: 10.1038/s41586-019-1649-6
Hirshfeld, F. L. Bonded-atom fragments for describing molecular charge densities. Theor. Chim. Acta 44, 129–138 (1977).
doi: 10.1007/BF00549096
Marenich, A. V., Jerome, S. V., Cramer, C. J. & Truhlar, D. G. Charge Model 5: an extension of Hirshfeld population analysis for the accurate description of molecular interactions in gaseous and condensed phases. J. Chem. Theory Comput. 8, 527–541 (2012).
pubmed: 26596602 doi: 10.1021/ct200866d
Singh, U. C. & Kollman, P. A. An approach to computing electrostatic charges for molecules. J. Comput. Chem. 5, 129–145 (1984).
doi: 10.1002/jcc.540050204
Glendening, E. D., Landis, C. R. & Weinhold, F. Natural bond orbital methods. WIREs Comput. Mol. Sci. 2, 1–42 (2012).
doi: 10.1002/wcms.51
Pérez de la Luz, A., Aguilar-Pineda, J. A., Méndez-Bermúdez, J. G. & Alejandre, J. Force field parametrization from the hirshfeld molecular electronic density. J. Chem. Theory Comput. 14, 5949–5958 (2018).
pubmed: 30278120 doi: 10.1021/acs.jctc.8b00554
Honda, S., Yamasaki, K., Sawada, Y. & Morii, H. 10 residue folded peptide designed by segment statistics. Structure 12, 1507–1518 (2004).
pubmed: 15296744 doi: 10.1016/j.str.2004.05.022
Neidigh, J. W., Fesinmeyer, R. M. & Andersen, N. H. Designing a 20-residue protein. Nat. Struct. Mol. Biol. 9, 425–430 (2002).
doi: 10.1038/nsb798
Ševčík, J. et al. Structure of glucoamylase from Saccharomycopsis fibuligera at 1.7 Å resolution. Acta Cryst. D. 54, 854–866 (1998).
doi: 10.1107/S0907444998002005
Bleiziffer, P., Schaller, K. & Riniker, S. Machine learning of partial charges derived from high-quality quantum-mechanical calculations. J. Chem. Inf. Model. 58, 579–590 (2018).
pubmed: 29461814 doi: 10.1021/acs.jcim.7b00663
Wang, X. & Gao, J. Atomic partial charge predictions for furanoses by random forest regression with atom type symmetry function. RSC Adv. 10, 666–673 (2020).
pubmed: 35494472 pmcid: 9048215 doi: 10.1039/C9RA09337K
Kato, K. et al. High-precision atomic charge prediction for protein systems using fragment molecular orbital calculation and machine learning. J. Chem. Inf. Model. 60, 3361–3368 (2020).
pubmed: 32496771 doi: 10.1021/acs.jcim.0c00273
Wang, J. et al. Fast and accurate prediction of partial charges using atom-path-descriptor-based machine learning. Bioinformatics 36, 4721–4728 (2020).
pubmed: 32525553 doi: 10.1093/bioinformatics/btaa566
Martin, R. & Heider, D. ContraDRG: automatic partial charge prediction by machine learning. Front. Genet. 10, 990 (2019).
pubmed: 31737032 pmcid: 6831742 doi: 10.3389/fgene.2019.00990
Cioslowski, J. & Surján, P. R. An observable-based interpretation of electronic wavefunctions: application to “hypervalent” molecules. J. Mol. Struc. THEOCHEM 255, 9–33 (1992).
doi: 10.1016/0166-1280(92)85003-4
Francl, M. M., Carey, C., Chirlian, L. E. & Gange, D. M. Charges fit to electrostatic potentials. II. Can atomic charges be unambiguously fit to electrostatic potentials? J. Comput. Chem. 17, 367–383 (1996).
doi: 10.1002/(SICI)1096-987X(199602)17:3<367::AID-JCC11>3.0.CO;2-H
Veit, M., Wilkins, D. M., Yang, Y., DiStasio, R. A. & Ceriotti, M. Predicting molecular dipole moments by combining atomic partial charges and atomic dipoles. J. Chem. Phys. 153, 024113 (2020).
pubmed: 32668949 doi: 10.1063/5.0009106
Yao, K., Herr, J. E., Toth, D. W., Mckintyre, R. & Parkhill, J. The TensorMol-0.1 model chemistry: a neural network augmented with long-range physics. Chem. Sci. 9, 2261–2269 (2018).
pubmed: 29719699 pmcid: 5897848 doi: 10.1039/C7SC04934J
Loeffler, J. R. et al. Conformational shifts of stacked heteroaromatics: vacuum vs. water studied by machine learning. Front. Chem. https://doi.org/10.3389/fchem.2021.641610 (2021).
doi: 10.3389/fchem.2021.641610 pubmed: 34778215 pmcid: 8589469
Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).
pubmed: 28507695 pmcid: 5414547 doi: 10.1039/C6SC05720A
McGaughey, G. B., Gagné, M. & Rappé, A. K. π-Stacking interactions: alive and well in proteins. J. Biol. Chem. 273, 15458–15463 (1998).
pubmed: 9624131 doi: 10.1074/jbc.273.25.15458
Metcalf, D. P. et al. Approaches for machine learning intermolecular interaction energies and application to energy components from symmetry adapted perturbation theory. J. Chem. Phys. 152, 074103 (2020).
pubmed: 32087645 doi: 10.1063/1.5142636
Szalewicz, K. Symmetry-adapted perturbation theory of intermolecular forces. WIREs Comput. Mol. Sci. 2, 254–272 (2012).
doi: 10.1002/wcms.86
Glick, Z. L. et al. AP-Net: an atomic-pairwise neural network for smooth and transferable interaction potentials. J. Chem. Phys. 153, 044112 (2020).
pubmed: 32752707 doi: 10.1063/5.0011521
Geerlings, P., De Proft, F. & Langenaeker, W. Conceptual density functional theory. Chem. Rev. 103, 1793–1874 (2003).
pubmed: 12744694 doi: 10.1021/cr990029p
Ko, T. W., Finkler, J. A., Goedecker, S. & Behler, J. General-purpose machine learning potentials capturing nonlocal charge transfer. Acc. Chem. Res. 54, 808–817 (2021).
pubmed: 33513012 doi: 10.1021/acs.accounts.0c00689
Grisafi, A. et al. Transferable machine-learning model of the electron density. ACS Cent. Sci. 5, 57–64 (2019).
pubmed: 30693325 doi: 10.1021/acscentsci.8b00551
Glielmo, A., Sollich, P. & De Vita, A. Accurate interatomic force fields via machine learning with covariant kernels. Phys. Rev. B 95, 214302 (2017).
doi: 10.1103/PhysRevB.95.214302
Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
pubmed: 20481899 doi: 10.1103/PhysRevLett.104.136403
Nguyen, T. T. et al. Comparison of permutationally invariant polynomials, neural networks, and Gaussian approximation potentials in representing water interactions through many-body expansions. J. Chem. Phys. 148, 241725 (2018).
pubmed: 29960316 doi: 10.1063/1.5024577
Fabrizio, A., Grisafi, A., Meyer, B., Ceriotti, M. & Corminboeuf, C. Electron density learning of non-covalent systems. Chem. Sci. 10, 9424–9432 (2019).
pubmed: 32055318 pmcid: 6991182 doi: 10.1039/C9SC02696G
Deringer, V. L. et al. Gaussian process regression for materials and molecules. Chem. Rev. 121, 10073–10141 (2021).
pubmed: 34398616 pmcid: 8391963 doi: 10.1021/acs.chemrev.1c00022
Cuevas-Zuviría, B. & Pacios, L. F. Analytical model of electron density and its machine learning inference. J. Chem. Inf. Model. 60, 3831–3842 (2020).
pubmed: 32786704 doi: 10.1021/acs.jcim.0c00197
Cuevas-Zuviría, B. & Pacios, F. Machine learning of analytical electron density in large molecules through message-passing. J. Chem. Inf. Model. 61, 2658–2666.
Lewis, A. M., Grisafi, A., Ceriotti, M. & Rossi, M. Learning electron densities in the condensed phase. J. Chem. Theory Comput. 17, 7203–7214 (2021).
pubmed: 34669406 pmcid: 8582255 doi: 10.1021/acs.jctc.1c00576
Zou, S.-J. et al. Recent advances in organic light-emitting diodes: toward smart lighting and displays. Mater. Chem. Front. 4, 788–820 (2020).
doi: 10.1039/C9QM00716D
Nayak, P. K., Mahesh, S., Snaith, H. J. & Cahen, D. Photovoltaic solar cell technologies: analysing the state of the art. Nat. Rev. Mater. 4, 269–285 (2019).
doi: 10.1038/s41578-019-0097-0
Hirohata, A. et al. Review on spintronics: principles and device applications. J. Magn. Magn. Mater. 509, 166711 (2020).
doi: 10.1016/j.jmmm.2020.166711
Tretiak, S., Chernyak, V. & Mukamel, S. Localized electronic excitations in phenylacetylene dendrimers. J. Phys. Chem. B 102, 3310–3315 (1998).
doi: 10.1021/jp980745f
Zhao, L., Pan, S., Holzmann, N., Schwerdtfeger, P. & Frenking, G. Chemical bonding and bonding models of main-group compounds. Chem. Rev. 119, 8781–8845 (2019).
pubmed: 31251603 doi: 10.1021/acs.chemrev.8b00722
Mayer, I. Bond order and valence indices: a personal account. J. Comput. Chem. 28, 204–221 (2007).
pubmed: 17066501 doi: 10.1002/jcc.20494
Wiberg, K. B. Application of the Pople–Santry–Segal CNDO method to the cyclopropylcarbinyl and cyclobutyl cation and to bicyclobutane. Tetrahedron 24, 1083–1096 (1968).
doi: 10.1016/0040-4020(68)88057-3
Alonso, M. & Herradón, B. Neural networks as a tool to classify compounds according to aromaticity criteria. Chem. Eur. J. 13, 3913–3923 (2007).
pubmed: 17323387 doi: 10.1002/chem.200601101
Alonso, M., Miranda, C., Martín, N. & Herradón, B. Chemical applications of neural networks: aromaticity of pyrimidine derivatives. Phys. Chem. Chem. Phys. 13, 20564–20574 (2011).
pubmed: 21879068 doi: 10.1039/c1cp22001b
Ferreira, A. R. Chemical bonding in metallic glasses from machine learning and crystal orbital hamilton population. Phys. Rev. Mater. 4, 113603 (2020).
doi: 10.1103/PhysRevMaterials.4.113603
Matlock, M. K., Dang, N. L. & Swamidass, S. J. Learning a local-variable model of aromatic and conjugated systems. ACS Cent. Sci. 4, 52–62 (2018).
pubmed: 29392176 pmcid: 5785769 doi: 10.1021/acscentsci.7b00405
Li, H., Collins, C., Tanha, M., Gordon, G. J. & Yaron, D. J. A density functional tight binding layer for deep learning of chemical Hamiltonians. J. Chem. Theory Comput. 14, 5764–5776 (2018).
Schütt, K. T., Gastegger, M., Tkatchenko, A., Müller, K.-R. & Maurer, R. J. Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions. Nat. Commun. 10, 5024 (2019).
pubmed: 31729373 pmcid: 6858523 doi: 10.1038/s41467-019-12875-2
Wang, Z. et al. Machine learning method for tight-binding Hamiltonian parameterization from ab-initio band structure. npj Comput. Mater. 7, 11 (2021).
doi: 10.1038/s41524-020-00490-5
Hoffmann, R. An extended Hückel theory. I. Hydrocarbons. J. Chem. Phys. 39, 1397–1412 (1963).
doi: 10.1063/1.1734456
Grabill, L. P. & Berger, R. F. Calibrating the extended Hückel method to quantitatively screen the electronic properties of materials. Sci. Rep. 8, 10530 (2018).
pubmed: 30002480 pmcid: 6043563 doi: 10.1038/s41598-018-28864-2
Zhou, G., Lubbers, N., Barros, K., Tretiak, S. & Nebgen, B. Deep learning of dynamically responsive chemical Hamiltonians with semiempirical quantum mechanics. Proc. Natl Acad. Sci. USA 119, e2120333119 (2022).
pubmed: 35776544 pmcid: 9271210 doi: 10.1073/pnas.2120333119
Stewart, J. J. P. Optimization of parameters for semiempirical methods I. Method. J. Comput. Chem. 10, 209–220 (1989).
doi: 10.1002/jcc.540100208
Elstner, M. et al. Self-consistent-charge density-functional tight-binding method for simulations of complex materials properties. Phys. Rev. B 58, 7260–7268 (1998).
doi: 10.1103/PhysRevB.58.7260
Gaus, M., Cui, Q. & Elstner, M. Density functional tight binding: application to organic and biological molecules. WIREs Comput. Mol. Sci. 4, 49–61 (2014).
doi: 10.1002/wcms.1156
Panosetti, C., Engelmann, A., Nemec, L., Reuter, K. & Margraf, J. T. Learning to use the force: fitting repulsive potentials in density-functional tight-binding with Gaussian process regression. J. Chem. Theory Comput. 16, 2181–2191 (2020).
pubmed: 32155065 doi: 10.1021/acs.jctc.9b00975
Kranz, J. J., Kubillus, M., Ramakrishnan, R., von Lilienfeld, O. A. & Elstner, M. Generalized density-functional tight-binding repulsive potentials from unsupervised machine learning. J. Chem. Theory Comput. 14, 2341–2352 (2018).
pubmed: 29579387 doi: 10.1021/acs.jctc.7b00933
Hastie, T., Tibshirani, R. & Friedman, J. Elements Of Statistical Learning: Data Mining, Inference, And Prediction 2nd edn (Springer, 2009).
Snyder, J. C., Rupp, M., Hansen, K., Müller, K.-R. & Burke, K. Finding density functionals with machine learning. Phys. Rev. Lett. 108, 253002 (2012).
pubmed: 23004593 doi: 10.1103/PhysRevLett.108.253002
Li, L. et al. Understanding machine-learned density functionals. Int. J. Quantum Chem. 116, 819–833 (2016).
doi: 10.1002/qua.25040
Brockherde, F. et al. Bypassing the Kohn–Sham equations with machine learning. Nat. Commun. 8, 872 (2017).
pubmed: 29021555 pmcid: 5636838 doi: 10.1038/s41467-017-00839-3
Hollingsworth, J., Baker, T. E. & Burke, K. Can exact conditions improve machine-learned density functionals? J. Chem. Phys. 148, 241743 (2018).
pubmed: 29960336 doi: 10.1063/1.5025668
Ramakrishnan, R., Dral, P. O., Rupp, M. & von Lilienfeld, O. A. Big data meets quantum chemistry approximations: the Δ-machine learning approach. J. Chem. Theory Comput. 11, 2087–2096 (2015).
pubmed: 26574412 doi: 10.1021/acs.jctc.5b00099
McGibbon, R. T. et al. Improving the accuracy of Møller–Plesset perturbation theory with neural networks. J. Chem. Phys. 147, 161725 (2017).
pubmed: 29096510 doi: 10.1063/1.4986081
Wilkins, D. M. et al. Accurate molecular polarizabilities with coupled cluster theory and machine learning. Proc. Natl Acad. Sci. 116, 3401–3406 (2019).
pubmed: 30733292 pmcid: 6397574 doi: 10.1073/pnas.1816132116
Kulik, H. et al. Roadmap on machine learning in electronic structure. Electron. Struct. https://doi.org/10.1088/2516-1075/ac572f (2022).
doi: 10.1088/2516-1075/ac572f
Gastegger, M., McSloy, A., Luya, M., Schütt, K. T. & Maurer, R. J. A deep neural network for molecular wave functions in quasi-atomic minimal basis representation. J. Chem. Phys. 153, 044123 (2020).
pubmed: 32752663 doi: 10.1063/5.0012911
Zubatiuk, T. & Isayev, O. Development of multimodal machine learning potentials: toward a physics-aware artificial intelligence. Acc. Chem. Res. 54, 1575–1585 (2021).
pubmed: 33715355 doi: 10.1021/acs.accounts.0c00868
Chandrasekaran, A. et al. Solving the electronic structure problem with machine learning. npj Comput. Mater. 5, 22 (2019).
doi: 10.1038/s41524-019-0162-7
Smith, J. S. et al. Automated discovery of a robust interatomic potential for aluminum. Nat. Commun. 12, 1257 (2021).
pubmed: 33623036 pmcid: 7902823 doi: 10.1038/s41467-021-21376-0
Jia, W. et al. Pushing the limit of molecular dynamics with ab initio accuracy to 100 million atoms with machine learning. Preprint at arXiv https://doi.org/10.48550/arXiv.2005.00223 (2020).
Jung, J. et al. New parallel computing algorithm of molecular dynamics for extremely huge scale biological systems. J. Comput. Chem. 42, 231–241 (2021).
pubmed: 33200457 doi: 10.1002/jcc.26450
Jinnouchi, R., Miwa, K., Karsai, F., Kresse, G. & Asahi, R. On-the-fly active learning of interatomic potentials for large-scale atomistic simulations. J. Phys. Chem. Lett. 11, 6946–6955 (2020).
pubmed: 32787192 doi: 10.1021/acs.jpclett.0c01061
Zhang, L., Lin, D.-Y., Wang, H., Car, R. & E, W. Active learning of uniformly accurate interatomic potentials for materials simulation. Phys. Rev. Mater. 3, 023804 (2019).
doi: 10.1103/PhysRevMaterials.3.023804
Noé, F., Olsson, S., Köhler, J. & Wu, H. Boltzmann generators: sampling equilibrium states of many-body systems with deep learning. Science 365, eaaw1147 (2019).
pubmed: 31488660 doi: 10.1126/science.aaw1147
Ribeiro, J. M. L., Bravo, P., Wang, Y. & Tiwary, P. Reweighted autoencoded variational Bayes for enhanced sampling (RAVE). J. Chem. Phys. 149, 072301 (2018).
pubmed: 30134694 doi: 10.1063/1.5025487
Wang, Y., Ribeiro, J. M. L. & Tiwary, P. Past–future information bottleneck for sampling molecular reaction coordinate simultaneously with thermodynamics and kinetics. Nat. Commun. 10, 3573 (2019).
pubmed: 31395868 pmcid: 6687748 doi: 10.1038/s41467-019-11405-4
Gebauer, N. W. A., Gastegger, M. & Schütt, K. T. Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules. Preprint at arXiv https://doi.org/10.48550/arXiv.1906.00957 (2020).
doi: 10.48550/arXiv.1906.00957
Kuenneth, C. et al. Polymer informatics with multi-task learning. Patterns 2, 100238 (2021).
pubmed: 33982028 pmcid: 8085610 doi: 10.1016/j.patter.2021.100238
Krämer, M. et al. Charge and exciton transfer simulations using machine-learned hamiltonians. J. Chem. Theory Comput. 16, 4061–4070 (2020).
pubmed: 32491856 doi: 10.1021/acs.jctc.0c00246
Jeong, W. et al. Automation of active space selection for multireference methods via machine learning on chemical bond dissociation. J. Chem. Theory Comput. 16, 2389–2399 (2020).
pubmed: 32119542 doi: 10.1021/acs.jctc.9b01297
Qiao, Z., Welborn, M., Anandkumar, A., Manby, F. R. & Miller, T. F. OrbNet: deep learning for quantum chemistry using symmetry-adapted atomic-orbital features. J. Chem. Phys. 153, 124111 (2020).
pubmed: 33003742 doi: 10.1063/5.0021955
Artrith, N. & Urban, A. An implementation of artificial neural-network potentials for atomistic materials simulations: performance for TiO
doi: 10.1016/j.commatsci.2015.11.047
Dral, P. O. et al. MLatom 2: an integrative platform for atomistic machine learning. Top. Curr. Chem. 379, 27 (2021).
doi: 10.1007/s41061-021-00339-5
Khorshidi, A. & Peterson, A. A. Amp: a modular approach to machine learning in atomistic simulations. Comput. Phys. Commun. 207, 310–324 (2016).
doi: 10.1016/j.cpc.2016.05.010
Kolb, B., Lentz, L. C. & Kolpak, A. M. Discovering charge density functionals and structure-property relationships with PROPhet: a general framework for coupling machine learning and first-principles methods. Sci. Rep. 7, 1192 (2017).
pubmed: 28446748 pmcid: 5430634 doi: 10.1038/s41598-017-01251-z
Wang, H., Zhang, L., Han, J. & E, W. DeePMD-kit: a deep learning package for many-body potential energy representation and molecular dynamics. Comput. Phys. Commun. 228, 178–184 (2018).
doi: 10.1016/j.cpc.2018.03.016
Gao, X., Ramezanghorbani, F., Isayev, O., Smith, J. S. & Roitberg, A. E. TorchANI: a free and open source pytorch-based deep learning implementation of the ANI neural network potentials. J. Chem. Inf. Model. 60, 3408–3415 (2020).
pubmed: 32568524 doi: 10.1021/acs.jcim.0c00451
Himanen, L. et al. DScribe: library of descriptors for machine learning in materials science. Comput. Phys. Commun. 247, 106949 (2020).
doi: 10.1016/j.cpc.2019.106949
Haghighatlari, M. et al. ChemML: a machine learning and informatics program package for the analysis, mining, and modeling of chemical and materials data. WIREs Comput. Mol. Sci. 10, e1458 (2020).
doi: 10.1002/wcms.1458
Lee, K., Yoo, D., Jeong, W. & Han, S. SIMPLE-NN: an efficient package for training and executing neural-network interatomic potentials. Comput. Phys. Commun. 242, 95–103 (2019).
doi: 10.1016/j.cpc.2019.04.014
Shao, Y., Hellström, M., Mitev, P. D., Knijff, L. & Zhang, C. PiNN: a Python library for building atomic neural networks of molecules and materials. J. Chem. Inf. Model. 60, 1184–1193 (2020).
pubmed: 31935100 doi: 10.1021/acs.jcim.9b00994
te Velde, G. et al. Chemistry with ADF. J. Comput. Chem. 22, 931–967 (2001).
doi: 10.1002/jcc.1056
Larsen, A. H. et al. The atomic simulation environment — a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
doi: 10.1088/1361-648X/aa680e
Chen, M. S., Morawietz, T., Mori, H., Markland, T. E. & Artrith, N. AENET–LAMMPS and AENET–TINKER: interfaces for accurate and efficient molecular dynamics simulations with machine learning potentials. J. Chem. Phys. 155, 074801 (2021).
pubmed: 34418919 doi: 10.1063/5.0063880
Neese, F. Software update: the ORCA program system — version 5.0. WIREs Comput. Mol. Sci. https://doi.org/10.1002/wcms.1606 (2022).
doi: 10.1002/wcms.1606
Cova, T. F. G. G. & Pais, A. A. C. C. Deep learning for deep chemistry: optimizing the prediction of chemical patterns. Front. Chem. 7, 809 (2019).
pubmed: 32039134 pmcid: 6988795 doi: 10.3389/fchem.2019.00809
Bzdok, D., Krzywinski, M. & Altman, N. Machine learning: supervised methods. Nat. Methods 15, 5–6 (2018).
pubmed: 30100821 pmcid: 6082635 doi: 10.1038/nmeth.4551
Shaidu, Y. et al. A systematic approach to generating accurate neural network potentials: the case of carbon. npj Comput. Mater. 7, 52 (2021).
doi: 10.1038/s41524-021-00508-6
Botu, V., Batra, R., Chapman, J. & Ramprasad, R. Machine learning force fields: construction, validation, and outlook. J. Phys. Chem. C 121, 511–522 (2017).
doi: 10.1021/acs.jpcc.6b10908
Senftle, T. P. et al. The ReaxFF reactive force-field: development, applications and future directions. npj Comput. Mater. 2, 15011 (2016).
doi: 10.1038/npjcompumats.2015.11
Leach, A. R. Molecular Modelling: Principles and Applications 2nd edn, Ch. 7 (Pearson, 2001).
Unke, O. T. et al. Machine learning force fields. Chem. Rev. 121, 10142–10186 (2021).
pubmed: 33705118 pmcid: 8391964 doi: 10.1021/acs.chemrev.0c01111
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
pubmed: 26017442 doi: 10.1038/nature14539
Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366 (1989).
doi: 10.1016/0893-6080(89)90020-8
Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Netw. 4, 251–257 (1991).
doi: 10.1016/0893-6080(91)90009-T
Behler, J. First principles neural network potentials for reactive simulations of large molecular and condensed systems. Angew. Chem. Int. Edn 56, 12828–12840 (2017).
doi: 10.1002/anie.201703114
Benoit, M. et al. Measuring transferability issues in machine-learning force fields: the example of gold–iron interactions with linearized potentials. Mach. Learn. Sci. Technol. 2, 025003 (2021).
doi: 10.1088/2632-2153/abc9fd
Anderson, B., Hy, T.-S. & Kondor, R. Cormorant: covariant molecular neural networks. Preprint at arXiv https://arxiv.org/abs/1906.04015 (2019).
Jackson, R., Zhang, W. & Pearson, J. TSNet: predicting transition state structures with tensor field networks and transfer learning. Chem. Sci. 12, 10022–10040 (2021).
pubmed: 34377396 pmcid: 8317659 doi: 10.1039/D1SC01206A
Kocer, E., Mason, J. K. & Erturk, H. A novel approach to describe chemical environments in high-dimensional neural network potentials. J. Chem. Phys. 150, 154102 (2019).
pubmed: 31005106 doi: 10.1063/1.5086167

Authors

Nikita Fedik (N)

Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA.
Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA.
Department of Chemistry and Biochemistry, Utah State University, Logan, UT, USA.

Roman Zubatyuk (R)

Department of Chemistry, Carnegie Mellon University, Pittsburgh, PA, USA.

Maksim Kulichenko (M)

Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA.
Department of Chemistry and Biochemistry, Utah State University, Logan, UT, USA.

Nicholas Lubbers (N)

Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM, USA.

Justin S Smith (JS)

Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA.
NVIDIA, Santa Clara, CA, USA.

Benjamin Nebgen (B)

Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA.

Richard Messerly (R)

Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA.

Ying Wai Li (YW)

Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM, USA.

Alexander I Boldyrev (AI)

Department of Chemistry and Biochemistry, Utah State University, Logan, UT, USA.

Kipton Barros (K)

Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA.
Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA.

Olexandr Isayev (O)

Department of Chemistry, Carnegie Mellon University, Pittsburgh, PA, USA.

Sergei Tretiak (S)

Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA. serg@lanl.gov.
Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA.
Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, NM, USA.

MeSH classifications