MLatom 2: An Integrative Platform for Atomistic Machine Learning.
Gaussian process regression
Kernel ridge regression
Machine learning
Neural networks
Quantum chemistry
Journal
Topics in current chemistry (Cham)
ISSN: 2364-8961
Abbreviated title: Top Curr Chem (Cham)
Country: Switzerland
NLM ID: 101691301
Publication information
Publication date:
08 Jun 2021
History:
received: 22 Feb 2021
accepted: 07 May 2021
entrez: 08 Jun 2021
pubmed: 09 Jun 2021
medline: 06 Aug 2021
Status:
epublish
Abstract
Atomistic machine learning (AML) simulations are used in chemistry at an ever-increasing pace. A large number of AML models have been developed, but their implementations are scattered among different packages, each with its own conventions for input and output. Here we give an overview of our MLatom 2 software package, which provides an integrative platform for a wide variety of AML simulations by implementing some state-of-the-art models from scratch and interfacing to existing software for others. These include kernel-method-based model types such as KREG (native implementation), sGDML, and GAP-SOAP, as well as neural-network-based model types such as ANI, DeepPot-SE, and PhysNet. The theoretical foundations of these methods are also reviewed. The modular structure of MLatom allows for easy extension to more AML model types. MLatom 2 also has many other capabilities useful for AML simulations, such as support for custom descriptors, farthest-point and structure-based sampling, hyperparameter optimization, model evaluation, and automatic learning curve generation. It can also be used for multi-step tasks such as Δ-learning, self-correction approaches, and absorption spectrum simulation within the machine-learning nuclear-ensemble approach. Several of these MLatom 2 capabilities are showcased in application examples.
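As an illustration of one capability named in the abstract, farthest-point sampling can be sketched in a few lines of NumPy. This is a minimal sketch and not MLatom's own implementation: the function name `farthest_point_sampling`, the `start` parameter, and the use of Euclidean distance between descriptor vectors are all assumptions made for the example.

```python
import numpy as np


def farthest_point_sampling(X, n_samples, start=0):
    """Greedy farthest-point sampling over the rows of X.

    Hypothetical helper for illustration only. Each row of X is assumed
    to be a descriptor vector; distances are plain Euclidean distances.
    """
    X = np.asarray(X, dtype=float)
    if not 0 < n_samples <= len(X):
        raise ValueError("n_samples must be between 1 and len(X)")

    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(X - X[start], axis=1)

    for _ in range(n_samples - 1):
        # Pick the point that is farthest from the current selection.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        # Update nearest-selected distances with the new point.
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))

    return selected
```

Called on a descriptor matrix, the function returns the indices of the chosen training points in the order they were picked, so a prefix of the list is itself a valid smaller sample.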
Identifiers
pubmed: 34101036
doi: 10.1007/s41061-021-00339-5
pii: 10.1007/s41061-021-00339-5
pmc: PMC8187220
Chemical substances
Hydrocarbons, Cyclic
0
Publication types
Journal Article
Review
Languages
eng
Citation subsets
IM
Pagination
27
Grants
Agency: National Natural Science Foundation of China
ID: 22003051
Agency: H2020 European Research Council
ID: 832237