Hybrid localized graph kernel for machine learning energy-related properties of molecules and solids.

Keywords: QM7 and BA10 datasets; energy-related properties; graph kernel; machine learning; regression

Journal

Journal of computational chemistry
ISSN: 1096-987X
Abbreviated title: J Comput Chem
Country: United States
NLM ID: 9878362

Publication information

Publication date:
30 Jul 2021
History:
received: 15 01 2021
revised: 07 04 2021
accepted: 21 04 2021
pubmed: 20 5 2021
medline: 20 5 2021
entrez: 19 5 2021
Status: ppublish

Abstract

Nowadays, the coupling of electronic structure and machine learning techniques serves as a powerful tool to predict chemical and physical properties of a broad range of systems. With the aim of improving the accuracy of predictions, a large number of representations of molecules and solids for machine learning applications have been developed. In this work we propose a novel descriptor based on the notion of a molecular graph. While graphs are widely employed in classification problems in cheminformatics or bioinformatics, they are not often used in regression problems, especially for energy-related properties. Our method is based on a local decomposition of atomic environments and on the hybridization of two kernel functions: a graph kernel contribution that describes the chemical pattern and a Coulomb label contribution that encodes finer details of the local geometry. The accuracy of this new kernel method in energy predictions of molecular and condensed phase systems is demonstrated by considering the popular QM7 and BA10 datasets. These examples show that the hybrid localized graph kernel outperforms traditional approaches such as the smooth overlap of atomic positions and the Coulomb matrices.
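
The idea of a hybrid localized kernel can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two contributions are combined, so the product form, the neighbor-list "pattern" standing in for the graph kernel, the scalar Coulomb label, the distance cutoff, and all function names and numerical values below are assumptions made for illustration only.

```python
import numpy as np

def local_environments(numbers, positions, cutoff=2.0):
    """Return one (pattern, coulomb_label) pair per atom.

    pattern: central element plus sorted neighbor elements (a crude,
             assumed stand-in for a graph description of the environment).
    coulomb_label: sum of Z_i * Z_j / r_ij over neighbors within `cutoff`
             (an assumed scalar label for the local geometry).
    """
    numbers = np.asarray(numbers)
    positions = np.asarray(positions, dtype=float)
    envs = []
    for i in range(len(numbers)):
        dists = np.linalg.norm(positions - positions[i], axis=1)
        neigh = [j for j in range(len(numbers)) if j != i and dists[j] < cutoff]
        pattern = (int(numbers[i]), tuple(sorted(int(numbers[j]) for j in neigh)))
        label = sum(numbers[i] * numbers[j] / dists[j] for j in neigh)
        envs.append((pattern, float(label)))
    return envs

def hybrid_kernel(envs_a, envs_b, sigma=1.0):
    """Sum over all pairs of atomic environments of
    delta(pattern_a, pattern_b) * Gaussian(label_a - label_b)."""
    total = 0.0
    for pat_a, lab_a in envs_a:
        for pat_b, lab_b in envs_b:
            if pat_a == pat_b:  # chemical-pattern contribution
                total += np.exp(-((lab_a - lab_b) ** 2) / (2.0 * sigma ** 2))
    return total

def gram_matrix(structs_a, structs_b, sigma=1.0):
    """Kernel matrix between two lists of structures (lists of environments)."""
    return np.array([[hybrid_kernel(a, b, sigma) for b in structs_b]
                     for a in structs_a])

def fit_krr(K, y, lam=1e-8):
    """Kernel ridge regression weights: solve (K + lam * I) alpha = y."""
    return np.linalg.solve(K + lam * np.eye(len(y)), np.asarray(y, dtype=float))

# Toy usage: three diatomic structures with arbitrary, made-up target values.
structures = [
    local_environments([1, 1], [[0.0, 0.0, 0.0], [0.74, 0.0, 0.0]]),
    local_environments([1, 1], [[0.0, 0.0, 0.0], [0.90, 0.0, 0.0]]),
    local_environments([1, 9], [[0.0, 0.0, 0.0], [0.92, 0.0, 0.0]]),
]
targets = [-1.0, -0.8, -2.5]  # placeholders, not reference energies
K = gram_matrix(structures, structures)
alpha = fit_krr(K, targets)
print(K @ alpha)  # predictions on the training structures
```

Because each pairwise term is a product of two positive semidefinite kernels and the global kernel is a sum over environment pairs, the resulting matrix can be used directly in kernel ridge regression. A real application would replace the exact pattern match with a proper graph kernel and tune `sigma`, `cutoff`, and `lam` by cross-validation.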

Identifiers

pubmed: 34009668
doi: 10.1002/jcc.26550

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

1390-1401

Grants

Agency: Agentúra na Podporu Výskumu a Vývoja
Agency: FEDER-FSE
Agency: Slovak Research and Development Agency
ID: VEGA-1/0777/19
Agency: Slovak Research and Development Agency
ID: APVV-20-0127
Agency: European Union

Copyright information

© 2021 Wiley Periodicals LLC.

Authors

Bastien Casier (B)

Université de Lorraine and CNRS, LPCT, UMR 7019, F-54000 Nancy, France.

Mauricio Chagas da Silva (M)

Université de Lorraine and CNRS, LPCT, UMR 7019, F-54000 Nancy, France.

Michael Badawi (M)

Université de Lorraine and CNRS, LPCT, UMR 7019, F-54000 Nancy, France.

Tomáš Bučko (T)

Department of Physical and Theoretical Chemistry, Faculty of Natural Sciences, Comenius University in Bratislava, Bratislava, Slovakia.
Institute of Inorganic Chemistry, Slovak Academy of Sciences, Bratislava, Slovakia.

Sébastien Lebègue (S)

Université de Lorraine and CNRS, LPCT, UMR 7019, F-54000 Nancy, France.

Dario Rocca (D)

Université de Lorraine and CNRS, LPCT, UMR 7019, F-54000 Nancy, France.

MeSH classifications