Predicting potentially hazardous chemical reactions using an explainable neural network.
Journal
Chemical science
ISSN: 2041-6520
Abbreviated title: Chem Sci
Country: England
NLM ID: 101545951
Publication information
Publication date:
25 Aug 2021
History:
received: 22 Feb 2021
accepted: 12 Jul 2021
entrez: 15 Sep 2021
pubmed: 16 Sep 2021
medline: 16 Sep 2021
Status:
epublish
Abstract
Predicting potentially dangerous chemical reactions is a critical task for laboratory safety. However, traditional experimental investigation of reaction conditions for possible hazardous or explosive byproducts entails substantial time and cost; machine learning prediction could accelerate this process and guide detailed experimental investigations. Several machine learning models have been developed that predict the major products of chemical reactions with reasonable accuracy. However, these methods may not be sufficiently accurate for predicting hazardous products, a task that requires a particularly low false negative rate so that no dangerous reaction is missed. In this work, we propose an explainable artificial intelligence model that predicts the formation of hazardous reaction products as a binary classification. The reactant molecules are transformed into substructure-encoded fingerprints and then fed into a convolutional neural network, which makes the binary decision for the chemical reaction. The proposed model shows a false negative rate of 0.09, compared with 0.47-0.66 for existing main product prediction models. To explain which substructures of the given reactant molecules are important for the decision on target hazardous product formation, we apply an input attribution method, layer-wise relevance propagation, which computes the contribution of each individual input for each input sample. The computed attributions match some existing chemical intuitions and mechanisms, and also offer a way to analyze possible data-imbalance issues in the current predictions, which are based on relatively small positive datasets. We expect that the proposed hazardous product prediction model will be complementary to existing main product prediction models and experimental investigations.
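The two quantities the abstract relies on, the false negative rate and the per-input attributions from layer-wise relevance propagation (LRP), can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract describes a convolutional network on substructure fingerprints, whereas the sketch below assumes a tiny bias-free dense ReLU network with a linear output and uses the epsilon rule of LRP. The function names `false_negative_rate` and `lrp_epsilon_dense`, the weights, and the toy fingerprint are all hypothetical.

```python
import numpy as np


def false_negative_rate(y_true, y_pred):
    """FN / (FN + TP): fraction of truly hazardous reactions that were missed."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    fn = np.sum(y_true & ~y_pred)  # hazardous, predicted safe
    tp = np.sum(y_true & y_pred)   # hazardous, predicted hazardous
    return fn / (fn + tp)


def lrp_epsilon_dense(x, weights, eps=1e-9):
    """Epsilon-rule LRP for a bias-free dense ReLU net (assumed architecture).

    Returns the network output and one relevance score per input feature.
    """
    # Forward pass, storing the input to every layer.
    activations = [np.asarray(x, dtype=float)]
    for i, W in enumerate(weights):
        z = activations[-1] @ W
        if i < len(weights) - 1:   # ReLU on hidden layers, linear output
            z = np.maximum(z, 0.0)
        activations.append(z)

    # Backward pass: start from the output and redistribute relevance
    # to each layer's inputs in proportion to their contribution a_i * W_ij.
    relevance = activations[-1].copy()
    for W, a in zip(reversed(weights), reversed(activations[:-1])):
        z = a @ W
        z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabilise the division
        s = relevance / z
        relevance = a * (W @ s)
    return activations[-1], relevance


# Toy fingerprint with 4 substructure bits (hypothetical values).
x = np.array([1.0, 0.0, 1.0, 1.0])
W1 = np.array([[0.5, -1.0, 0.2],
               [0.3, 0.8, -0.5],
               [-0.2, 0.4, 0.9],
               [0.1, -0.3, 0.4]])
W2 = np.array([[1.0], [2.0], [-0.5]])
output, relevance = lrp_epsilon_dense(x, [W1, W2])
print("output:", output, "relevance per bit:", relevance)
```

In this bias-free setting the relevance scores sum (up to the epsilon term) to the network output, and an absent substructure bit receives zero relevance, which is what makes the per-input attributions readable as contributions to the decision.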
Identifiers
pubmed: 34522300
doi: 10.1039/d1sc01049b
pii: d1sc01049b
pmc: PMC8386654
Publication types
Journal Article
Languages
eng
Pagination
11028-11037
Copyright information
This journal is © The Royal Society of Chemistry.
Conflict of interest statement
There are no conflicts to declare.