Machine Learning C-N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction.
Journal
ACS omega
ISSN: 2470-1343
Abbreviated title: ACS Omega
Country: United States
NLM ID: 101691658
Publication information
Publication date:
24 Jan 2023
History:
received: 13 Sep 2022
accepted: 25 Oct 2022
entrez: 30 Jan 2023
pubmed: 31 Jan 2023
medline: 31 Jan 2023
Status:
epublish
Abstract
Pd-catalyzed C-N couplings are commonplace in academia and industry. Despite their significance, finding suitable reaction conditions leading to a high yield, for instance, remains a challenging and time-consuming task which usually requires screening over many sets of conditions. To help select promising reaction conditions in the vast space of reagent combinations, machine learning is an emerging technique with a lot of promise. In this work, we assess whether the reaction yield of C-N couplings can be predicted from databases of chemical reactions. We test the generalizability of models both on challenging data splits and on a dedicated experimental test set. We find that, provided the chemical space represented by the training set is not left, the models perform well. However, the applicability domain is quickly left even for simple reactions of the same type, as, for instance, present in our plate test set. The results show that yield prediction for new reactions is possible from the algorithmic side but in practice is hindered by the available data. Most importantly, more data that cover the diversity in reagents are needed for a general-purpose prediction of reaction yields. Our findings also expose a challenge to this field in that it appears to be extremely deceiving to judge models based on literature data with test sets which are split off the same literature data, even when challenging splits are considered.
Identifiers
pubmed: 36713686
doi: 10.1021/acsomega.2c05546
pmc: PMC9878668
Publication types
Journal Article
Languages
eng
Pagination
3017-3025
Copyright information
© 2023 The Authors. Published by American Chemical Society.
Conflict of interest statement
The authors declare no competing financial interest.