Enabling late-stage drug diversification by high-throughput experimentation with geometric deep learning.
Journal
Nature chemistry
ISSN: 1755-4349
Abbreviated title: Nat Chem
Country: England
NLM ID: 101499734
Publication information
Publication date:
23 Nov 2023
History:
received: 21 Oct 2022
accepted: 03 Oct 2023
medline: 24 Nov 2023
pubmed: 24 Nov 2023
entrez: 23 Nov 2023
Status:
aheadofprint
Abstract
Late-stage functionalization is an economical approach to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, a late-stage functionalization platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in late-stage functionalization, the computational model predicted reaction yields for diverse reaction conditions with a mean absolute error margin of 4–5%, while the reactivity of novel reactions with known and unknown substrates was classified with a balanced accuracy of 92% and 67%, respectively. The regioselectivity of the major products was accurately captured with a classifier F-score of 67%. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and electronic information on model performance was quantified, and a comprehensive, simple, user-friendly reaction format was introduced that proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation for late-stage functionalization.
Identifiers
pubmed: 37996732
doi: 10.1038/s41557-023-01360-5
pii: 10.1038/s41557-023-01360-5
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (Swiss National Science Foundation)
ID: 205321_182176
Copyright information
© 2023. The Author(s).