Variational autoencoder-based chemical latent space for large molecular structures with 3D complexity.


Journal

Communications Chemistry
ISSN: 2399-3669
Abbreviated title: Commun Chem
Country: England
NLM ID: 101725670

Publication information

Publication date:
16 Nov 2023
History:
received: 16 06 2023
accepted: 06 11 2023
medline: 17 11 2023
pubmed: 17 11 2023
entrez: 17 11 2023
Status: epublish

Abstract

The structural diversity of chemical libraries, which are systematic collections of compounds with the potential to bind to biomolecules, can be represented by a chemical latent space. A chemical latent space is a projection of compound structures into a mathematical space based on several molecular features; it can express the structural diversity within a compound library, which makes it possible to explore a broader chemical space and generate novel compound structures as drug candidates. In this study, we developed a deep-learning method called NP-VAE (Natural Product-oriented Variational Autoencoder), based on the variational autoencoder, for handling hard-to-analyze datasets from DrugBank and large molecular structures such as natural compounds with chirality, an essential factor in the 3D complexity of compounds. NP-VAE succeeded in constructing a chemical latent space from large compounds that existing methods could not handle, achieved higher reconstruction accuracy, and showed stable performance as a generative model across various indices. Furthermore, by exploring the acquired latent space, we comprehensively analyzed a compound library containing natural compounds and generated novel compound structures with optimized functions.
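
For readers unfamiliar with the term, the sketch below illustrates the generic ingredients that any variational autoencoder latent space relies on: sampling a latent point via the reparameterization trick, the KL regularizer that shapes the space, and linear interpolation between two latent points as a simple way of exploring it. It is not the NP-VAE implementation; the function names (`reparameterize`, `kl_to_standard_normal`, `interpolate`) and the plain-list representation of latent vectors are assumptions made for illustration only, and the encoder and decoder networks are omitted.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) for each latent dimension.

    mu and log_var stand in for the outputs of a hypothetical encoder.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def interpolate(z_a, z_b, t):
    """Point at fraction t on the straight line from z_a to z_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    mu, log_var = [0.5, -1.0], [0.0, -2.0]
    z = reparameterize(mu, log_var, rng)
    print("sampled latent point:", z)
    print("KL regularizer:", kl_to_standard_normal(mu, log_var))
    print("midpoint to origin:", interpolate(z, [0.0, 0.0], 0.5))
```

In a full model, a decoder would map each interpolated latent point back to a compound structure; that step depends on the molecular representation and is left out here.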

Identifiers

pubmed: 37973971
doi: 10.1038/s42004-023-01054-6
pii: 10.1038/s42004-023-01054-6
pmc: PMC10654724

Publication types

Journal Article

Languages

eng

Pagination

249

Grants

Agency: Ministry of Education, Culture, Sports, Science and Technology (MEXT)
ID: 22H04901
Agency: Ministry of Education, Culture, Sports, Science and Technology (MEXT)
ID: 17H06410
Agency: Ministry of Education, Culture, Sports, Science and Technology (MEXT)
ID: 23H04885
Agency: Ministry of Education, Culture, Sports, Science and Technology (MEXT)
ID: 23H04880
Agency: Ministry of Education, Culture, Sports, Science and Technology (MEXT)
ID: 23H04881
Agency: Ministry of Education, Culture, Sports, Science and Technology (MEXT)
ID: 23H04887

Copyright information

© 2023. The Author(s).

Authors

Toshiki Ochiai (T)

Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa, 223-8522, Japan.

Tensei Inukai (T)

Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa, 223-8522, Japan.

Manato Akiyama (M)

Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa, 223-8522, Japan.

Kairi Furui (K)

Department of Computer Science, School of Computing, Tokyo Institute of Technology, Yokohama, Kanagawa, 226-8501, Japan.

Masahito Ohue (M)

Department of Computer Science, School of Computing, Tokyo Institute of Technology, Yokohama, Kanagawa, 226-8501, Japan.

Nobuaki Matsumori (N)

Department of Chemistry, Graduate School of Science, Kyushu University, Fukuoka, Fukuoka, 819-0395, Japan.

Shinsuke Inuki (S)

Division of Medicinal Frontier Sciences, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Kyoto, 606-8501, Japan.

Motonari Uesugi (M)

Institute for Chemical Research and WPI-iCeMS, Kyoto University, Uji, Kyoto, 611-0011, Japan.

Toshiaki Sunazuka (T)

Omura Satoshi Memorial Institute and Graduate School of Infection Control Sciences, Kitasato University, Minato-ku, Tokyo, 108-8641, Japan.

Kazuya Kikuchi (K)

Department of Applied Chemistry, Graduate School of Engineering, Osaka University, Suita, Osaka, 565-0871, Japan.
Immunology Frontier Research Centre, Osaka University, Suita, Osaka, 565-0871, Japan.

Hideaki Kakeya (H)

Division of Medicinal Frontier Sciences, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Kyoto, 606-8501, Japan.

Yasubumi Sakakibara (Y)

Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa, 223-8522, Japan. yasu@bio.keio.ac.jp.
Department of Data Science, Kitasato University School of Frontier Engineering, Sagamihara, Kanagawa, 252-0373, Japan. yasu@bio.keio.ac.jp.

MeSH classifications