Enhancing molecular design efficiency: Uniting language models and generative networks with genetic algorithms.

generative adversarial network; genetic algorithm; masked language model; molecule design

Journal

Patterns (New York, N.Y.)
ISSN: 2666-3899
Abbreviated title: Patterns (N Y)
Country: United States
NLM ID: 101767765

Publication information

Publication date:
12 Apr 2024
History:
received: 26 09 2023
revised: 14 11 2023
accepted: 08 02 2024
medline: 22 4 2024
pubmed: 22 4 2024
entrez: 22 4 2024
Status: epublish

Abstract

This study examines the effectiveness of generative models in drug discovery, materials science, and polymer science, with the aim of overcoming the constraints of traditional inverse design methods that rely on heuristic rules. Generative models produce synthetic data resembling real data, which allows deep learning models to be trained without extensive labeled datasets. They are valuable for building virtual libraries of molecules in materials science and for supporting drug discovery by generating molecules with specific properties. Generative adversarial networks (GANs) have been explored for these purposes, but mode collapse limits their effectiveness by reducing the variety of novel structures they produce. To address this, we introduce a masked language model (LM) inspired by natural language processing. Because LMs alone have their own limitations, we propose a hybrid architecture that combines LMs and GANs to generate new molecules efficiently. The hybrid outperforms standalone masked LMs, particularly for smaller population sizes. This hybrid LM-GAN architecture improves efficiency both in optimizing properties and in generating novel samples.
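The general idea of using a mask-and-fill step as the mutation operator inside a genetic algorithm can be illustrated with a toy sketch. This is not the authors' implementation: the token alphabet, the `fitness` function, and the uniform random `fill_masks` function (a stand-in for a trained masked LM) are all hypothetical, and the GAN component is omitted entirely.

```python
import random

# Toy token vocabulary (assumption; not a real SMILES grammar).
ALPHABET = list("CNOF")
MASK = "_"


def mask_tokens(tokens, rate, rng):
    """Mask each token with probability `rate`; always mask at least one."""
    masked = [MASK if rng.random() < rate else t for t in tokens]
    if MASK not in masked:
        masked[rng.randrange(len(masked))] = MASK
    return masked


def fill_masks(masked, rng):
    """Stand-in for a trained masked LM: fill each mask uniformly at random."""
    return [rng.choice(ALPHABET) if t == MASK else t for t in masked]


def fitness(tokens):
    """Hypothetical objective: number of 'N' tokens in the sequence."""
    return tokens.count("N")


def evolve(population, generations, rate=0.2, seed=0):
    """Run a simple elitist genetic loop with mask-and-fill mutation."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        # Mutation: mask part of each parent and let the "model" fill it in.
        children = [fill_masks(mask_tokens(p, rate, rng), rng) for p in population]
        # Selection: keep the best `size` individuals among parents and children.
        population = sorted(population + children, key=fitness, reverse=True)[:size]
    return population


if __name__ == "__main__":
    start = [list("CCCCCCCC") for _ in range(4)]
    for molecule in evolve(start, generations=30):
        print("".join(molecule), fitness(molecule))
```

Because parents compete with their children at every selection step, the best fitness in the population never decreases. In a real setting the random fill would be replaced by samples from a trained model and the toy objective by a computed molecular property.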

Identifiers

pubmed: 38645768
doi: 10.1016/j.patter.2024.100947
pii: S2666-3899(24)00046-1
pmc: PMC11026973

Publication types

Journal Article

Languages

eng

Pagination

100947

Copyright information

© 2024 Oak Ridge National Laboratory.

Conflict of interest statement

The authors declare no competing interests.

Authors

Debsindhu Bhowmik (D)

Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.

Pei Zhang (P)

Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.

Zachary Fox (Z)

Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.

Stephan Irle (S)

Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.

John Gounley (J)

Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.

MeSH classifications