An Inverse QSAR Method Based on Linear Regression and Integer Programming.

QSAR/QSPR; chemoinformatics; integer programming; linear regression; machine learning; materials informatics; molecular design

Journal

Frontiers in bioscience (Landmark edition)
ISSN: 2768-6698
Abbreviated title: Front Biosci (Landmark Ed)
Country: Singapore
NLM ID: 101612996

Publication information

Publication date:
10 06 2022
History:
received: 16 02 2022
revised: 28 03 2022
accepted: 07 04 2022
entrez: 24 6 2022
pubmed: 25 6 2022
medline: 28 6 2022
Status: ppublish

Abstract

Drug design is one of the important applications of biological science. Computer-aided drug design based on the inverse quantitative structure-activity relationship (inverse QSAR), which infers chemical compounds from given chemical activities and constraints, has been studied extensively. However, most existing methods do not guarantee exact or optimal solutions. Recently, a novel framework based on artificial neural networks (ANNs) and mixed integer linear programming (MILP) was proposed for designing chemical structures. This framework consists of two phases: first, an ANN is used to construct a prediction function; second, an MILP formulated on the trained ANN, together with a graph search algorithm, is used to infer the desired chemical structures. In this paper, we use linear regression instead of ANNs to construct the prediction function. To this end, we derive a novel MILP formulation that simulates the computation of a prediction function obtained by linear regression. For the first phase, we performed computational experiments on 18 chemical properties, and the proposed method achieved good prediction accuracy for a relatively large number of properties compared with the ANNs in our previous work. For the second phase, we performed computational experiments on five chemical properties, and the method could infer chemical structures with up to around 50 non-hydrogen atoms. The combination of linear regression and integer programming is a potentially useful approach to computational molecular design.
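
The second phase described in the abstract can be illustrated with a toy sketch. This is not the authors' formulation: the weights, the bias, the three descriptors (counts of C, O and N atoms) and all bounds are hypothetical, and exhaustive enumeration stands in for an MILP solver so that the example has no dependencies. It only shows the core idea that, once the prediction function is linear, requiring the predicted value to fall in a target range is a linear constraint over integer variables. Mapping a descriptor vector back to an actual chemical graph is not modelled here.

```python
from itertools import product

# Hypothetical phase-1 output: weights and bias of a linear prediction
# function over three integer descriptors (counts of C, O and N atoms).
WEIGHTS = (2.0, -1.5, 0.5)
BIAS = 1.0


def predict(x):
    """Linear prediction function eta(x) = w . x + b."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def infer_descriptors(y_lo, y_hi, max_count, max_atoms):
    """Return all integer descriptor vectors x with y_lo <= eta(x) <= y_hi.

    The conditions mirror what an MILP would encode: integer variables
    bounded by 0..max_count, a linear size constraint on the total atom
    count, and a linear range constraint on the predicted value.
    """
    solutions = []
    for x in product(range(max_count + 1), repeat=len(WEIGHTS)):
        if not 1 <= sum(x) <= max_atoms:
            continue  # violates the (assumed) size constraint
        if y_lo <= predict(x) <= y_hi:
            solutions.append(x)
    return solutions


if __name__ == "__main__":
    for x in infer_descriptors(y_lo=6.0, y_hi=6.5, max_count=4, max_atoms=5):
        print(x, predict(x))
```

Enumeration grows exponentially with the number of descriptors, which is one reason a real implementation would hand the same constraints to an MILP solver instead.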


Identifiers

pubmed: 35748264
pii: S2768-6701(22)00551-2
doi: 10.31083/j.fbl2706188

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

188

Copyright information

© 2022 The Author(s). Published by IMR Press.

Conflict of interest statement

The authors declare no conflict of interest. TA is serving as the guest editor of this journal. We declare that TA had no involvement in the peer review of this article and has no access to information regarding its peer review. Full responsibility for the editorial process for this article was delegated to AK and GP.

Authors

Jianshen Zhu (J)

Department of Applied Mathematics and Physics, Kyoto University, 606-8501 Kyoto, Japan.

Naveed Ahmed Azam (NA)

Department of Applied Mathematics and Physics, Kyoto University, 606-8501 Kyoto, Japan.

Kazuya Haraguchi (K)

Department of Applied Mathematics and Physics, Kyoto University, 606-8501 Kyoto, Japan.

Liang Zhao (L)

Graduate School of Advanced Integrated Studies in Human Survivability (Shishu-Kan), Kyoto University, 606-8306 Kyoto, Japan.

Hiroshi Nagamochi (H)

Department of Applied Mathematics and Physics, Kyoto University, 606-8501 Kyoto, Japan.

Tatsuya Akutsu (T)

Bioinformatics Center, Institute for Chemical Research, Kyoto University, 611-0011 Uji, Japan.
