Exploring Dimensionality Reduction Techniques for Deep Learning Driven QSAR Models of Mutagenicity.

QSAR; autoencoder; cheminformatics; deep learning; dimensionality reduction; grid search; hyperparameter optimisation; locally linear embedding; mutagenicity; principal component analysis

Journal

Toxics
ISSN: 2305-6304
Abbreviated title: Toxics
Country: Switzerland
NLM ID: 101639637

Publication information

Publication date:
30 Jun 2023
History:
received: 31 May 2023
revised: 28 Jun 2023
accepted: 28 Jun 2023
medline: 28 Jul 2023
pubmed: 28 Jul 2023
entrez: 28 Jul 2023
Status: epublish

Abstract

Dimensionality reduction techniques are crucial for enabling deep learning driven quantitative structure-activity relationship (QSAR) models to navigate higher dimensional toxicological spaces; however, the choice of technique is often arbitrary and poorly explored. Six dimensionality reduction techniques (both linear and non-linear) were therefore applied to a higher dimensionality mutagenicity dataset and compared in their ability to power a simple deep learning driven QSAR model, following grid searches for optimal hyperparameter values. Comparatively simple linear techniques, such as principal component analysis (PCA), were sufficient for enabling optimal QSAR model performance, which indicated that the original dataset was at least approximately linearly separable (in accordance with Cover's theorem). However, certain non-linear techniques, such as kernel PCA and autoencoders, performed at closely comparable levels while (especially in the case of autoencoders) being more widely applicable to potentially non-linearly separable datasets. Analysis of the chemical space, in terms of XLogP and molecular weight, showed that the vast majority of testing data fell within the defined applicability domain, and that certain regions were measurably more problematic and degraded performance. It was nevertheless indicated that certain dimensionality reduction techniques were able to facilitate uniquely beneficial navigation of the chemical space.
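
As a rough illustration of the linear reduction step named in the abstract, the following minimal sketch fits PCA on a training split and projects a held-out split into the same reduced space. It is not the authors' pipeline: the record does not give their descriptors, component counts, or network, so the random binary matrix (a stand-in for molecular fingerprints), the choice of 8 components, and the helper names `pca_fit` and `pca_transform` are all assumptions made for the example.

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA via SVD. Returns feature means, the top principal axes,
    and the fraction of variance each retained axis explains."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal axes, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return mean, Vt[:n_components], explained[:n_components]


def pca_transform(X, mean, components):
    """Project data onto previously fitted principal axes."""
    return (X - mean) @ components.T


rng = np.random.default_rng(0)
# Hypothetical stand-in data: 200 "molecules" x 64 binary fingerprint bits.
X = rng.integers(0, 2, size=(200, 64)).astype(float)
X_train, X_test = X[:150], X[150:]

# Fit on training data only, then reuse the same axes for the test split,
# so that no information from the test set leaks into the projection.
mean, components, explained = pca_fit(X_train, n_components=8)
Z_train = pca_transform(X_train, mean, components)
Z_test = pca_transform(X_test, mean, components)
```

In a workflow like the one described, the reduced matrices `Z_train` and `Z_test` would then be passed to the downstream classifier, and the number of components would be one of the hyperparameters explored by the grid search.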

Identifiers

pubmed: 37505541
pii: toxics11070572
doi: 10.3390/toxics11070572
pmc: PMC10384850

Publication types

Journal Article

Languages

eng

Grants

Agency: Biotechnology and Biological Sciences Research Council
ID: BB/T008709/1
Country: United Kingdom

Authors

Alexander D Kalian (AD)

Department of Nutritional Sciences, King's College London, Franklin-Wilkins Building, 150 Stamford St., London SE1 9NH, UK.

Emilio Benfenati (E)

Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156 Milano, Italy.

Olivia J Osborne (OJ)

Food Standards Agency, 70 Petty France, London SW1H 9EX, UK.

David Gott (D)

Food Standards Agency, 70 Petty France, London SW1H 9EX, UK.

Claire Potter (C)

Food Standards Agency, 70 Petty France, London SW1H 9EX, UK.

Jean-Lou C M Dorne (JCM)

European Food Safety Authority (EFSA), Via Carlo Magno 1A, 43126 Parma, Italy.

Miao Guo (M)

Department of Engineering, King's College London, Strand Campus, Strand, London WC2R 2LS, UK.

Christer Hogstrand (C)

Department of Analytical, Environmental and Forensic Sciences, King's College London, Franklin-Wilkins Building, 150 Stamford St., London SE1 9NH, UK.

MeSH classifications