Positive Predictive Value Surfaces as a Complementary Tool to Assess the Performance of Virtual Screening Methods.


Journal

Mini reviews in medicinal chemistry
ISSN: 1875-5607
Abbreviated title: Mini Rev Med Chem
Country: Netherlands
NLM ID: 101094212

Publication information

Publication date:
2020
History:
received: 11 09 2019
revised: 28 10 2019
accepted: 29 10 2019
pubmed: 20 2 2020
medline: 13 7 2021
entrez: 20 2 2020
Status: ppublish

Abstract

Since their introduction in the virtual screening field, Receiver Operating Characteristic (ROC) curve-derived metrics have been widely used to benchmark computational methods and algorithms intended for virtual screening applications. Whereas in classification problems the ratio between sensitivity and specificity for a given score value is very informative, a practical concern in virtual screening campaigns is to predict the actual probability that a predicted hit will prove truly active when submitted to experimental testing (in other words, the Positive Predictive Value, PPV). Estimating this probability is, however, hindered by its dependence on the yield of actives of the screened library, which cannot be known a priori. This work explores the use of PPV surfaces derived from simulated ranking experiments (retrospective virtual screening) as a complementary tool to ROC curves, for both benchmarking and optimization of score cutoff values. The utility of the proposed approach is assessed in retrospective virtual screening experiments with four datasets used to infer QSAR classifiers: inhibitors of Trypanosoma cruzi trypanothione synthetase; inhibitors of Trypanosoma brucei N-myristoyltransferase; inhibitors of GABA transaminase; and anticonvulsant activity in the 6 Hz seizure model. Besides illustrating the utility of PPV surfaces for comparing the performance of machine learning models for virtual screening applications and for selecting an adequate score threshold, the results also suggest that ensemble learning provides models with better predictivity and more robust behavior. PPV surfaces are valuable for assessing virtual screening tools and choosing score thresholds to be applied in prospective in silico screens. Ensemble learning approaches seem to consistently lead to improved predictivity and robustness.

Identifiers

pubmed: 32072906
pii: MRMC-EPUB-105545
doi: 10.2174/1871525718666200219130229

Chemical substances

Anticonvulsants 0
Protozoan Proteins 0
4-Aminobutyrate Transaminase EC 2.6.1.19

Publication types

Journal Article; Review

Languages

eng

Citation subsets

IM

Pagination

1447-1460

Copyright information

Copyright © Bentham Science Publishers; for any queries, please email epub@benthamscience.net.

Authors

Juan F Morales (JF)

Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP) - 47 & 115, La Plata (1900), Buenos Aires, Argentina.

Sara Chuguransky (S)

Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP) - 47 & 115, La Plata (1900), Buenos Aires, Argentina.

Lucas N Alberca (LN)

Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP) - 47 & 115, La Plata (1900), Buenos Aires, Argentina.

Juan I Alice (JI)

Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP) - 47 & 115, La Plata (1900), Buenos Aires, Argentina.

Sofía Goicoechea (S)

Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP) - 47 & 115, La Plata (1900), Buenos Aires, Argentina.

María E Ruiz (ME)

Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP) - 47 & 115, La Plata (1900), Buenos Aires, Argentina.

Carolina L Bellera (CL)

Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP) - 47 & 115, La Plata (1900), Buenos Aires, Argentina.

Alan Talevi (A)

Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, University of La Plata (UNLP) - 47 & 115, La Plata (1900), Buenos Aires, Argentina.
