Positive Predictive Value Surfaces as a Complementary Tool to Assess the Performance of Virtual Screening Methods.
Benchmarking
enrichment
ensemble learning
positive predictive value
retrospective screen
virtual screening
Journal
Mini reviews in medicinal chemistry
ISSN: 1875-5607
Abbreviated title: Mini Rev Med Chem
Country: Netherlands
NLM ID: 101094212
Publication information
Publication date: 2020
History:
received: 11 September 2019
revised: 28 October 2019
accepted: 29 October 2019
pubmed: 20 February 2020
medline: 13 July 2021
entrez: 20 February 2020
Status:
ppublish
Abstract
Abstract sections
BACKGROUND
Since their introduction to the virtual screening field, metrics derived from the Receiver Operating Characteristic (ROC) curve have been widely used to benchmark computational methods and algorithms intended for virtual screening applications. In classification problems, the relationship between sensitivity and specificity at a given score value is very informative. A practical concern in virtual screening campaigns, however, is to predict the actual probability that a predicted hit will prove truly active when submitted to experimental testing (in other words, the Positive Predictive Value, PPV). Estimating this probability is hindered by its dependence on the yield of actives of the screened library, which cannot be known a priori.
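The dependence of PPV on the yield of actives can be made explicit with the standard Bayes relation between PPV, sensitivity and specificity. The abstract does not state this formula; the symbols Se (sensitivity), Sp (specificity) and Y_a (yield of actives, i.e. the fraction of true actives in the screened library) are notation assumed here for illustration.

```latex
\mathrm{PPV} = \frac{Se \cdot Y_a}{Se \cdot Y_a + (1 - Sp)\,(1 - Y_a)}
```

As a worked example under assumed values, Se = 0.80, Sp = 0.95 and Y_a = 0.01 give PPV = 0.008 / (0.008 + 0.0495) ≈ 0.14, so a classifier with good ROC behavior can still return mostly false positives when actives are rare.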
OBJECTIVE
To explore the use of PPV surfaces derived from simulated ranking experiments (retrospective virtual screening) as a complementary tool to ROC curves, for both benchmarking and optimization of score cutoff values.
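A PPV surface of this kind can be sketched as a grid of PPV values over two axes: the score cutoff and a hypothetical yield of actives. The snippet below is a minimal illustration, not the authors' implementation. It assumes that sensitivity and specificity at each cutoff are estimated from a retrospective screen and then combined through the standard Bayes relation; the function name `ppv_surface`, the toy scores and labels, and the grid of yields are all invented for the example.

```python
import numpy as np

def ppv_surface(scores, labels, yields):
    """Estimate PPV for every (score cutoff, yield of actives) pair.

    scores : model scores from a retrospective screen (higher = more likely active)
    labels : 1 for known actives, 0 for inactives/decoys
    yields : hypothetical fractions of actives in a prospective library
    Returns (thresholds, ppv) where ppv[i, j] is the PPV at thresholds[i] and yields[j].
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    yields = np.asarray(yields, dtype=float)
    thresholds = np.unique(scores)  # sorted candidate cutoffs

    # A compound is a predicted hit when its score is >= the cutoff.
    pred = scores[None, :] >= thresholds[:, None]
    se = (pred & labels).sum(axis=1) / labels.sum()        # sensitivity per cutoff
    sp = (~pred & ~labels).sum(axis=1) / (~labels).sum()   # specificity per cutoff

    # Bayes relation: PPV = Se*Ya / (Se*Ya + (1 - Sp)*(1 - Ya))
    num = se[:, None] * yields[None, :]
    den = num + (1.0 - sp)[:, None] * (1.0 - yields)[None, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        ppv = np.where(den > 0, num / den, np.nan)
    return thresholds, ppv

# Toy retrospective screen (made-up data, for illustration only).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_yields = [0.01, 0.5]
thresholds, ppv = ppv_surface(toy_scores, toy_labels, toy_yields)
```

Reading the grid along one yield column shows how PPV changes with the cutoff, which is one way a score threshold could be chosen for a prospective screen once a plausible range of yields has been assumed.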
METHODS
The utility of the proposed approach is assessed in retrospective virtual screening experiments with four datasets used to infer QSAR classifiers: inhibitors of Trypanosoma cruzi trypanothione synthetase; inhibitors of Trypanosoma brucei N-myristoyltransferase; inhibitors of GABA transaminase; and compounds with anticonvulsant activity in the 6 Hz seizure model.
RESULTS
Besides illustrating the utility of PPV surfaces to compare the performance of machine learning models for virtual screening applications and to select an adequate score threshold, our results also suggest that ensemble learning provides models with better predictivity and more robust behavior.
CONCLUSIONS
PPV surfaces are valuable for assessing virtual screening methods and for choosing the score thresholds to be applied in prospective in silico screens. Ensemble learning approaches seem to consistently lead to improved predictivity and robustness.
Identifiers
pubmed: 32072906
pii: MRMC-EPUB-105545
doi: 10.2174/1871525718666200219130229
Chemical substances
Anticonvulsants
0
Protozoan Proteins
0
4-Aminobutyrate Transaminase
EC 2.6.1.19
Publication types
Journal Article
Review
Languages
eng
Citation subsets
IM
Pagination
1447-1460
Copyright information
Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.net.