Assessing the calibration in toxicological in vitro models with conformal prediction.
Applicability domain
Calibration plots
Conformal prediction
Data drifts
Tox21 datasets
Toxicity prediction
Journal
Journal of cheminformatics
ISSN: 1758-2946
Abbreviated title: J Cheminform
Country: England
NLM ID: 101516718
Publication information
Publication date: 29 Apr 2021
History:
received: 07 Feb 2021
accepted: 10 Apr 2021
entrez: 30 Apr 2021
pubmed: 1 May 2021
medline: 1 May 2021
Status: epublish
Abstract
Machine learning methods are widely used in drug discovery and toxicity prediction. Although they generally perform well in cross-validation studies, their predictive power often drops when the query samples have drifted from the descriptor space of the training data. Thus, the assumption underlying the application of machine learning algorithms, that training and test data stem from the same distribution, might not always be fulfilled. In this work, conformal prediction is used to assess the calibration of the models. Deviations from the expected error may indicate that training and test data originate from different distributions. Using the Tox21 datasets as an example, which are composed of the chronologically released Tox21Train, Tox21Test and Tox21Score subsets, we observed that while internally valid models could be trained using cross-validation on Tox21Train, predictions on the external Tox21Score data resulted in higher error rates than expected. To improve the predictions on the external sets, we successfully introduced a strategy that exchanges the calibration set with more recent data, such as Tox21Test. We conclude that conformal prediction can be used to diagnose data drifts and other issues related to model calibration. The proposed improvement strategy, exchanging only the calibration data, is convenient because it does not require retraining the underlying model.
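The calibration check and the calibration-set exchange described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional synthetic data, the frozen `predict_proba` model, the `sample` helper, and the significance level of 0.2 are all assumptions chosen for illustration, and no Tox21 data is used. The sketch computes class-conditional (Mondrian) conformal p-values from a calibration set, measures the error rate on data from the original and from a drifted distribution, and then repeats the measurement after exchanging only the calibration set.

```python
import numpy as np

def predict_proba(x):
    # Hypothetical frozen model (stands in for a trained classifier):
    # probability of class 1 is a sigmoid of the single descriptor x.
    p1 = 1.0 / (1.0 + np.exp(-x))
    return np.column_stack([1.0 - p1, p1])

def sample(rng, n, mu):
    # Synthetic two-class data: class 0 centred at -mu, class 1 at +mu.
    # A smaller mu means more class overlap (our stand-in for "drift").
    y = rng.integers(0, 2, size=n)
    x = rng.normal(loc=np.where(y == 1, mu, -mu), scale=1.0)
    return x, y

def p_values(cal_proba, cal_labels, test_proba):
    # Mondrian inductive conformal p-values, one column per class.
    # Nonconformity score: 1 - predicted probability of the class.
    pvals = np.zeros_like(test_proba, dtype=float)
    for c in range(test_proba.shape[1]):
        alpha_cal = 1.0 - cal_proba[cal_labels == c, c]
        alpha_test = 1.0 - test_proba[:, c]
        # Count calibration scores at least as nonconforming as each test score.
        ge = (alpha_cal[None, :] >= alpha_test[:, None]).sum(axis=1)
        pvals[:, c] = (ge + 1) / (len(alpha_cal) + 1)
    return pvals

def error_rate(pvals, labels, significance):
    # An error occurs when the true label is excluded from the prediction set,
    # i.e. its p-value does not exceed the significance level.
    true_p = pvals[np.arange(len(labels)), labels]
    return float(np.mean(true_p <= significance))

rng = np.random.default_rng(0)
significance = 0.2  # expected error rate for a well-calibrated model

x_cal_old, y_cal_old = sample(rng, 2000, mu=1.0)  # original distribution
x_test_old, y_test_old = sample(rng, 2000, mu=1.0)
x_cal_new, y_cal_new = sample(rng, 2000, mu=0.3)  # more recent, drifted data
x_test_new, y_test_new = sample(rng, 2000, mu=0.3)

def evaluate(x_cal, y_cal, x_test, y_test):
    pv = p_values(predict_proba(x_cal), y_cal, predict_proba(x_test))
    return error_rate(pv, y_test, significance)

err_internal = evaluate(x_cal_old, y_cal_old, x_test_old, y_test_old)
err_drift = evaluate(x_cal_old, y_cal_old, x_test_new, y_test_new)
err_recal = evaluate(x_cal_new, y_cal_new, x_test_new, y_test_new)

print(f"old calibration, old test data:     {err_internal:.3f}")
print(f"old calibration, drifted test data: {err_drift:.3f}")
print(f"new calibration, drifted test data: {err_recal:.3f}")
```

In this toy setting, an error rate well above the chosen significance level signals that calibration and test data are no longer exchangeable. Exchanging the calibration set restores the expected error rate while `predict_proba` stays untouched, which mirrors the point that no retraining is required. Restoring validity this way does not by itself make the prediction sets more informative, since the underlying model is unchanged.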
Identifiers
pubmed: 33926567
doi: 10.1186/s13321-021-00511-5
pii: 10.1186/s13321-021-00511-5
pmc: PMC8082859
Publication types
Journal Article
Languages
eng
Pagination
35
Grants
Organization: Bundesministerium für Bildung und Forschung
ID: 031A262C
Organization: Alzheimer's Research UK
ID: 560832
Organization: Svenska Forskningsrådet Formas
ID: 2018-00924
Organization: Vetenskapsrådet
ID: 2020-03731
Organization: Vetenskapsrådet
ID: 2020-01865
Organization: Stiftelsen för Strategisk Forskning
ID: grant BD150008