Dermatologist versus artificial intelligence confidence in dermoscopy diagnosis: Complementary information that may affect decision-making.


Journal

Experimental dermatology
ISSN: 1600-0625
Abbreviated title: Exp Dermatol
Country: Denmark
NLM ID: 9301549

Publication information

Publication date:
2023-10
History:
received: 2022-10-21
revised: 2023-07-04
accepted: 2023-07-13
pubmed: 2023-08-03
entrez: 2023-08-03
medline: 2023-10-12
Status: ppublish

Abstract

In dermatology, deep learning may be applied for skin lesion classification. However, for a given input image, a neural network only outputs a label, obtained using the class probabilities, which do not model uncertainty. Our group developed a novel method to quantify uncertainty in stochastic neural networks. In this study, we aimed to train such a network for skin lesion classification, evaluate its diagnostic performance and uncertainty, and compare the results to the assessments of a group of dermatologists. By passing duplicates of an image through such a stochastic neural network, we obtained distributions per class, rather than a single probability value. We interpreted the overlap between these distributions as the output uncertainty, where a high overlap indicated a high uncertainty, and vice versa. We had 29 dermatologists diagnose a series of skin lesions and rate their confidence. We compared these results to those of the network. The network achieved a sensitivity and specificity of 50% and 88%, respectively, comparable to the average dermatologist (68% and 73%, respectively). Higher confidence (or lower uncertainty) was associated with better diagnostic performance both in the neural network and in dermatologists. We found no correlation between the uncertainty of the neural network and the confidence of dermatologists (R = -0.06, p = 0.77). Dermatologists should not blindly trust the output of a neural network, especially when its uncertainty is high. The addition of an uncertainty score may stimulate human-computer interaction.
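
As an illustration of the overlap idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the repeated stochastic passes have already been collected into an array of per-class probabilities, and it uses a histogram-based Bhattacharyya coefficient as one possible overlap measure. The function name `overlap_uncertainty`, the bin count, the choice to compare only the two highest-ranked classes, and the simulated samples are all assumptions made for this example.

```python
import numpy as np


def overlap_uncertainty(samples, bins=50):
    """Overlap between the per-class distributions of the two top classes.

    samples: array of shape (T, C) holding class probabilities from T
             stochastic forward passes over the same image (assumed input).
    Returns a value in [0, 1]; higher overlap is read as higher uncertainty.
    """
    samples = np.clip(np.asarray(samples, dtype=float), 0.0, 1.0)
    if samples.ndim != 2 or samples.shape[1] < 2:
        raise ValueError("samples must have shape (T, C) with C >= 2")

    # Rank classes by their mean probability across the stochastic passes.
    mean_probs = samples.mean(axis=0)
    top1, top2 = np.argsort(mean_probs)[::-1][:2]

    # Histogram both classes on a shared grid over [0, 1].
    edges = np.linspace(0.0, 1.0, bins + 1)
    p, _ = np.histogram(samples[:, top1], bins=edges)
    q, _ = np.histogram(samples[:, top2], bins=edges)
    p = p / p.sum()
    q = q / q.sum()

    # Bhattacharyya coefficient of the two discrete distributions.
    return float(np.sum(np.sqrt(p * q)))


# Hypothetical data standing in for real network outputs (two classes).
rng = np.random.default_rng(0)
first = np.clip(rng.normal(0.55, 0.10, size=200), 0.0, 1.0)
ambiguous = np.column_stack([first, 1.0 - first])
print("overlap for an ambiguous case:", overlap_uncertainty(ambiguous))
```

In this sketch, a result near 0 means the two class distributions barely overlap, and a result near 1 means they are nearly indistinguishable. The histogram resolution affects the value, so `bins` would need tuning for any real use.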

Identifiers

pubmed: 37534916
doi: 10.1111/exd.14892

Publication types

Comparative Study; Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1744-1751

Copyright information

© 2023 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

Authors

Pieter Van Molle (P)

IDLab, Department of Information Technology, Ghent University-IMEC, Ghent, Belgium.

Sofie Mylle (S)

Department of Dermatology, Ghent University Hospital, Ghent, Belgium.
Cancer Research Institute Ghent (CRIG), Ghent, Belgium.

Tim Verbelen (T)

IDLab, Department of Information Technology, Ghent University-IMEC, Ghent, Belgium.

Cedric De Boom (C)

IDLab, Department of Information Technology, Ghent University-IMEC, Ghent, Belgium.

Bert Vankeirsbilck (B)

IDLab, Department of Information Technology, Ghent University-IMEC, Ghent, Belgium.

Evelien Verhaeghe (E)

Department of Dermatology, Ghent University Hospital, Ghent, Belgium.

Bart Dhoedt (B)

IDLab, Department of Information Technology, Ghent University-IMEC, Ghent, Belgium.

Lieve Brochez (L)

Department of Dermatology, Ghent University Hospital, Ghent, Belgium.
Cancer Research Institute Ghent (CRIG), Ghent, Belgium.
