Dermatologist versus artificial intelligence confidence in dermoscopy diagnosis: Complementary information that may affect decision-making.

MeSH terms: Humans; Artificial Intelligence; Dermatologists; Dermoscopy / methods; Melanoma / diagnostic imaging; Skin Neoplasms / diagnostic imaging; Skin Diseases / diagnostic imaging

Keywords: computer vision; deep learning; neural networks; skin lesion classification; uncertainty

Journal

Experimental dermatology

ISSN: 1600-0625

Abbreviated title: Exp Dermatol

Country: Denmark

NLM ID: 9301549

Publication information

Publication date:
10 2023

History:

revised: 04 07 2023

received: 21 10 2022

accepted: 13 07 2023

medline: 12 10 2023

pubmed: 3 8 2023

entrez: 3 8 2023

Status: ppublish

Abstract

In dermatology, deep learning may be applied for skin lesion classification. However, for a given input image, a neural network outputs only a label, derived from class probabilities that do not model uncertainty. Our group developed a novel method to quantify uncertainty in stochastic neural networks. In this study, we aimed to train such a network for skin lesion classification, evaluate its diagnostic performance and uncertainty, and compare the results with the assessments of a group of dermatologists. By passing duplicates of an image through such a stochastic neural network, we obtained a distribution per class rather than a single probability value. We interpreted the overlap between these distributions as the output uncertainty, where a high overlap indicated high uncertainty, and vice versa. We had 29 dermatologists diagnose a series of skin lesions and rate their confidence. We compared these results with those of the network. The network achieved a sensitivity and specificity of 50% and 88%, respectively, comparable to the average dermatologist (68% and 73%, respectively). Higher confidence/less uncertainty was associated with better diagnostic performance both in the neural network and in dermatologists. We found no correlation between the uncertainty of the neural network and the confidence of dermatologists (R = -0.06, p = 0.77). Dermatologists should not blindly trust the output of a neural network, especially when its uncertainty is high. The addition of an uncertainty score may stimulate human-computer interaction.
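The sampling-and-overlap idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how stochasticity is introduced or which overlap measure is used, so the sketch assumes dropout-style random masking and a Bhattacharyya coefficient computed on histograms. The toy linear "network" and the names `stochastic_forward`, `bhattacharyya_overlap`, and `predict_with_uncertainty` are hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def stochastic_forward(features, weights, n_passes, drop_rate, rng):
    """Toy stand-in for a stochastic network (assumption: dropout-style masking).

    The same input is duplicated n_passes times; each copy gets its own random
    mask, so each pass yields a different vector of class probabilities.
    """
    batch = np.repeat(features[None, :], n_passes, axis=0)
    mask = rng.random(batch.shape) >= drop_rate
    dropped = batch * mask / (1.0 - drop_rate)
    return softmax(dropped @ weights)  # shape: (n_passes, n_classes)


def bhattacharyya_overlap(a, b, bins=20):
    """Overlap in [0, 1] between two samples of probabilities.

    Both samples are binned on the same edges over [0, 1]; the result is the
    Bhattacharyya coefficient of the two normalized histograms.
    """
    edges = np.linspace(0.0, 1.0, bins + 1)
    p, _ = np.histogram(a, bins=edges)
    q, _ = np.histogram(b, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))


def predict_with_uncertainty(probs):
    """Return (label, uncertainty) from per-pass class probabilities.

    Assumption: uncertainty is the overlap between the distributions of the
    two classes with the highest mean probability.
    """
    mean = probs.mean(axis=0)
    order = np.argsort(mean)[::-1]
    top, runner_up = order[0], order[1]
    return int(top), bhattacharyya_overlap(probs[:, top], probs[:, runner_up])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=8)      # hypothetical image features
    weights = rng.normal(size=(8, 3))  # hypothetical 3-class linear layer
    probs = stochastic_forward(features, weights, n_passes=200, drop_rate=0.5, rng=rng)
    label, uncertainty = predict_with_uncertainty(probs)
    print(f"label={label}, uncertainty={uncertainty:.2f}")
```

An overlap near 1 means the two leading classes are hard to separate across passes (high uncertainty), while an overlap near 0 means the predicted class clearly dominates.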

Identifiers

DOI: 10.1111/exd.14892

PMID: 37534916

Publication types

Comparative Study; Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

1744-1751

Authors

Pieter Van Molle (P)

Sofie Mylle (S)

Tim Verbelen (T)

Cedric De Boom (C)

Bert Vankeirsbilck (B)

Evelien Verhaeghe (E)

Bart Dhoedt (B)

Lieve Brochez (L)
