Diagnostic performance of augmented intelligence with 2D and 3D total body photography and convolutional neural networks in a high-risk population for melanoma under real-world conditions: A new era of skin cancer screening?
Artificial intelligence
Convolutional neural network
Deep learning
Melanoma
Photography augmented intelligence
Pigmented naevus
Skin neoplasm
Three-dimensional (3D)
Total body photography
Two-dimensional (2D)
Journal
European journal of cancer (Oxford, England : 1990)
ISSN: 1879-0852
Abbreviated title: Eur J Cancer
Country: England
NLM ID: 9005373
Publication information
Publication date:
09 2023
History:
received: 20 02 2023
revised: 13 06 2023
accepted: 17 06 2023
medline: 8 8 2023
pubmed: 16 7 2023
entrez: 15 7 2023
Status:
ppublish
Abstract
BACKGROUND
Convolutional neural networks (CNNs) have outperformed dermatologists in classifying pigmented skin lesions under artificial conditions. We investigated, for the first time, the performance of three-dimensional (3D) and two-dimensional (2D) CNNs and dermatologists in the early detection of melanoma in a real-world setting.
METHODS
In this prospective study, 1690 melanocytic lesions in 143 patients with high-risk criteria for melanoma were evaluated by dermatologists, 2D-FotoFinder-ATBM and 3D-Vectra WB360 total body photography (TBP). Excision was based on the dermatologists' dichotomous decision, an elevated CNN risk score (study-specific malignancy cut-off: FotoFinder >0.5, Vectra >5.0) and/or the second dermatologist's assessment with CNN support. The diagnostic accuracy of the 2D and 3D CNN classification was compared with that of the dermatologists and the augmented intelligence based on histopathology and dermatologists' assessment. Secondary end-points included reproducibility of risk scores and naevus counts per patient by medical staff (gold standard) compared to automated 3D and 2D TBP CNN counts.
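The excision rule described above can be sketched as a simple threshold check. This is a minimal illustration only: the cut-offs (FotoFinder >0.5, Vectra >5.0) are those quoted in the abstract, while the `LesionAssessment` record, its field names, and the plain logical-OR combination of criteria are assumptions made for the example, not the study's actual protocol or software.

```python
from dataclasses import dataclass

# Study-specific malignancy cut-offs quoted in the abstract.
FOTOFINDER_CUTOFF = 0.5  # 2D-CNN risk score
VECTRA_CUTOFF = 5.0      # 3D-CNN risk score


@dataclass
class LesionAssessment:
    """Hypothetical per-lesion record; field names are illustrative."""
    dermatologist_suspicious: bool      # first dermatologist's dichotomous call
    fotofinder_score: float             # 2D-CNN risk score
    vectra_score: float                 # 3D-CNN risk score
    second_reader_suspicious: bool      # second dermatologist, with CNN support


def flag_for_excision(a: LesionAssessment) -> bool:
    """Assumed reading of "and/or": any single criterion triggers excision."""
    return (
        a.dermatologist_suspicious
        or a.fotofinder_score > FOTOFINDER_CUTOFF
        or a.vectra_score > VECTRA_CUTOFF
        or a.second_reader_suspicious
    )


if __name__ == "__main__":
    # Only the 3D-CNN score exceeds its cut-off in this made-up example.
    lesion = LesionAssessment(False, 0.2, 5.1, False)
    print(flag_for_excision(lesion))
```

Both cut-offs are treated as strict inequalities, matching the ">" notation in the abstract, so a score exactly at the cut-off does not trigger excision in this sketch.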
RESULTS
For risk-score assessments compared with histopathology, the 3D-CNN achieved a sensitivity of 90.0%, a specificity of 64.6% and a receiver operating characteristic area under the curve (ROC-AUC) of 0.92 (95% confidence interval [CI] 0.85-1.00). Dermatologists and augmented intelligence achieved the same sensitivity as the 3D-CNN (90%) and a comparable ROC-AUC (0.91 [CI 0.80-1.00] and 0.88 [CI 0.77-1.00], respectively), but their specificity was superior (92.3% and 86.2%, respectively). The 2D-CNN (sensitivity: 70%, specificity: 40%, ROC-AUC: 0.68 [CI 0.46-0.90]) was outperformed by both the 3D-CNN and the dermatologists. The 3D-CNN showed a higher correlation coefficient for repeated measurements of 246 lesions (R = 0.89) than the 2D-CNN (R = 0.79). The mean naevus count per patient varied significantly between methods (gold standard: 210 lesions; 3D-CNN: 469; 2D-CNN: 1324; p < 0.0001).
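As a reminder of how the sensitivity and specificity figures above are defined, the following minimal sketch computes both from confusion-matrix counts. The counts used here are hypothetical values chosen only because they reproduce the percentages reported for the 3D-CNN; the abstract does not state the underlying numbers of melanomas or benign lesions.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of histologically malignant lesions flagged as malignant."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of histologically benign lesions classified as benign."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts (NOT from the study), consistent with 90.0% / 64.6%.
    tp, fn = 9, 1
    tn, fp = 42, 23
    print(f"sensitivity: {100 * sensitivity(tp, fn):.1f}%")  # sensitivity: 90.0%
    print(f"specificity: {100 * specificity(tn, fp):.1f}%")  # specificity: 64.6%
```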
CONCLUSIONS
Our study emphasises the importance of validating CNN classification under real-life conditions. The novel 3D-CNN device outperformed the 2D-CNN and achieved sensitivity comparable to that of dermatologists. The low specificity of CNNs and the lack of reliable automated counting of naevi in TBP currently limit the use of augmented intelligence in clinical practice.
Identifiers
pubmed: 37453242
pii: S0959-8049(23)00306-4
doi: 10.1016/j.ejca.2023.112954
Databases
ClinicalTrials.gov
NCT04605822
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
112954
Copyright information
Copyright © 2023 The Author(s). Published by Elsevier Ltd. All rights reserved.
Conflict of interest statement
Declaration of Competing Interest: The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: SEC has no conflict of interest. PC has no conflict of interest. LK has received speaking fees for a presentation sponsored by Boehringer Ingelheim. SH has no conflict of interest. MK has received speaking fees from Almirall and Sanofi outside of the current work. J-TM has served as advisor and/or received speaking fees and/or participated in clinical trials sponsored by AbbVie, Almirall, Amgen, BMS, Celgene, Eli Lilly, LEO Pharma, Janssen-Cilag, MSD, Novartis, Pfizer, Pierre Fabre, Roche, Sanofi, and UCB. JSB has no conflict of interest. CFD has no conflict of interest. AG has no conflict of interest. CJ has no conflict of interest. LMS has no conflict of interest. JKP has no conflict of interest. ML has received research funding unrelated to the manuscript from Roche, Novartis, Molecular Partners, and Oncobit. AAN declares being a consultant and advisor and/or receiving speaking fees and/or grants and/or serving as an investigator in clinical trials for AbbVie, Almirall, Amgen, Biomed, BMS, Boehringer Ingelheim, Celgene, Eli Lilly, Galderma, GSK, LEO Pharma, Janssen-Cilag, MSD, Novartis, Pfizer, Pierre Fabre Pharma, Regeneron, Sandoz, Sanofi, and UCB. LVM has served as advisor and/or received speaking fees and/or participated in clinical trials sponsored by Almirall, Amgen, Eli Lilly, MSD, Novartis, Pierre Fabre, Roche, and Sanofi outside of the current work.