Human-computer collaboration for skin cancer recognition.
Journal
Nature medicine
ISSN: 1546-170X
Abbreviated title: Nat Med
Country: United States
NLM ID: 9502015
Publication information
Publication date:
08 2020
History:
received: 26 09 2019
accepted: 15 05 2020
pubmed: 24 6 2020
medline: 29 10 2020
entrez: 24 6 2020
Status:
ppublish
Abstract
The rapid increase in telemedicine coupled with recent advances in diagnostic artificial intelligence (AI) creates the imperative to consider the opportunities and risks of inserting AI-based support into new paradigms of care. Here we build on recent achievements in the accuracy of image-based AI for skin cancer diagnosis to address the effects of varied representations of AI-based support across different levels of clinical expertise and multiple clinical workflows. We find that good quality AI-based support of clinical decision-making improves diagnostic accuracy over that of either AI or physicians alone, and that the least experienced clinicians gain the most from AI-based support. We further find that AI-based multiclass probabilities outperformed content-based image retrieval (CBIR) representations of AI in the mobile technology environment, and AI-based support had utility in simulations of second opinions and of telemedicine triage. In addition to demonstrating the potential benefits associated with good quality AI in the hands of non-expert clinicians, we find that faulty AI can mislead the entire spectrum of clinicians, including experts. Lastly, we show that insights derived from AI class-activation maps can inform improvements in human diagnosis. Together, our approach and findings offer a framework for future studies across the spectrum of image-based diagnostics to improve human-computer collaboration in clinical practice.
Identifiers
pubmed: 32572267
doi: 10.1038/s41591-020-0942-0
pii: 10.1038/s41591-020-0942-0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1229-1234
References
Webster, P. Virtual health care in the era of COVID-19. Lancet 395, 1180–1181 (2020).
doi: 10.1016/S0140-6736(20)30818-7
He, J. et al. The practical implementation of artificial intelligence technologies in medicine. Nat. Med. 25, 30–36 (2019).
doi: 10.1038/s41591-018-0307-0
McKinney, S. M. et al. International evaluation of an AI system for breast cancer screening. Nature 577, 89–94 (2020).
doi: 10.1038/s41586-019-1799-6
Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316, 2402–2410 (2016).
doi: 10.1001/jama.2016.17216
Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017).
doi: 10.1038/nature21056
Haenssle, H. A. et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann. Oncol. 29, 1836–1842 (2018).
doi: 10.1093/annonc/mdy166
Han, S. S. et al. Classification of the clinical images for benign and malignant cutaneous tumors using a deep learning algorithm. J. Invest. Dermatol. 138, 1529–1538 (2018).
doi: 10.1016/j.jid.2018.01.028
Marchetti, M. A. et al. Results of the 2016 International Skin Imaging Collaboration International Symposium on Biomedical Imaging challenge: comparison of the accuracy of computer algorithms to dermatologists for the diagnosis of melanoma from dermoscopic images. J. Am. Acad. Dermatol. 78, 270–277 (2018).
doi: 10.1016/j.jaad.2017.08.016
Tschandl, P. et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. Lancet Oncol. 20, 938–947 (2019).
doi: 10.1016/S1470-2045(19)30333-X
Garg, A. X. et al. Effects of computerized clinical decision support systems on practitioner performance and patient outcomes: a systematic review. JAMA 293, 1223–1238 (2005).
doi: 10.1001/jama.293.10.1223
Codella, N. C. F. et al. Collaborative human–AI (CHAI): evidence-based interpretable melanoma classification in dermoscopic images. In Understanding and Interpreting Machine Learning in Medical Image Computing Applications (eds., Kenji Suzuki, Mauricio Reyes, Tanveer Syeda-Mahmood, ETH Zurich, Ben Glocker, Roland Wiest, Yaniv Gur, Hayit Greenspan, Anant Madabhushi) 97–105 (Springer International Publishing, 2018).
Bien, N. et al. Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of MRNet. PLoS Med. 15, e1002699 (2018).
doi: 10.1371/journal.pmed.1002699
Mobiny, A., Singh, A. & Van Nguyen, H. Risk-aware machine learning classifier for skin lesion diagnosis. J. Clin. Med. 8, 1241 (2019).
Han, S. S. et al. Augment intelligence dermatology: deep neural networks empower medical professionals in diagnosing skin cancer and predicting treatment options for 134 skin disorders. J. Invest. Dermatol. https://doi.org/10.1016/j.jid.2020.01.019 (2020).
Hekler, A. et al. Superior skin cancer classification by the combination of human and artificial intelligence. Eur. J. Cancer 120, 114–121 (2019).
doi: 10.1016/j.ejca.2019.07.019
Lakhani, P. & Sundaram, B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology 284, 574–582 (2017).
doi: 10.1148/radiol.2017162326
Tschandl, P., Rosendahl, C. & Kittler, H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 5, 180161 (2018).
doi: 10.1038/sdata.2018.161
Codella, N. et al. Skin lesion analysis toward melanoma detection 2018: a challenge hosted by the International Skin Imaging Collaboration (ISIC). Preprint at https://arxiv.org/abs/1902.03368 (2019).
Sadeghi, M., Chilana, P. K. & Atkins, M. S. How users perceive content-based image retrieval for identifying skin images. In Understanding and Interpreting Machine Learning in Medical Image Computing Applications (eds., Kenji Suzuki, Mauricio Reyes, Tanveer Syeda-Mahmood, ETH Zurich, Ben Glocker, Roland Wiest, Yaniv Gur, Hayit Greenspan, Anant Madabhushi) 141–148 (Springer International Publishing, 2018).
Tschandl, P., Argenziano, G., Razmara, M. & Yap, J. Diagnostic accuracy of content-based dermatoscopic image retrieval with deep classification features. Br. J. Dermatol. 181, 155–165 (2019).
Cai, C. J. et al. Human-centered tools for coping with imperfect algorithms during medical decision-making. In Proc. 2019 CHI Conference on Human Factors in Computing Systems 1–14 (Association for Computing Machinery, 2019).
Wang, M. & Deng, W. Deep visual domain adaptation: a survey. Neurocomputing 312, 135–153 (2018).
doi: 10.1016/j.neucom.2018.05.083
Finlayson, S. G. et al. Adversarial attacks on medical machine learning. Science 363, 1287–1289 (2019).
doi: 10.1126/science.aaw4399
Navarrete-Dechent, C. et al. Automated dermatological diagnosis: hype or reality? J. Invest. Dermatol. 138, 2277–2279 (2018).
doi: 10.1016/j.jid.2018.04.040
Winkler, J. K. et al. Association between surgical skin markings in dermoscopic images and diagnostic performance of a deep learning convolutional neural network for melanoma recognition. JAMA Dermatol. 155, 1135–1141 (2019).
doi: 10.1001/jamadermatol.2019.1735
Cai, C. J., Winter, S., Steiner, D., Wilcox, L. & Terry, M. ‘Hello AI’: uncovering the onboarding needs of medical practitioners for human–AI collaborative decision-making. In Proc. ACM on Human–Computer Interaction (Association for Computing Machinery, 2019).
Janda, M. et al. Accuracy of mobile digital teledermoscopy for skin self-examinations in adults at high risk of skin cancer: an open-label, randomised controlled trial. Lancet Digit. Health 2, e129–e137 (2020).
doi: 10.1016/S2589-7500(20)30001-7
Gessert, N., Nielsen, M., Shaikh, M., Werner, R. & Schlaefer, A. Skin lesion classification using ensembles of multi-resolution EfficientNets with meta data. MethodsX 7, 100864 (2020).
doi: 10.1016/j.mex.2020.100864
Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. 128, 336–359 (2020).
Li, X., Wu, J., Chen, E. Z. & Jiang, H. From deep learning towards finding skin lesion biomarkers. Conf. Proc. IEEE Eng. Med. Biol. Soc. 2019, 2797–2800 (2019).
pubmed: 31946474
Bissoto, A., Fornaciali, M., Valle, E. & Avila, S. (De)constructing bias on skin lesion datasets. Preprint at https://arxiv.org/abs/1904.08818 (2019).
Lapuschkin, S. et al. Unmasking Clever Hans predictors and assessing what machines really learn. Nat. Commun. 10, 1096 (2019).
doi: 10.1038/s41467-019-08987-4
Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K. & Müller, K.-R. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning (Springer Nature, 2019).
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (eds. Wallach, H. et al.) 8026–8037 (Curran Associates, 2019).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
Russakovsky, O. et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115, 211–252 (2015).
doi: 10.1007/s11263-015-0816-y
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference for Learning Representations (eds., Bengio, Y., LeCun, Y.) (2015).
Barata, C., Celebi, M. E. & Marques, J. S. Improving dermoscopy image classification using color constancy. IEEE J. Biomed. Health Inform. 19, 1146–1152 (2015).
pubmed: 25073179
Rinner, C., Kittler, H., Rosendahl, C. & Tschandl, P. Analysis of collective human intelligence for diagnosis of pigmented skin lesions harnessed by gamification via a web-based training platform: simulation reader study. J. Med. Internet Res. 22, e15597 (2020).
doi: 10.2196/15597
Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Statist. 6, 65–70 (1979).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, 2016).