Vocal cord leukoplakia classification using deep learning models in white light and narrow band imaging endoscopy images.
Keywords
NBI images
classification
deep learning
vocal cord leukoplakia
white light images
Journal
Head & Neck
ISSN: 1097-0347
Abbreviated title: Head Neck
Country: United States
NLM ID: 8902541
Publication information
Publication date: December 2023
History:
Received: 2023-05-05
Revised: 2023-09-15
Accepted: 2023-09-29
PubMed: 2023-10-14
Entrez: 2023-10-14
MEDLINE: 2023-11-13
Status: ppublish
Abstract
BACKGROUND
Accurate classification of vocal cord leukoplakia is critical for the individualized treatment and early detection of laryngeal cancer. Numerous deep learning techniques have been proposed, but it is unclear which to select for laryngeal tasks. This article introduces and reliably evaluates existing deep learning models for vocal cord leukoplakia classification.
METHODS
We created white light and narrow band imaging (NBI) image datasets of vocal cord leukoplakia, classified into six classes: normal tissues (NT), inflammatory keratosis (IK), mild dysplasia (MiD), moderate dysplasia (MoD), severe dysplasia (SD), and squamous cell carcinoma (SCC). Classification was performed using six classical deep learning models: AlexNet, VGG, Google Inception, ResNet, DenseNet, and Vision Transformer.
RESULTS
GoogLeNet (i.e., Google Inception V1), DenseNet-121, and ResNet-152 achieved excellent classification performance. The highest overall accuracy was 0.9583 on white light images and 0.9478 on NBI images. All three networks also provided very high sensitivity, specificity, and precision.
CONCLUSION
GoogLeNet, ResNet, and DenseNet can provide accurate pathological classification of vocal cord leukoplakia. This facilitates early diagnosis, informs the choice between conservative and surgical treatment for lesions of different grades, and reduces the burden on endoscopists.
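The sensitivity, specificity, and precision figures reported above are per-class, one-vs-rest quantities. A minimal sketch of how they follow from a multi-class confusion matrix (the matrix below is illustrative, not the paper's data):

```python
def per_class_metrics(cm):
    """One-vs-rest sensitivity, specificity, and precision for each
    class of a square confusion matrix cm, where cm[i][j] counts
    samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = {}
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class-k samples missed
        fp = sum(cm[i][k] for i in range(n)) - tp  # other classes called k
        tn = total - tp - fn - fp
        metrics[k] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "precision": tp / (tp + fp),
        }
    return metrics

# Illustrative 3-class example (rows = true class, columns = predicted).
cm = [[8, 1, 1],
      [2, 7, 1],
      [0, 1, 9]]
m = per_class_metrics(cm)
accuracy = sum(cm[k][k] for k in range(3)) / 30  # overall accuracy
```

Overall accuracy is the trace of the matrix over the total count; for the example above it is 24/30 = 0.8.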
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
3129-3145
Copyright information
© 2023 Wiley Periodicals LLC.