Vocal cord leukoplakia classification using deep learning models in white light and narrow band imaging endoscopy images.


Journal

Head & neck
ISSN: 1097-0347
Abbreviated title: Head Neck
Country: United States
NLM ID: 8902541

Publication information

Publication date:
12 2023
History:
received: 05 05 2023
revised: 15 09 2023
accepted: 29 09 2023
medline: 13 11 2023
pubmed: 14 10 2023
entrez: 14 10 2023
Status: ppublish

Abstract

BACKGROUND
Accurate classification of vocal cord leukoplakia is critical for individualized treatment and early detection of laryngeal cancer. Numerous deep learning techniques have been proposed, but it is unclear which one to select for laryngeal tasks. This article introduces existing deep learning models and evaluates them for vocal cord leukoplakia classification.
METHODS
We created white light and narrow band imaging (NBI) image datasets of vocal cord leukoplakia, with images classified into six classes: normal tissues (NT), inflammatory keratosis (IK), mild dysplasia (MiD), moderate dysplasia (MoD), severe dysplasia (SD), and squamous cell carcinoma (SCC). Classification was performed using six classical deep learning models: AlexNet, VGG, Google Inception, ResNet, DenseNet, and Vision Transformer.
RESULTS
GoogLeNet (i.e., Google Inception V1), DenseNet-121, and ResNet-152 achieved excellent classification performance. The highest overall accuracy was 0.9583 for white light images and 0.9478 for NBI images. All three networks achieved very high sensitivity, specificity, and precision.
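The abstract does not state how the reported metrics were computed. As a minimal sketch, assuming the common one-vs-rest convention for multi-class problems, overall accuracy and per-class sensitivity, specificity, and precision can be derived from a six-class confusion matrix as follows. The function names and the `toy_cm` values are hypothetical and chosen only for illustration; they are not data from the study.

```python
from typing import Dict, List

# Class labels as named in the abstract.
CLASSES = ["NT", "IK", "MiD", "MoD", "SD", "SCC"]


def overall_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all images whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total if total else 0.0


def per_class_metrics(cm: List[List[int]]) -> Dict[str, Dict[str, float]]:
    """One-vs-rest metrics; cm[i][j] counts true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics: Dict[str, Dict[str, float]] = {}
    for k in range(n):
        tp = cm[k][k]                                # class k predicted as k
        fn = sum(cm[k]) - tp                         # class k predicted as other
        fp = sum(cm[i][k] for i in range(n)) - tp    # other predicted as k
        tn = total - tp - fn - fp                    # everything else
        metrics[CLASSES[k]] = {
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
            "precision": tp / (tp + fp) if tp + fp else 0.0,
        }
    return metrics


# Invented counts (10 images per class), NOT results from the paper.
toy_cm = [
    [10, 0, 0, 0, 0, 0],
    [1, 9, 0, 0, 0, 0],
    [0, 1, 8, 1, 0, 0],
    [0, 0, 1, 9, 0, 0],
    [0, 0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0, 10],
]

if __name__ == "__main__":
    print(f"overall accuracy: {overall_accuracy(toy_cm):.4f}")
    for name, values in per_class_metrics(toy_cm).items():
        print(name, {key: round(val, 4) for key, val in values.items()})
```

Reporting the per-class values alongside overall accuracy matters when the classes are imbalanced, because a model can reach a high overall accuracy while still missing most cases of a rare grade.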
CONCLUSION
GoogLeNet, ResNet, and DenseNet can provide accurate pathological classification of vocal cord leukoplakia. Such classification facilitates early diagnosis, helps decide between conservative treatment and surgical treatment of different degrees, and reduces the burden on endoscopists.

Identifiers

pubmed: 37837264
doi: 10.1002/hed.27543

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

3129-3145

Copyright information

© 2023 Wiley Periodicals LLC.

Authors

Zhenzhen You (Z)

Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China.

Botao Han (B)

Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China.

Zhenghao Shi (Z)

Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China.

Minghua Zhao (M)

Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China.

Shuangli Du (S)

Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China.

Jing Yan (J)

Department of Otorhinolaryngology, Second Affiliated Hospital of Medical College, Xi'an Jiaotong University, Xi'an, China.

Haiqin Liu (H)

Department of Otorhinolaryngology, Second Affiliated Hospital of Medical College, Xi'an Jiaotong University, Xi'an, China.

Xinhong Hei (X)

Shaanxi Key Laboratory for Network Computing and Security Technology, School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China.

Xiaoyong Ren (X)

Department of Otorhinolaryngology, Second Affiliated Hospital of Medical College, Xi'an Jiaotong University, Xi'an, China.

Yan Yan (Y)

Department of Otorhinolaryngology, Second Affiliated Hospital of Medical College, Xi'an Jiaotong University, Xi'an, China.
