GlottisNetV2: Temporal Glottal Midline Detection Using Deep Convolutional Neural Networks.
Laryngeal endoscopy
biomedical imaging
deep learning
deep neural networks
glottis
midline
Journal
IEEE Journal of Translational Engineering in Health and Medicine
ISSN: 2168-2372
Abbreviated title: IEEE J Transl Eng Health Med
Country: United States
NLM ID: 101623153
Publication information
Publication date: 2023
History:
received: 2022-08-01
revised: 2022-11-27
accepted: 2023-01-04
entrez: 2023-02-23
pubmed: 2023-02-24
medline: 2023-03-03
Status: epublish
Abstract
High-speed videoendoscopy is a major tool in quantitative laryngology. Glottis segmentation and glottal midline detection are crucial for computing vocal fold-specific quantitative parameters; however, fully automated solutions show limited clinical applicability, and unbiased glottal midline detection in particular remains a challenging problem. We developed a multitask deep neural network for glottis segmentation and glottal midline detection, using techniques from pose estimation to locate the anterior and posterior points in endoscopy images. Neural networks were set up in TensorFlow/Keras and trained and evaluated on the BAGLS dataset. We found that a dual-decoder deep neural network, termed GlottisNetV2, outperforms the previously proposed GlottisNet in terms of MAPE on the test dataset (1.85% vs. 6.3%) while converging faster. Various hyperparameter tunings allow fast and directed training. Using temporally variant data from an additional dataset designed for this task, the median prediction error improves from 2.1% to 1.76% when using 12 consecutive frames and additional temporal filtering. We found that temporal glottal midline detection using a dual-decoder architecture together with keypoint estimation allows accurate midline prediction. We show that our proposed architecture yields stable and reliable glottal midline predictions, ready for clinical use and for the analysis of symmetry measures.
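As a hedged illustration (not the authors' code), the temporal filtering described in the abstract — pooling per-frame midline keypoint predictions over a window of consecutive frames — can be sketched in plain Python. The window size of 12 matches the abstract; the function and variable names are hypothetical, and a median filter is one plausible choice of temporal smoother.

```python
from statistics import median

def smooth_midline(points, window=12):
    """Median-filter per-frame midline keypoints over a sliding window.

    points: list of (x, y) tuples, one predicted anterior (or posterior)
    keypoint per video frame. Returns a list of smoothed keypoints of the
    same length; near the sequence edges the window is truncated.
    """
    smoothed = []
    half = window // 2
    for i in range(len(points)):
        lo, hi = max(0, i - half), min(len(points), i + half + 1)
        xs = [p[0] for p in points[lo:hi]]
        ys = [p[1] for p in points[lo:hi]]
        smoothed.append((median(xs), median(ys)))
    return smoothed

# A single outlier frame is suppressed by the temporal median:
track = [(100.0, 50.0)] * 6 + [(140.0, 90.0)] + [(100.0, 50.0)] * 6
print(smooth_midline(track)[6])  # -> (100.0, 50.0)
```

A median (rather than a mean) is used here because it discards isolated mispredictions — e.g. frames where the glottis is nearly closed and the keypoints are ambiguous — without blurring the stable estimates around them.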
Identifiers
pubmed: 36816097
doi: 10.1109/JTEHM.2023.3237859
pmc: PMC9933989
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
137-144