ESFPNet: Efficient Stage-Wise Feature Pyramid on Mix Transformer for Deep Learning-Based Cancer Analysis in Endoscopic Video.

autofluorescence bronchoscopy; colonoscopy; colorectal cancer; deep learning; efficient stage-wise feature pyramid; endoscopic video analysis; lesion analysis; lung cancer; mix transformer; semantic image segmentation

Journal

Journal of Imaging

ISSN: 2313-433X

Abbreviated title: J Imaging

Country: Switzerland

NLM ID: 101698819

Publication information

Publication date:
07 Aug 2024

History:

received: 20 Jun 2024

revised: 19 Jul 2024

accepted: 01 Aug 2024

medline: 28 Aug 2024

pubmed: 28 Aug 2024

entrez: 28 Aug 2024

Status: epublish

Abstract

For patients at risk of developing either lung cancer or colorectal cancer, the identification of suspect lesions in endoscopic video is an important procedure. The physician performs an endoscopic exam by navigating an endoscope through the organ of interest, be it the lungs or intestinal tract, and performs a visual inspection of the endoscopic video stream to identify lesions. Unfortunately, this entails a tedious, error-prone search over a lengthy video sequence. We propose a deep learning architecture that enables the real-time detection and segmentation of lesion regions from endoscopic video, with our experiments focused on autofluorescence bronchoscopy (AFB) for the lungs and colonoscopy for the intestinal tract. Our architecture, dubbed ESFPNet, draws on a pretrained Mix Transformer (MiT) encoder and a decoder structure that incorporates a new Efficient Stage-Wise Feature Pyramid (ESFP) to promote accurate lesion segmentation. In comparison to existing deep learning models, the ESFPNet model gave superior lesion segmentation performance for an AFB dataset. It also produced superior segmentation results for three widely used public colonoscopy databases and nearly the best results for two other public colonoscopy databases. In addition, the lightweight ESFPNet architecture requires fewer model parameters and less computation than other competing models, enabling the real-time analysis of input video frames. Overall, these studies point to the combined superior analysis performance and architectural efficiency of the ESFPNet for endoscopic video analysis. Lastly, additional experiments with the public colonoscopy databases demonstrate the learning ability and generalizability of ESFPNet, implying that the model could be effective for region segmentation in other domains.

Identifiers

DOI: 10.3390/jimaging10080191 PMID: 39194980

pii: jimaging10080191

Publication types

Journal Article

Languages

eng

Grants

Agency: National Institutes of Health - National Cancer Institute

ID: R01-CA151433

Authors

Qi Chang (Q)

Danish Ahmad (D)

Jennifer Toth (J)

Rebecca Bascom (R)

William E Higgins (WE)

MeSH classifications