Surgical Gesture Recognition in Laparoscopic Tasks Based on the Transformer Network and Self-Supervised Learning.

Keywords: CNN; artificial intelligence; laparoscopic surgery; machine learning; self-supervision; surgical action recognition; transformer

Journal

Bioengineering (Basel, Switzerland)
ISSN: 2306-5354
Abbreviated title: Bioengineering (Basel)
Country: Switzerland
NLM ID: 101676056

Publication information

Publication date:
29 Nov 2022
History:
received: 09 10 2022
revised: 07 11 2022
accepted: 25 11 2022
entrez: 23 12 2022
pubmed: 24 12 2022
medline: 24 12 2022
Status: epublish

Abstract

In this study, we propose a deep learning framework and a self-supervision scheme for video-based surgical gesture recognition. The proposed framework is modular. First, a 3D convolutional network extracts feature vectors from video clips to encode spatial and short-term temporal features. Second, the feature vectors are fed into a transformer network to capture long-term temporal dependencies. Two main models are proposed, based on the backbone framework: C3DTrans (supervised) and SSC3DTrans (self-supervised). The dataset consisted of 80 videos from two basic laparoscopic tasks: peg transfer (PT) and knot tying (KT). To examine the potential of self-supervision, the models were trained on 60% and 100% of the annotated dataset. In addition, the best-performing model was evaluated on the JIGSAWS robotic surgery dataset. The best model (C3DTrans) achieved accuracies of 88.0% and 95.2% (clip level) and 97.5% and 97.9% (gesture level) for PT and KT, respectively. SSC3DTrans performed similarly to C3DTrans when trained on 60% of the annotated dataset (about 84% and 93% clip-level accuracies for PT and KT, respectively). On JIGSAWS, C3DTrans reached close to 76% accuracy, which was similar to or higher than prior techniques based on a single video stream, no additional video training, and online processing.
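The abstract reports accuracy at two granularities, clip level and gesture level, without defining how clip predictions are aggregated. The sketch below illustrates one plausible reading and is not taken from the paper: it assumes a gesture segment is a maximal run of consecutive clips sharing the same annotated label, and that the segment's prediction is the majority vote of its clip predictions. The function names and the gesture labels ("reach", "grasp", "transfer") are hypothetical.

```python
from collections import Counter
from itertools import groupby


def clip_accuracy(true_labels, pred_labels):
    """Fraction of clips whose predicted gesture matches the annotation."""
    if not true_labels or len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(true_labels, pred_labels))
    return correct / len(true_labels)


def gesture_accuracy(true_labels, pred_labels):
    """Fraction of gesture segments whose majority-vote prediction is correct.

    Assumption (not from the paper): a segment is a maximal run of
    consecutive clips with the same annotated label.
    """
    if not true_labels or len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must be non-empty and equally long")
    hits = 0
    segments = 0
    start = 0
    for label, run in groupby(true_labels):
        length = len(list(run))
        # Majority vote over the clip predictions inside this segment.
        votes = Counter(pred_labels[start:start + length])
        majority_label = votes.most_common(1)[0][0]
        hits += int(majority_label == label)
        segments += 1
        start += length
    return hits / segments


# Hypothetical per-clip annotations and predictions for one video.
true = ["reach", "reach", "reach", "grasp", "grasp",
        "transfer", "transfer", "transfer"]
pred = ["reach", "grasp", "reach", "grasp", "grasp",
        "transfer", "reach", "transfer"]

print(clip_accuracy(true, pred))     # 6 of 8 clips correct
print(gesture_accuracy(true, pred))  # all 3 segments correct by majority vote
```

Under this assumed aggregation, isolated clip errors inside a segment are outvoted, which is one way gesture-level accuracy can exceed clip-level accuracy as in the figures reported above.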

Identifiers

pubmed: 36550943
pii: bioengineering9120737
doi: 10.3390/bioengineering9120737
pmc: PMC9774918

Publication types

Journal Article

Languages

eng

Authors

Athanasios Gazis (A)

Laboratory of Medical Physics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece.

Pantelis Karaiskos (P)

Laboratory of Medical Physics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece.

Constantinos Loukas (C)

Laboratory of Medical Physics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece.
