Deep learning prediction of error and skill in robotic prostatectomy suturing.

Keywords: Deep learning; Errors; Robotic; Technical skill

Journal

Surgical Endoscopy

ISSN: 1432-2218

Abbreviated title: Surg Endosc

Country: Germany

NLM ID: 8806653

Publication information

Publication date:
21 Oct 2024

History:

received: 27 Jun 2024

accepted: 02 Oct 2024

medline: 22 Oct 2024

pubmed: 22 Oct 2024

entrez: 21 Oct 2024

Status: ahead of print

Abstract

BACKGROUND

Manual objective assessments of skill and errors in minimally invasive surgery have been validated, correlating with surgical expertise and patient outcomes. However, assessment and error annotation can be subjective and time-consuming, which often precludes their use. Recent years have seen the development of artificial intelligence (AI) models that work towards automating the process, with the aim of reducing errors and enabling truly objective assessment. This study aimed to validate surgical skill rating and error annotations in suturing gestures to inform the development and evaluation of AI models.

METHODS

The SAR-RARP50 open data set was blindly and independently annotated at the gesture level in Robotic-Assisted Radical Prostatectomy (RARP) suturing. Manual objective assessment tools and an error annotation methodology, Objective Clinical Human Reliability Analysis (OCHRA), were used as ground truth to train and test vision-based deep learning methods to estimate skill and errors. Analysis included descriptive statistics plus tool validity and reliability.

RESULTS

Fifty-four RARP videos (266 min) were analysed. Strong/excellent inter-rater reliability (range r = 0.70-0.89, p < 0.001) and very strong correlation (r = 0.92, p < 0.001) between objective assessment tools were demonstrated. Skill estimation of OSATS and M-GEARS achieved Spearman's correlation coefficients of 0.37 and 0.36, respectively, with normalised mean absolute errors representing prediction errors of 17.92% (inverted "accuracy" 82.08%) and 20.6% (inverted "accuracy" 79.4%), respectively. The best performing models in error prediction achieved mean absolute precision of 37.14%, area under the curve 65.10% and Macro-F1 58.97%.

CONCLUSIONS

This is the first study to employ detailed error detection methodology and deep learning models within real robotic surgical video. This benchmark evaluation of AI models sets a foundation and promising approach for future advancements in automated technical skill assessment.
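The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not say how the mean absolute error was normalised, so dividing by the range of the rating scale is an assumption here, as are all function names and the toy values used to exercise them. The only relationship taken from the abstract is that the inverted "accuracy" equals 100% minus the normalised error (100 - 17.92 = 82.08 and 100 - 20.6 = 79.4).

```python
from typing import Hashable, Sequence


def normalised_mae(y_true: Sequence[float], y_pred: Sequence[float],
                   score_min: float, score_max: float) -> float:
    """MAE as a percentage of the rating-scale range (assumed normalisation)."""
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return 100.0 * mae / (score_max - score_min)


def inverted_accuracy(nmae_percent: float) -> float:
    """Inverted 'accuracy' as reported in the abstract: 100% minus the error."""
    return 100.0 - nmae_percent


def _average_ranks(values: Sequence[float]) -> list:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's coefficient: Pearson correlation computed on the ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Under the assumed normalisation, a model that is off by 3 points on average on a hypothetical 20-point scale has a normalised error of 15% and an inverted "accuracy" of 85%. Because Spearman's coefficient depends only on rank order, it can stay low even when the normalised error looks small, which is one way to read the pairing of a correlation of about 0.37 with an error of about 18% in the results.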

Identifiers

DOI: 10.1007/s00464-024-11341-5

PMID: 39433583

pii: 10.1007/s00464-024-11341-5

Publication types

Journal Article

Languages

eng

Authors

N Sirajudeen

M Boal

D Anastasiou

J Xu

D Stoyanov

J Kelly

J W Collins

A Sridhar

E Mazomenos

N K Francis
