SAGES consensus recommendations on an annotation framework for surgical video.
Annotation
Artificial intelligence
Computer vision
Consensus
Minimally invasive surgery
Surgical video
Journal
Surgical endoscopy
ISSN: 1432-2218
Abbreviated title: Surg Endosc
Country: Germany
NLM ID: 8806653
Publication information
Publication date:
09 2021
History:
received: 25 04 2021
accepted: 26 05 2021
pubmed: 8 7 2021
medline: 25 2 2023
entrez: 7 7 2021
Status:
ppublish
Abstract
The growing interest in analysis of surgical video through machine learning has led to increased research efforts; however, common methods of annotating video data are lacking. There is a need to establish recommendations on the annotation of surgical video data to enable assessment of algorithms and multi-institutional collaboration. Four working groups were formed from a pool of participants that included clinicians, engineers, and data scientists. The working groups were focused on four themes: (1) temporal models, (2) actions and tasks, (3) tissue characteristics and general anatomy, and (4) software and data structure. A modified Delphi process was utilized to create a consensus survey based on suggested recommendations from each of the working groups. After three Delphi rounds, consensus was reached on recommendations for annotation within each of these domains. A hierarchy for annotation of temporal events in surgery was established. While additional work remains to achieve accepted standards for video annotation in surgery, the consensus recommendations on a general framework for annotation presented here lay the foundation for standardization. This type of framework is critical to enabling diverse datasets, performance benchmarks, and collaboration.
Identifiers
pubmed: 34231065
doi: 10.1007/s00464-021-08578-9
pii: 10.1007/s00464-021-08578-9
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
4918-4929
Copyright information
© 2021. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.