Beyond rankings: Learning (more) from algorithm validation.

Humans; Algorithms; Image Processing, Computer-Assisted / methods; Laparoscopy

Artificial intelligence; Biomedical image analysis challenges; Deep learning; Endoscopic vision; Generalized linear mixed models; Grand challenges; Image characteristics driven algorithm development; Instrument segmentation; Minimally invasive surgery; Surgical data science

Journal

Medical image analysis

ISSN: 1361-8423

Abbreviated title: Med Image Anal

Country: Netherlands

NLM ID: 9713490

Publication information

Publication date:
05 2023

History:

received: 17 06 2021

revised: 24 05 2022

accepted: 08 02 2023

medline: 21 4 2023

pubmed: 26 3 2023

entrez: 25 3 2023

Status: ppublish

Abstract

Challenges have become the state-of-the-art approach to benchmarking image analysis algorithms in a comparative manner. While validation on identical data sets was a great step forward, the analysis of results is often restricted to pure ranking tables, leaving relevant questions unanswered. Specifically, little effort has been put into the systematic investigation of what characterizes images in which state-of-the-art algorithms fail. To address this gap in the literature, we (1) present a statistical framework for learning from challenges and (2) instantiate it for the specific task of instrument instance segmentation in laparoscopic videos. Our framework relies on the semantic metadata annotation of images, which serves as the foundation for a generalized linear mixed model (GLMM) analysis. Based on 51,542 metadata annotations performed on 2,728 images, we applied our approach to the results of the Robust Medical Instrument Segmentation (ROBUST-MIS) Challenge 2019 and revealed underexposure, motion and occlusion of instruments, as well as the presence of smoke or other objects in the background, as major sources of algorithm failure. Our subsequent method development, tailored to the specific remaining issues, yielded a deep learning model with state-of-the-art overall performance and specific strengths in the processing of images in which previous methods tended to fail. Due to the objectivity and generic applicability of our approach, it could become a valuable tool for validation in the field of medical image analysis and beyond.

Identifiers

DOI: 10.1016/j.media.2023.102765 PMID: 36965252

pubmed: 36965252

pii: S1361-8415(23)00026-9

doi: 10.1016/j.media.2023.102765

Publication types

Journal Article Research Support, Non-U.S. Gov't

Languages

eng

Pagination

102765

Conflict of interest statement

Declaration of Competing Interest: The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Part of this work was funded by the Helmholtz Imaging Platform (HIP), a platform of the Helmholtz Incubator on Information and Data Science, and by the Surgical Oncology Program of the National Center for Tumor Diseases (NCT) Heidelberg.

Authors

Tobias Roß (T)

Pierangela Bruno (P)

Annika Reinke (A)

Manuel Wiesenfarth (M)

Lisa Koeppel (L)

Peter M Full (PM)

Bünyamin Pekdemir (B)

Patrick Godau (P)

Darya Trofimova (D)

Fabian Isensee (F)

Tim J Adler (TJ)

Thuy N Tran (TN)

Sara Moccia (S)

Francesco Calimeri (F)

Beat P Müller-Stich (BP)

Annette Kopp-Schneider (A)

Lena Maier-Hein (L)
