Comparison of Artificial Intelligence to Resident Performance on Upper-Extremity Orthopaedic In-Training Examination Questions.

Artificial intelligence; ChatGPT; Orthopaedic In-Training Examination; Resident education; Upper extremity

Journal

Journal of hand surgery global online
ISSN: 2589-5141
Abbreviated title: J Hand Surg Glob Online
Country: United States
NLM ID: 101759126

Publication information

Publication date:
Mar 2024
History:
received: 10 Jul 2023
accepted: 28 Oct 2023
medline: 21 Jun 2024
pubmed: 21 Jun 2024
entrez: 21 Jun 2024
Status: epublish

Abstract

Few studies have examined applications of artificial intelligence (AI) in upper-extremity (UE) surgical education. The purpose of this investigation was to assess the performance of a novel AI tool (ChatGPT) on UE questions from the Orthopaedic In-Training Examination (OITE) and to compare it with the examination performance of hand surgery residents. We selected questions from the 2020-2022 OITEs in the hand and UE and the shoulder and elbow content domains. These questions were divided into two categories: those with text-only prompts (text-only questions) and those that included supplementary images or videos (media questions). Two authors (B.K.F. and G.S.M.) converted the accompanying media into text-based descriptions. Included questions were entered into ChatGPT (version 3.5) to generate responses. Each OITE question was entered three times, once per prompt type: (1) open-ended, requesting a free-text response; (2) multiple-choice without justification; and (3) multiple-choice with justification. We used the OITE scoring guide for each year to compare the percentage of correct AI responses with the percentage of correct resident responses. A total of 102 UE OITE questions were included; 59 were text-only questions, and 43 were media questions. ChatGPT correctly answered 46 (45%) of 102 questions using the Multiple Choice No Justification prompt (42% for text-only and 44% for media questions). By comparison, postgraduate year 1 orthopaedic residents averaged 51% correct, and postgraduate year 5 residents answered 76% of the same questions correctly. ChatGPT answered fewer UE OITE questions correctly than hand surgery residents at all training levels. Further development of novel AI tools may be necessary if this technology is to have a role in UE education.
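The headline comparison in the abstract is simple arithmetic and can be reproduced with a minimal sketch. The counts and resident percentages below are the ones reported above; the function and variable names are hypothetical and are not taken from the study, which does not describe any analysis code.

```python
def percent_correct(correct: int, total: int) -> float:
    """Return the share of correctly answered questions as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive number of questions")
    return 100.0 * correct / total


# Reported in the abstract: 46 of 102 questions answered correctly
# with the multiple-choice, no-justification prompt.
chatgpt_pct = percent_correct(46, 102)

# Reported resident averages on the same questions (percent correct).
resident_pct = {"PGY-1": 51.0, "PGY-5": 76.0}

# Gap in percentage points between each resident level and ChatGPT.
gaps = {level: round(pct - chatgpt_pct, 1) for level, pct in resident_pct.items()}

print(f"ChatGPT: {chatgpt_pct:.0f}% correct")
for level, gap in gaps.items():
    print(f"{level} residents scored {gap} percentage points higher")
```

This only restates the reported figures as percentage-point gaps; it says nothing about statistical significance, which the abstract does not report.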

Identifiers

pubmed: 38903829
doi: 10.1016/j.jhsg.2023.10.013
pii: S2589-5141(23)00188-3
pmc: PMC11185884

Publication types

Journal Article

Languages

eng

Pagination

164-168

Copyright information

© 2023 The Authors.

Authors

Yagiz Ozdag (Y)

Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

Daniel S Hayes (DS)

Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

Gabriel S Makar (GS)

Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

Shahid Manzar (S)

Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

Brian K Foster (BK)

Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

Mason J Shultz (MJ)

Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

Joel C Klena (JC)

Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

Louis C Grandizio (LC)

Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.

MeSH classifications