Comparison of Artificial Intelligence to Resident Performance on Upper-Extremity Orthopaedic In-Training Examination Questions.
Artificial intelligence
ChatGPT
Orthopaedic In-Training Examination
Resident education
Upper extremity
Journal
Journal of hand surgery global online
ISSN: 2589-5141
Abbreviated title: J Hand Surg Glob Online
Country: United States
NLM ID: 101759126
Publication information
Publication date:
Mar 2024
History:
received: 10 Jul 2023
accepted: 28 Oct 2023
medline: 21 Jun 2024
pubmed: 21 Jun 2024
entrez: 21 Jun 2024
Status:
epublish
Abstract
Few prior studies have examined applications of artificial intelligence (AI) in upper-extremity (UE) surgical education. The purpose of this investigation was to assess the performance of a novel AI tool (ChatGPT) on UE questions from the Orthopaedic In-Training Examination (OITE) and to compare it with the examination performance of hand surgery residents. We selected questions from the 2020-2022 OITEs that focused on the hand and UE as well as the shoulder and elbow content domains. These questions were divided into two categories: those with text-only prompts (text-only questions) and those that included supplementary images or videos (media questions). Two authors (B.K.F. and G.S.M.) converted the accompanying media into text-based descriptions. Included questions were entered into ChatGPT (version 3.5) to generate responses. Each OITE question was entered three times: (1) as an open-ended prompt requesting a free-text response; (2) as a multiple-choice prompt without a request for justification; and (3) as a multiple-choice prompt with a request for justification. We referred to the OITE scoring guide for each year to compare the percentage of correct AI responses with the percentage of correct resident responses. A total of 102 UE OITE questions were included; 59 were text-only questions, and 43 were media questions. ChatGPT correctly answered 46 (45%) of 102 questions using the Multiple Choice No Justification prompt (42% for text-only questions and 44% for media questions). By comparison, postgraduate year 1 orthopaedic residents achieved an average score of 51% correct, and postgraduate year 5 residents answered 76% of the same questions correctly. ChatGPT answered fewer UE OITE questions correctly than hand surgery residents at all training levels. Further development of novel AI tools may be necessary if this technology is to have a role in UE education.
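The headline comparison in the abstract is simple arithmetic on counts of correct answers. As a minimal sketch only (the function and variable names below are illustrative assumptions, not taken from the study), the reported overall percentage and the percentage-point gaps to the resident cohorts could be computed as follows:

```python
def percent_correct(n_correct: int, n_total: int) -> int:
    """Return the share of correct answers as a whole-number percentage."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return round(100 * n_correct / n_total)


# Figures reported in the abstract (Multiple Choice No Justification prompt).
chatgpt_pct = percent_correct(46, 102)

# Resident averages as reported in the abstract, in percent correct.
resident_pct = {"PGY-1": 51, "PGY-5": 76}

# Percentage-point gap between each resident cohort and ChatGPT.
gaps = {level: pct - chatgpt_pct for level, pct in resident_pct.items()}
```

Only the overall count (46 of 102) is used here because the abstract gives the text-only and media results as percentages without the underlying counts.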
Identifiers
pubmed: 38903829
doi: 10.1016/j.jhsg.2023.10.013
pii: S2589-5141(23)00188-3
pmc: PMC11185884
Publication types
Journal Article
Languages
eng
Pagination
164-168
Copyright information
© 2023 The Authors.