Comparison of Artificial Intelligence to Resident Performance on Upper-Extremity Orthopaedic In-Training Examination Questions.

Artificial intelligence; ChatGPT; Orthopaedic In-Training Examination; Resident education; Upper extremity

Journal

Journal of Hand Surgery Global Online

ISSN: 2589-5141

Abbreviated title: J Hand Surg Glob Online

Country: United States

NLM ID: 101759126

Publication information

Publication date:
Mar 2024

History:

received: 10 07 2023

accepted: 28 10 2023

medline: 21 6 2024

pubmed: 21 6 2024

entrez: 21 6 2024

Status: epublish

Abstract

Few prior studies have examined applications of artificial intelligence (AI) in upper-extremity (UE) surgical education. The purpose of this investigation was to assess the performance of a novel AI tool (ChatGPT) on UE questions from the Orthopaedic In-Training Examination (OITE) and to compare it with the examination performance of hand surgery residents.

We selected questions from the 2020-2022 OITEs in the hand and UE and the shoulder and elbow content domains. These questions were divided into two categories: those with text-only prompts (text-only questions) and those that included supplementary images or videos (media questions). Two authors (B.K.F. and G.S.M.) converted the accompanying media into text-based descriptions. Included questions were entered into ChatGPT (version 3.5) to generate responses. Each OITE question was entered three times: (1) open-ended response, which requested a free-text answer; (2) multiple-choice response without justification; and (3) multiple-choice response with justification. We referred to the OITE scoring guide for each year to compare the percentage of correct AI responses with the percentage of correct resident responses.

A total of 102 UE OITE questions were included; 59 were text-only questions, and 43 were media questions. ChatGPT correctly answered 46 (45%) of 102 questions using the Multiple Choice No Justification prompt (42% for text-only and 44% for media questions). By comparison, postgraduate year 1 orthopaedic residents achieved an average score of 51% correct, and postgraduate year 5 residents answered 76% of the same questions correctly.

ChatGPT answered fewer UE OITE questions correctly than hand surgery residents of all training levels. Further development of novel AI tools may be necessary if this technology is going to have a role in UE education.

Identifiers

DOI: 10.1016/j.jhsg.2023.10.013 PMID: 38903829 PMC: PMC11185884

pubmed: 38903829

doi: 10.1016/j.jhsg.2023.10.013

pii: S2589-5141(23)00188-3

pmc: PMC11185884

Publication types

Journal Article

Languages

eng

Pagination

164-168

Authors

Yagiz Ozdag (Y)

Daniel S Hayes (DS)

Gabriel S Makar (GS)

Shahid Manzar (S)

Brian K Foster (BK)

Mason J Shultz (MJ)

Joel C Klena (JC)

Louis C Grandizio (LC)

MeSH classifications