Applying GPT-4 to the Plastic Surgery Inservice Training Examination.
AI
Artificial intelligence
ChatGPT
Resident education
Journal
Journal of plastic, reconstructive & aesthetic surgery : JPRAS
ISSN: 1878-0539
Abbreviated title: J Plast Reconstr Aesthet Surg
Country: Netherlands
NLM ID: 101264239
Publication information
Publication date:
Dec 2023
History:
received: 2023-05-23
revised: 2023-08-20
accepted: 2023-09-08
medline: 2023-12-05
pubmed: 2023-10-10
entrez: 2023-10-09
Status:
ppublish
Abstract
Abstract sections
BACKGROUND
The recently introduced Generative Pre-trained Transformer (GPT)-4 has shown the potential to be superior to GPT-3.5, and is widely regarded as the more reliable and creative of the two models.
OBJECTIVE
In conjunction with our prior manuscript, we sought to determine whether GPT-4 could be used as a tool for plastic surgery graduate medical education by evaluating its performance on the Plastic Surgery Inservice Training Examination (PSITE).
METHODS
Sample assessment questions from the 2022 PSITE were obtained from the American Council of Academic Plastic Surgeons website and manually entered into GPT-4. GPT-4's responses were qualified using the properties of natural coherence. Incorrect answers were stratified into the following categories: informational, logical, or explicit fallacy.
RESULTS
Of 242 questions, GPT-4 answered 187 correctly, an accuracy of 77.3%. Logical reasoning was used in 95.0% of questions, internal information in 98.3%, and external information in 97.5%. When the questions were separated into correct and incorrect responses, a statistically significant difference was identified in GPT-4's application of logical reasoning.
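As a quick check on the reported accuracy, the figure can be recomputed from the two counts given in the results. This is a minimal sketch, not the authors' analysis code: the function name `accuracy_percent` is a hypothetical helper introduced here for illustration, and only the counts 187 and 242 come from the abstract.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return the share of correct answers as a percentage.

    Hypothetical helper for illustration; not from the original study.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of questions")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return 100.0 * correct / total


# Counts reported in the abstract: 187 correct answers out of 242 questions.
psite_accuracy = accuracy_percent(correct=187, total=242)
print(f"GPT-4 accuracy on the 2022 PSITE sample: {psite_accuracy:.1f}%")  # 77.3%
```

The result rounds to 77.3%, matching the reported accuracy rate.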
CONCLUSION
GPT-4 was shown to be more accurate and reliable than GPT-3.5 for plastic surgery resident education. Users should consider using the tool to enhance their educational curriculum. Those who adopt such models may be better equipped to deliver high-quality care to their patients.
Identifiers
pubmed: 37812847
pii: S1748-6815(23)00521-1
doi: 10.1016/j.bjps.2023.09.027
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
78-82
Copyright information
Copyright © 2023 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.
Conflict of interest statement
Declaration of Competing Interest: The authors declared no potential conflicts of interest with regard to the publication, authorship, or research of this article.