Assessing the Performance of ChatGPT in Medical Biochemistry Using Clinical Case Vignettes: Observational Study.
ChatGPT
artificial intelligence
biochemistry
case scenario
case study
chatbot
computer generated
medical biochemistry
medical education
medical exam
medical examination
Journal
JMIR Medical Education
ISSN: 2369-3762
Abbreviated title: JMIR Med Educ
Country: Canada
NLM ID: 101684518
Publication information
Publication date: 07 Nov 2023
History:
received: 11 Mar 2023
accepted: 21 Sep 2023
revised: 29 May 2023
medline: 7 Nov 2023
pubmed: 7 Nov 2023
entrez: 7 Nov 2023
Status:
epublish
Abstract
ChatGPT has recently gained global attention owing to its high performance in generating a wide range of information and retrieving any kind of data instantaneously. ChatGPT has also been tested on the United States Medical Licensing Examination (USMLE) and has successfully cleared it. Thus, its usability in medical education is now one of the key discussions worldwide. The objective of this study was to evaluate the performance of ChatGPT in medical biochemistry using clinical case vignettes. The performance of ChatGPT was evaluated using 10 clinical case vignettes in medical biochemistry. The vignettes were randomly selected and entered into ChatGPT along with their response options, and the responses for each case were tested twice. The answers generated by ChatGPT were saved and verified against our reference material. ChatGPT generated correct answers for 4 questions on the first attempt. For the other cases, the responses differed between the first and second attempts. In the second attempt, ChatGPT provided correct answers for 6 of the 10 cases and incorrect answers for the remaining 4. Surprisingly, for case 3, different answers were obtained across multiple attempts. We attribute this to the complexity of the case, which required addressing several critical medical aspects of amino acid metabolism in a balanced manner. According to our findings, ChatGPT may not yet be considered an accurate information provider for improving learning and assessment in medical education. However, our study was limited by the small sample size (10 clinical case vignettes) and the use of the publicly available version of ChatGPT (version 3.5). Although artificial intelligence (AI) has the capability to transform medical education, we emphasize that the output of such AI systems must be validated for correctness and dependability before being implemented in practice.
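The evaluation protocol described in the abstract (enter each vignette with its response options, query twice, compare against reference answers) can be illustrated with a short script. This is a minimal sketch only: it assumes the OpenAI Python client and the gpt-3.5-turbo API model as a stand-in for the public ChatGPT 3.5 web interface actually used in the study, and the example vignette, prompt wording, and substring-based scoring check are illustrative assumptions, whereas the study verified answers manually against reference material.

```python
# Minimal sketch of the two-attempt evaluation protocol described in the abstract.
# Assumptions: the OpenAI Python client and "gpt-3.5-turbo" stand in for the public
# ChatGPT 3.5 web interface used in the study; vignette content and the scoring
# heuristic below are illustrative, not the study's actual cases or grading method.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical vignette structure: stem, labeled options, and a reference answer key.
vignettes = [
    {
        "stem": "A neonate presents with vomiting, lethargy, and hyperammonemia ...",
        "options": {
            "A": "Ornithine transcarbamylase deficiency",
            "B": "Phenylketonuria",
            "C": "Maple syrup urine disease",
            "D": "Galactosemia",
        },
        "answer": "A",
    },
    # ... the remaining cases would follow the same structure
]

def ask_once(case: dict) -> str:
    """Send one vignette with its response options and return the model's raw reply."""
    option_text = "\n".join(f"{letter}. {text}" for letter, text in case["options"].items())
    prompt = (
        f"{case['stem']}\n\n{option_text}\n\n"
        "Choose the single best answer and state the option letter."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Each case is tested twice, mirroring the protocol; correctness is checked here by
# looking for the reference option's text in the reply (a crude automatic stand-in
# for the manual verification against reference material described in the study).
for attempt in (1, 2):
    correct = 0
    for case in vignettes:
        reply = ask_once(case)
        if case["options"][case["answer"]].lower() in reply.lower():
            correct += 1
    print(f"Attempt {attempt}: {correct}/{len(vignettes)} answered correctly")
```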
Identifiers
pubmed: 37934568
pii: v9i1e47191
doi: 10.2196/47191
pmc: PMC10664016
Publication types
Journal Article
Languages
eng
Pagination
e47191
Copyright information
©Krishna Mohan Surapaneni. Originally published in JMIR Medical Education (https://mededu.jmir.org), 07.11.2023.