Assessing the Performance of ChatGPT in Medical Biochemistry Using Clinical Case Vignettes: Observational Study.

ChatGPT; artificial intelligence; biochemistry; case scenario; case study; chatbot; computer generated; medical biochemistry; medical education; medical exam; medical examination

Journal

JMIR Medical Education
ISSN: 2369-3762
Abbreviated title: JMIR Med Educ
Country: Canada
NLM ID: 101684518

Publication information

Publication date:
7 Nov 2023
History:
received: 11 Mar 2023
revised: 29 May 2023
accepted: 21 Sep 2023
medline: 7 Nov 2023
pubmed: 7 Nov 2023
entrez: 7 Nov 2023
Status: epublish

Abstract

ChatGPT has recently gained global attention owing to its high performance in generating a wide range of information and retrieving data almost instantaneously. It has also been tested on the United States Medical Licensing Examination (USMLE) and passed it, so its usability in medical education is now a key discussion worldwide. The objective of this study was to evaluate the performance of ChatGPT in medical biochemistry using clinical case vignettes. ChatGPT's performance was evaluated using 10 randomly selected clinical case vignettes in medical biochemistry, which were entered into ChatGPT together with their response options; the responses for each clinical case were tested twice, and the answers generated by ChatGPT were saved and checked against our reference material. ChatGPT generated correct answers for 4 questions on the first attempt. For the other cases, the responses generated in the first and second attempts differed. In the second attempt, ChatGPT provided correct answers for 6 of the 10 cases and incorrect answers for 4. Surprisingly, for case 3, different answers were obtained across multiple attempts; we believe this happened owing to the complexity of the case, which required addressing several critical medical aspects related to amino acid metabolism in a balanced manner. According to the findings of our study, ChatGPT may not be considered an accurate information provider for application in medical education to improve learning and assessment. However, our study was limited by its small sample size (10 clinical case vignettes) and its use of the publicly available version of ChatGPT (version 3.5). Although artificial intelligence (AI) has the capability to transform medical education, we emphasize that data produced by such AI systems must be validated for correctness and dependability before being implemented in practice.
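
The study itself was conducted manually through the public ChatGPT (GPT-3.5) web interface. Purely as an illustration of the two-attempt protocol described above, the Python sketch below shows how a similar evaluation could hypothetically be automated with the OpenAI API; the model name, the example vignette, and the ask_twice helper are assumptions for illustration only and are not part of the original study.

# Hypothetical automation of the two-attempt vignette protocol; the original
# study submitted vignettes manually via the public ChatGPT web interface.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative placeholder; the study's 10 vignettes are not reproduced here.
vignettes = [
    {
        "prompt": (
            "A clinical case vignette in medical biochemistry, followed by "
            "options A-D. Answer with the single best option."
        ),
        "key": "B",  # assumed correct answer taken from reference material
    },
]

def ask_twice(prompt: str) -> list[str]:
    """Submit the same vignette twice, mirroring the study's repeat testing."""
    answers = []
    for _ in range(2):
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # assumed stand-in for ChatGPT (version 3.5)
            messages=[{"role": "user", "content": prompt}],
        )
        answers.append(response.choices[0].message.content.strip())
    return answers

for i, case in enumerate(vignettes, start=1):
    first, second = ask_twice(case["prompt"])
    print(f"Case {i}: attempt 1 = {first!r}, attempt 2 = {second!r}, key = {case['key']}")

In practice, each free-text answer would still need to be reduced to a single option letter before comparison with the answer key; in the study this checking was done by hand against reference material.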

Identifiers

pubmed: 37934568
pii: v9i1e47191
doi: 10.2196/47191
pmc: PMC10664016

Publication types

Journal Article

Languages

eng

Pagination

e47191

Copyright information

©Krishna Mohan Surapaneni. Originally published in JMIR Medical Education (https://mededu.jmir.org), 07.11.2023.

Authors

Krishna Mohan Surapaneni (KM)

Panimalar Medical College Hospital & Research Institute, Chennai, India.

MeSH classifications