Performance of GPT-3.5 and GPT-4 on the Japanese Medical Licensing Examination: Comparison Study.

AI; Chat Generative Pre-trained Transformer; ChatGPT; GPT-4; Generative Pre-trained Transformer 4; Japanese Medical Licensing Examination; artificial intelligence; clinical support; learning model; medical education; medical licensing

Journal

JMIR Medical Education
ISSN: 2369-3762
Abbreviated title: JMIR Med Educ
Country: Canada
NLM ID: 101684518

Publication information

Publication date:
29 Jun 2023
History:
received: 07 Apr 2023
revised: 11 May 2023
accepted: 14 Jun 2023
medline: 29 Jun 2023
pubmed: 29 Jun 2023
entrez: 29 Jun 2023
Status: epublish

Abstract

The competence of ChatGPT (Chat Generative Pre-trained Transformer) in non-English languages is not well studied. This study compared the performance of GPT-3.5 (Generative Pre-trained Transformer) and GPT-4 on the Japanese Medical Licensing Examination (JMLE) to evaluate the reliability of these models for clinical reasoning and medical knowledge in non-English languages. The study used the default mode of ChatGPT, which is based on GPT-3.5; the GPT-4 model of ChatGPT Plus; and the 117th JMLE, held in 2023. A total of 254 questions were included in the final analysis and categorized into 3 types: general, clinical, and clinical sentence questions. GPT-4 outperformed GPT-3.5 in accuracy for general, clinical, and clinical sentence questions. GPT-4 also performed better on difficult questions and on questions about specific diseases. Furthermore, GPT-4 met the passing criteria for the JMLE, indicating its reliability for clinical reasoning and medical knowledge in non-English languages. GPT-4 could become a valuable tool for medical education and clinical support in non-English-speaking regions, such as Japan.

Identifiers

pubmed: 37384388
pii: v9i1e48002
doi: 10.2196/48002
pmc: PMC10365615

Publication types

Journal Article

Languages

eng

Pagination

e48002

Copyright information

©Soshi Takagi, Takashi Watari, Ayano Erabi, Kota Sakaguchi. Originally published in JMIR Medical Education (https://mededu.jmir.org), 29.06.2023.

Authors

Soshi Takagi

Faculty of Medicine, Shimane University, Izumo, Japan.

Takashi Watari

Faculty of Medicine, Shimane University, Izumo, Japan.
General Medicine Center, Shimane University Hospital, Izumo, Japan.
Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, United States.
Medicine Service, VA Ann Arbor Healthcare System, Ann Arbor, MI, United States.

Ayano Erabi

Faculty of Medicine, Shimane University, Izumo, Japan.

Kota Sakaguchi

General Medicine Center, Shimane University Hospital, Izumo, Japan.
