Performance of ChatGPT and Bard on the official part 1 FRCOphth practice questions.

Medical Education

Journal

The British journal of ophthalmology
ISSN: 1468-2079
Abbreviated title: Br J Ophthalmol
Country: England
NLM ID: 0421041

Publication information

Publication date:
06 Nov 2023
History:
received: 22 06 2023
accepted: 08 10 2023
medline: 7 11 2023
pubmed: 7 11 2023
entrez: 6 11 2023
Status: aheadofprint

Abstract

BACKGROUND
Chat Generative Pre-trained Transformer (ChatGPT), a large language model by OpenAI, and Bard, Google's artificial intelligence (AI) chatbot, have been evaluated in various contexts. This study aims to assess these models' proficiency in the part 1 Fellowship of the Royal College of Ophthalmologists (FRCOphth) Multiple Choice Question (MCQ) examination, highlighting their potential in medical education.
METHODS
Both models were tested on a sample question bank for the part 1 FRCOphth MCQ exam. Their performances were compared with historical human performance on the exam, focusing on the ability to comprehend, retain and apply information related to ophthalmology. We also tested it on the book 'MCQs for FRCOphth part 1', and assessed its performance across subjects.
RESULTS
ChatGPT demonstrated a strong performance, surpassing historical human pass marks and examination performance, while Bard underperformed. The comparison indicates the potential of certain AI models to match, and even exceed, human standards in such tasks.
CONCLUSIONS
The results demonstrate the potential of AI models, such as ChatGPT, in processing and applying medical knowledge at a postgraduate level. However, performance varied among different models, highlighting the importance of appropriate AI selection. The study underlines the potential for AI applications in medical education and the necessity for further investigation into their strengths and limitations.

Identifiers

pubmed: 37932006
pii: bjo-2023-324091
doi: 10.1136/bjo-2023-324091

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

© Author(s) (or their employer(s)) 2023. No commercial re-use. See rights and permissions. Published by BMJ.

Conflict of interest statement

Competing interests: None declared.

Authors

Thomas Fowler (T)

Department of Medicine, Barking Havering and Redbridge University Hospitals NHS Trust, London, UK; thomas.fowler6@nhs.net.

Simon Pullen (S)

Department of Anaesthetics, Princess Alexandra Hospital, Harlow, UK.

Liam Birkett (L)

Emergency Medicine, Royal Free Hospital, London, UK.

MeSH classifications