Artificial intelligence model GPT4 narrowly fails simulated radiological protection exam.

Keywords: Artificial Intelligence; GPT4

Journal

Journal of radiological protection : official journal of the Society for Radiological Protection

ISSN: 1361-6498

Abbreviated title: J Radiol Prot

Country: England

NLM ID: 8809257

Publication information

Publication date:
17 Jan 2024

History:

medline: 17 1 2024

pubmed: 17 1 2024

entrez: 17 1 2024

Status: aheadofprint

Abstract

This study assesses the efficacy of Generative Pre-Trained Transformers (GPT) published by OpenAI in the specialized domains of radiological protection and health physics. Utilizing a set of 1064 surrogate questions designed to mimic a health physics certification exam, we evaluated the models' ability to accurately respond to questions across five knowledge domains. Our results indicated that neither model met the 67% passing threshold, with GPT-3.5 achieving a 45.3% weighted average and GPT-4 attaining 61.7%. Despite GPT-4's significant parameter increase and multimodal capabilities, it demonstrated superior performance in all categories yet still fell short of a passing score. The study's methodology involved a simple, standardized prompting strategy without employing prompt engineering or in-context learning, which are known to potentially enhance performance. The analysis revealed that GPT-3.5 formatted answers more correctly, despite GPT-4's higher overall accuracy. The findings suggest that while GPT-3.5 and GPT-4 show promise in handling domain-specific content, their application in the field of radiological protection should be approached with caution, emphasizing the need for human oversight and verification.

Identifiers

DOI: 10.1088/1361-6498/ad1fdf PMID: 38232401

Publication types

Journal Article

Languages

eng

Citation subsets

Copyright information

Creative Commons Attribution license.

Authors

G Roemer (G)

A Li (A)

U Mahmood (U)

L Dauer (L)

M Bellamy (M)

MeSH classifications