ChatGPT: Evaluating answers on contrast media related questions and finetuning by providing the model with the ESUR guideline on contrast agents.

Artificial Intelligence, Contrast Media, Decision Making, Guidelines

Journal

Current problems in diagnostic radiology
ISSN: 1535-6302
Abbreviated title: Curr Probl Diagn Radiol
Country: United States
NLM ID: 7607123

Publication information

Publication date:
21 Apr 2024
History:
received: 23 12 2023
revised: 10 03 2024
accepted: 18 04 2024
medline: 27 4 2024
pubmed: 27 4 2024
entrez: 26 4 2024
Status: aheadofprint

Abstract

This study aimed to assess the feasibility of GPT-4 for answering questions related to contrast media with and without the context of the European Society of Urogenital Radiology (ESUR) guideline on contrast agents. The overarching goal was to determine whether contextual enrichment by providing guideline information improves the answers of GPT-4 for clinical decision-making in radiology. A set of 64 questions, based on the ESUR guideline on contrast agents and mirroring its pertinent sections, was developed and posed to GPT-4 both directly and after providing the guideline using a plugin. Responses were graded on Likert scales by experienced radiologists for quality of information and for accuracy in pinpointing information from the guideline, and by radiology residents for utility. GPT-4's performance improved significantly with the guideline. Without the guideline, the average quality rating was 3.98, which increased to 4.33 with the guideline (p = 0036). In terms of accuracy, 82.3% of answers matched the information from the guideline. Utility scores also reflected a significant improvement with the guideline, with average scores of 4.1 (without) and 4.4 (with) (p = 0.008) and a Fleiss' kappa of 0.44. GPT-4, when contextually enriched with a guideline, demonstrates enhanced capability in providing guideline-backed recommendations. This approach holds promise for real-time clinical decision support, making guidelines more actionable. However, further refinements are necessary to maximize the potential of large language models (LLMs), and their inherent limitations need to be addressed.
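
The abstract reports inter-rater agreement as a Fleiss' kappa. As an illustration only, the following minimal Python sketch shows how such a statistic can be computed from Likert ratings. The function name `fleiss_kappa`, the input layout, and the example ratings are assumptions made for this sketch; they are not the study's data or its analysis code.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings:    list of lists; ratings[i] holds the category chosen by each rater
                for item i (hypothetical layout, assumed for this sketch).
    categories: all possible categories, e.g. Likert points 1..5.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(r) for r in ratings]

    # Observed agreement per item: share of rater pairs that agree.
    p_items = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Expected agreement by chance, from overall category proportions.
    p_cat = [sum(c[k] for c in counts) / (n_items * n_raters) for k in categories]
    p_exp = sum(p * p for p in p_cat)

    if p_exp == 1.0:
        # Only one category was ever used; kappa is undefined here.
        return float("nan")
    return (p_bar - p_exp) / (1.0 - p_exp)


# Made-up ratings for three answers, each scored by three raters (not study data).
example = [[4, 4, 5], [3, 4, 4], [5, 5, 5]]
kappa = fleiss_kappa(example, categories=[1, 2, 3, 4, 5])
```

Fleiss' kappa treats the Likert points as unordered categories, so a 4 versus a 5 counts as full disagreement; whether the authors applied any weighting is not stated in the abstract.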

Identifiers

pubmed: 38670921
pii: S0363-0188(24)00075-6
doi: 10.1067/j.cpradiol.2024.04.005

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.

Authors

Michael Scheschenja (M)

Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, Marburg, DE 35043, Germany. Electronic address: Michael.Scheschenja@med.uni-marburg.de.

Moritz B Bastian (MB)

Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, Marburg, DE 35043, Germany.

Joel Wessendorf (J)

Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, Marburg, DE 35043, Germany.

Andreas D Owczarek (AD)

Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, Marburg, DE 35043, Germany.

Alexander M König (AM)

Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, Marburg, DE 35043, Germany.

Simon Viniol (S)

Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, Marburg, DE 35043, Germany.

Andreas H Mahnken (AH)

Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, Marburg, DE 35043, Germany.

MeSH classifications