ChatGPT and retinal disease: a cross-sectional study on AI comprehension of clinical guidelines.


Journal

Canadian journal of ophthalmology. Journal canadien d'ophtalmologie
ISSN: 1715-3360
Abbreviated title: Can J Ophthalmol
Country: England
NLM ID: 0045312

Publication information

Publication date:
31 Jul 2024
History:
received: 07 12 2023
revised: 11 02 2024
accepted: 03 06 2024
medline: 4 8 2024
pubmed: 4 8 2024
entrez: 3 8 2024
Status: aheadofprint

Abstract

To evaluate the performance of an artificial intelligence (AI) large language model, ChatGPT (version 4.0), in answering questions about common retinal diseases, judged against the American Academy of Ophthalmology (AAO) Preferred Practice Pattern (PPP) guidelines. A cross-sectional survey study design was employed to compare the responses made by ChatGPT to established clinical guidelines. Responses by the AI were reviewed by a panel of 3 vitreoretinal specialists for evaluation. To investigate ChatGPT's comprehension of clinical guidelines, we designed 130 questions covering a broad spectrum of topics within 12 AAO PPP domains of retinal disease. These questions were crafted to encompass diagnostic criteria, treatment guidelines, and management strategies, including both medical and surgical aspects of retinal care. A panel of 3 retinal specialists independently evaluated responses on a Likert scale from 1 to 5 based on their relevance, accuracy, and adherence to AAO PPP guidelines. Response readability was evaluated using Flesch Reading Ease and Flesch-Kincaid grade level scores. ChatGPT achieved an overall average score of 4.9/5.0, suggesting high alignment with the AAO PPP guidelines. Scores varied across domains, with the lowest in the surgical management of disease. The responses had a low reading ease score and required a college-to-graduate level of comprehension. Identified errors were related to diagnostic criteria, treatment options, and methodological procedures. ChatGPT 4.0 demonstrated significant potential in generating guideline-concordant responses, particularly for common medical retinal diseases. However, its performance slightly decreased in surgical retina, highlighting the ongoing need for clinician input, further model refinement, and improved comprehensibility.
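
The two readability scores named in the abstract are computed from word, sentence, and syllable counts. The sketch below uses the standard published formulas. It is not the authors' tooling, which the abstract does not describe, and the counts in the example are hypothetical values chosen only for illustration, not data from the study.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid formula; approximates a US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a dense clinical answer (assumed, not study data):
# 120 words in 5 sentences with 228 syllables.
example_counts = {"words": 120, "sentences": 5, "syllables": 228}

print("Reading ease:", flesch_reading_ease(**example_counts))
print("Grade level:", flesch_kincaid_grade(**example_counts))
```

Long sentences and polysyllabic terms lower the reading ease score and raise the grade level, which is consistent with the college-to-graduate comprehension level reported for the responses.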

Identifiers

pubmed: 39097289
pii: S0008-4182(24)00175-3
doi: 10.1016/j.jcjo.2024.06.001

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.

Authors

Michael Balas (M)

Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.

Efrem D Mandelcorn (ED)

Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada; University Health Network, University of Toronto, Toronto, Ontario, Canada; Kensington Eye Institute, Toronto, Ontario, Canada.

Peng Yan (P)

Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada; University Health Network, University of Toronto, Toronto, Ontario, Canada; Kensington Eye Institute, Toronto, Ontario, Canada.

Edsel B Ing (EB)

Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada; Department of Ophthalmology and Visual Sciences, University of Alberta, Edmonton, Alberta, Canada.

Sean A Crawford (SA)

Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; University Health Network, University of Toronto, Toronto, Ontario, Canada; Division of Vascular Surgery, Department of Surgery, University of Toronto, Toronto, Ontario, Canada.

Parnian Arjmand (P)

Peter Munk Cardiac Centre, Toronto General Hospital, University Health Network, Toronto, Ontario, Canada; Mississauga Retina Institute, Mississauga, Ontario, Canada. Electronic address: parnian.arjmand@medportal.ca.

MeSH classifications