Digesting Digital Health: A Study of Appropriateness and Readability of ChatGPT-Generated Gastroenterological Information.


Journal

Clinical and translational gastroenterology
ISSN: 2155-384X
Abbreviated title: Clin Transl Gastroenterol
Country: United States
NLM ID: 101532142

Publication information

Publication date:
30 Aug 2024
History:
received: 18 Jul 2024
accepted: 15 Aug 2024
medline: 31 Aug 2024
pubmed: 31 Aug 2024
entrez: 30 Aug 2024
Status: aheadofprint

Abstract

BACKGROUND AND AIMS
The advent of artificial intelligence-powered large language models capable of generating interactive responses to intricate queries marks a groundbreaking development in how patients access medical information. Our aim was to evaluate the appropriateness and readability of gastroenterological information generated by ChatGPT.
METHODS
We analyzed responses generated by ChatGPT to 16 dialogue-based queries assessing symptoms and treatments for gastrointestinal conditions and 13 definition-based queries on prevalent topics in gastroenterology. Three board-certified gastroenterologists evaluated output appropriateness with a 5-point Likert-scale proxy measurement of currency, relevance, accuracy, comprehensiveness, clarity, and urgency/next steps. Outputs with a score of 4 or 5 in all 6 categories were designated as "appropriate." Output readability was assessed with Flesch Reading Ease score, Flesch-Kincaid Reading Level, and Simple Measure of Gobbledygook scores.
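The scoring rules described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes one already-aggregated score per category (the abstract does not say how the three raters' scores were combined), uses hypothetical category keys, and applies the standard published formulas for the three readability measures, taking word, sentence, and syllable counts as inputs rather than counting them.

```python
import math

# Hypothetical keys for the six rated categories named in the abstract.
CATEGORIES = (
    "currency", "relevance", "accuracy",
    "comprehensiveness", "clarity", "urgency_next_steps",
)

def is_appropriate(scores):
    """Return True only if all six categories scored 4 or 5 (5-point scale).

    `scores` maps each category to a single score; how the three raters'
    scores were merged is not stated in the abstract, so that step is omitted.
    """
    return all(scores[c] >= 4 for c in CATEGORIES)

def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula (higher means easier)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid formula, expressed as a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def smog_grade(polysyllables, sentences):
    """Standard SMOG formula; polysyllables are words of 3+ syllables."""
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291
```

Under this rule a single category scored 3 or lower is enough to make a response not "appropriate", even if the other five categories scored 5.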
RESULTS
ChatGPT responses to 44% of the 16 dialogue-based and 69% of the 13 definition-based questions were deemed appropriate, and the proportion of appropriate responses within the 2 groups of questions was not significantly different (P = .17). Notably, none of ChatGPT's responses to questions related to gastrointestinal emergencies were designated appropriate. The mean readability scores showed that outputs were written at a college-level reading proficiency.
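The abstract does not name the statistical test behind P = .17. As a rough plausibility check only, the sketch below assumes the rounded percentages correspond to 7 of 16 and 9 of 13 appropriate responses, and assumes an uncorrected Pearson chi-square test on the resulting 2x2 table. Both the counts and the choice of test are assumptions, not details from the source.

```python
import math

def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p). With 1 degree of freedom the upper-tail probability
    of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Assumed counts inferred from the rounded percentages (7/16 = 44%, 9/13 = 69%).
dialogue_appropriate, dialogue_total = 7, 16
definition_appropriate, definition_total = 9, 13

chi2, p = pearson_chi_square_2x2(
    dialogue_appropriate, dialogue_total - dialogue_appropriate,
    definition_appropriate, definition_total - definition_appropriate,
)
print(f"chi2 = {chi2:.3f}, p = {p:.2f}")
```

Under these assumptions the p-value comes out at about 0.17, consistent with the reported figure. With cells this small an exact test might be preferred, and it would give a different p-value.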
CONCLUSION
ChatGPT can produce generally fitting responses to gastroenterological medical queries, but responses were constrained in appropriateness and readability, which limits the current utility of this large language model. Substantial development is essential before these models can be unequivocally endorsed as reliable sources of medical information.

Identifiers

pubmed: 39212302
doi: 10.14309/ctg.0000000000000765
pii: 01720094-990000000-00306

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

Copyright © 2024 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of The American College of Gastroenterology.

Authors

Avi Toiv (A)

Department of Internal Medicine, Henry Ford Hospital, Detroit, MI, USA.

Zachary Saleh (Z)

Department of Gastroenterology, Henry Ford Hospital, Detroit, MI, USA.

Angela Ishak (A)

Department of Internal Medicine, Henry Ford Hospital, Detroit, MI, USA.

Eva Alsheik (E)

Department of Gastroenterology, Henry Ford Hospital, Detroit, MI, USA.

Deepak Venkat (D)

Department of Gastroenterology, Henry Ford Hospital, Detroit, MI, USA.

Neilanjan Nandi (N)

Department of Gastroenterology, University of Pennsylvania, Philadelphia, PA, USA.

Tobias E Zuchelli (TE)

Department of Gastroenterology, Henry Ford Hospital, Detroit, MI, USA.

MeSH classifications