Artificial intelligence compared with human-derived patient educational materials on cirrhosis.
Journal
Hepatology communications
ISSN: 2471-254X
Abbreviated title: Hepatol Commun
Country: United States
ID NLM: 101695860
Publication information
Publication date: 01 Mar 2024
History:
received: 14 Jul 2023
accepted: 11 Dec 2023
medline: 15 Feb 2024
pubmed: 15 Feb 2024
entrez: 15 Feb 2024
Status: epublish
Abstract
BACKGROUND
The study compared the readability, grade level, understandability, actionability, and accuracy of standard patient educational material against artificial intelligence chatbot-derived patient educational material regarding cirrhosis.
METHODS
An identical standardized phrase was used to generate patient educational materials on cirrhosis from four large language model-derived chatbots (ChatGPT, DocsGPT, Google Bard, and Bing Chat), and the outputs were compared against a pre-existing human-derived educational material (Epic). Objective readability and grade-level scores were determined using the Flesch-Kincaid and Simple Measure of Gobbledygook scoring systems. Fourteen patients/caregivers and eight transplant hepatologists were blinded, independently scored the materials on understandability and actionability using the Patient Education Materials Assessment Tool for Printable Materials, and indicated whether they believed each material was human- or artificial intelligence-generated. Transplant hepatologists also provided medical accuracy scores.
RESULTS
Most educational materials scored similarly in readability and grade level but were above the desired sixth-grade reading level. All educational materials were deemed understandable by both groups, while only the human-derived educational material (Epic) was considered actionable by both groups. No significant difference in perceived actionability or understandability among the educational materials was identified. Both groups poorly identified which materials were human-derived versus artificial intelligence-derived.
CONCLUSIONS
Chatbot-derived patient educational materials showed readability, grade level, understandability, and accuracy comparable to those of the human-derived material. Readability, grade level, and actionability may be appropriate targets for improvement across educational materials on cirrhosis. Chatbot-derived patient educational materials show promise, and further studies should assess their usefulness in clinical practice.
Identifiers
pubmed: 38358382
doi: 10.1097/HC9.0000000000000367
pii: 02009842-202403010-00002
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Copyright information
Copyright © 2024 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the American Association for the Study of Liver Diseases.
References
Ayers JW, Poliak A, Dredze M, Leas EC, Zhu Z, Kelley JB, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med. 2023;183:589–596.
Haupt CE, Marks M. AI-generated medical advice-GPT and beyond. JAMA. 2023;329:1349–1350.
Adams K. Epic to Integrate GPT-4 into Its EHR Through Expanded Microsoft Partnership. MedCityNews. Published online April 28, 2023. https://medcitynews.com/2023/04/epic-tointegrate-gpt-4-into-its-ehr-through-expanded-microsoft-partnership/
van Dis EAM, Bollen J, Zuidema W, van Rooij R, Bockting CL. ChatGPT: Five priorities for research. Nature. 2023;614:224–226.
Kushniruk A. The development and use of chatbots in public health: Scoping review. JMIR Hum Factors. 2022;9:e35882.
Bujnowska-Fedak MM, Waligóra J, Mastalerz-Migas A. The internet as a source of health information and services. Adv Exp Med Biol. 2019;1211:1–16.
Cohen RA, Adams PF. Use of the internet for health information: United States, 2009. NCHS Data Brief No 66. 2011.
Rowe IA. Lessons from epidemiology: The burden of liver disease. Dig Dis. 2017;35:304–309.
Scaglione S, Kliethermes S, Cao G, Shoham D, Durazo R, Luke A, et al. The epidemiology of cirrhosis in the United States: A population-based study. J Clin Gastroenterol. 2015;49:690–696.
Beste LA, Harp BK, Blais RK, Evans GA, Zickmund SL. Primary care providers report challenges to cirrhosis management and specialty care coordination. Dig Dis Sci. 2015;60:2628–2635.
Yeo YH, Samaan JS, Ng WH, Ting PS, Trivedi H, Vipani A, et al. Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma. Clin Mol Hepatol. 2023;29:721–732.
Eltorai AE, Ghanian S, Adams CA Jr, Born CT, Daniels AH. Readability of patient education materials on the American Association for the Surgery of Trauma website. Arch Trauma Res. 2014;3:e18161.
Weiss BD. Health literacy: A manual for clinicians. AMA. 2003:31–4. http://lib.ncfh.org/pdfs/6617.pdf.
Rooney MK, Santiago G, Perni S, Horowitz DP, McCall AR, Einstein AJ, et al. Readability of patient education materials from high-impact medical journals: A 20-year analysis. J Patient Exp. 2021;8:2374373521998847.
Flesch R. A new readability yardstick. J Appl Psychol. 1948;32:221–233.
McLaughlin GH. SMOG Grading – a New Readability Formula. J Read. 1969;12:639–646.
Shoemaker SJ, Wolf MS, Brach C. Development of the Patient Education Materials Assessment Tool (PEMAT): A new measure of understandability and actionability for print and audiovisual patient information. Patient Educ Couns. 2014;96:395–403.
PEMAT for Printable Materials (PEMAT-P). Content last reviewed November 2020. Agency for Healthcare Research and Quality, Rockville, MD. https://www.ahrq.gov/health-literacy/patienteducation/pemat-p.html
Lipari M, Berlie H, Saleh Y, Hang P, Moser L. Understandability, actionability, and readability of online patient education materials about diabetes mellitus. Am J Health Syst Pharm. 2019;76:182–186.
Dy CJ, Taylor SA, Patel RM, McCarthy MM, Roberts TR, Daluiski A. Does the quality, accuracy, and readability of information about lateral epicondylitis on the internet vary with the search term used? Hand (N Y). 2012;7:420–425.
Storino A, Castillo-Angeles M, Watkins AA, Vargas C, Mancias JD, Bullock A, et al. Assessing the accuracy and readability of online health information for patients with pancreatic cancer. JAMA Surg. 2016;151:831–837.
Freundlich Grydgaard M, Bager P. Health literacy levels in outpatients with liver cirrhosis. Scand J Gastroenterol. 2018;53:1584–1589.
Gulati R, Nawaz M, Pyrsopoulos NT. Health literacy and liver disease. Clin Liver Dis. 2018;11:48–51.