Evaluating the Utility of a Large Language Model in Answering Common Patients' Gastrointestinal Health-Related Questions: Are We There Yet?

Keywords: OpenAI's ChatGPT chatbot; gastroenterology; medical information; natural language processing (NLP); patients' questions

Journal

Diagnostics (Basel, Switzerland)
ISSN: 2075-4418
Abbreviated title: Diagnostics (Basel)
Country: Switzerland
NLM ID: 101658402

Publication information

Publication date:
02 Jun 2023
History:
received: 27 Mar 2023
revised: 28 May 2023
accepted: 01 Jun 2023
medline: 10 Jun 2023
pubmed: 10 Jun 2023
entrez: 10 Jun 2023
Status: epublish

Abstract

Patients frequently have concerns about their disease and find it challenging to obtain accurate information. OpenAI's ChatGPT chatbot (ChatGPT) is a new large language model developed to provide answers to a wide range of questions in various fields. Our aim was to evaluate the performance of ChatGPT in answering patients' questions regarding gastrointestinal health. To do so, we used a representative sample of 110 real-life questions. The answers provided by ChatGPT were rated in consensus by three experienced gastroenterologists, and their accuracy, clarity, and efficacy were assessed. ChatGPT was able to provide accurate and clear answers to patients' questions in some cases, but not in others. For questions about treatments, the average accuracy, clarity, and efficacy scores (1 to 5) were 3.9 ± 0.8, 3.9 ± 0.9, and 3.3 ± 0.9, respectively. For questions about symptoms, the average accuracy, clarity, and efficacy scores were 3.4 ± 0.8, 3.7 ± 0.7, and 3.2 ± 0.7, respectively. For questions about diagnostic tests, the average accuracy, clarity, and efficacy scores were 3.7 ± 1.7, 3.7 ± 1.8, and 3.5 ± 1.7, respectively. While ChatGPT has potential as a source of information, further development is needed. The quality of its answers is contingent upon the quality of the online information it was trained on. These findings may be useful for healthcare providers and patients alike in understanding the capabilities and limitations of ChatGPT.
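The per-category scores above are reported as mean ± standard deviation of the consensus ratings. As a minimal sketch of how such a summary is computed (using made-up ratings for illustration, since the study's raw data are not included in this record):

```python
import statistics

# Hypothetical 1-5 consensus ratings for one category (illustration only;
# these are NOT the study's actual data).
treatment_accuracy = [4, 5, 3, 4, 4, 3, 5, 4]

mean = statistics.mean(treatment_accuracy)       # arithmetic mean
sd = statistics.stdev(treatment_accuracy)        # sample standard deviation

print(f"{mean:.1f} \u00b1 {sd:.1f}")  # prints "4.0 ± 0.8"
```

Whether the paper used the sample (`stdev`) or population (`pstdev`) standard deviation is not stated in this record; the sample form is the more common choice for a rated question set treated as a sample.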


Identifiers

pubmed: 37296802
pii: diagnostics13111950
doi: 10.3390/diagnostics13111950
pmc: PMC10252924

Publication types

Journal Article

Languages

eng

References

Clin Mol Hepatol. 2023 Mar 22;:
pubmed: 36946005
Sci Rep. 2023 Mar 13;13(1):4164
pubmed: 36914821
J Med Internet Res. 2020 Oct 22;22(10):e20346
pubmed: 33090118
Hepatol Commun. 2023 Mar 24;7(4):
pubmed: 36972383
Int J Environ Res Public Health. 2023 Feb 15;20(4):
pubmed: 36834073
J Med Internet Res. 2021 May 6;23(5):e27460
pubmed: 33882012
Graefes Arch Clin Exp Ophthalmol. 2023 May 2;:
pubmed: 37129631
JMIR Med Educ. 2023 Mar 6;9:e46885
pubmed: 36863937
JNCI Cancer Spectr. 2023 Mar 1;7(2):
pubmed: 36929393
PLoS Med. 2018 Nov 6;15(11):e1002689
pubmed: 30399149
Aesthetic Plast Surg. 2023 Apr 24;:
pubmed: 37095384
Obes Surg. 2023 Jun;33(6):1790-1796
pubmed: 37106269
J Telemed Telecare. 2023 Feb 9;:1357633X231155520
pubmed: 36760131
Dig Liver Dis. 2008 Aug;40(8):659-66
pubmed: 18406672
J Med Internet Res. 2019 Apr 05;21(4):e12887
pubmed: 30950796
Heliyon. 2017 Jun 22;3(6):e00328
pubmed: 28707001
J Med Internet Res. 2019 Oct 28;21(10):e16222
pubmed: 31661083

Authors

Adi Lahat (A)

Chaim Sheba Medical Center, Department of Gastroenterology, Affiliated to Tel Aviv University, Tel Aviv 69978, Israel.

Eyal Shachar (E)

Chaim Sheba Medical Center, Department of Gastroenterology, Affiliated to Tel Aviv University, Tel Aviv 69978, Israel.

Benjamin Avidan (B)

Chaim Sheba Medical Center, Department of Gastroenterology, Affiliated to Tel Aviv University, Tel Aviv 69978, Israel.

Benjamin Glicksberg (B)

Mount Sinai Clinical Intelligence Center, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Eyal Klang (E)

The Sami Sagol AI Hub, ARC Innovation Center, Chaim Sheba Medical Center, Affiliated to Tel-Aviv University, Tel Aviv 69978, Israel.

MeSH classifications