Evaluation of AI-generated responses by different artificial intelligence chatbots to the clinical decision-making case-based questions in oral and maxillofacial surgery.

Journal

Oral surgery, oral medicine, oral pathology and oral radiology

ISSN: 2212-4411

Abbreviated title: Oral Surg Oral Med Oral Pathol Oral Radiol

Country: United States

NLM ID: 101576782

Publication information

Publication date:
06 Mar 2024

History:

received: 22 Jan 2024

revised: 24 Feb 2024

accepted: 26 Feb 2024

medline: 04 Apr 2024

pubmed: 04 Apr 2024

entrez: 03 Apr 2024

Status: ahead of print

Abstract

This study aims to evaluate the correctness of answers generated by the Google Bard, GPT-3.5, GPT-4, Claude-Instant, and Bing chatbots to clinical decision-making questions in oral and maxillofacial surgery (OMFS). A group of 3 board-certified oral and maxillofacial surgeons designed a questionnaire of 50 case-based questions in multiple-choice and open-ended formats. The chatbots' answers to the multiple-choice questions were compared against the option chosen by 3 referees. Their answers to the open-ended questions were evaluated using the modified global quality scale. A P-value under .05 was considered significant. Bard, GPT-3.5, GPT-4, Claude-Instant, and Bing answered 34%, 36%, 38%, 38%, and 26% of the questions correctly, respectively. On the open-ended questions, GPT-4 had the most answers graded "4" or "5," and Bing had the most answers graded "1" or "2." There were no statistically significant differences among the 5 chatbots in responding to the open-ended (P = .275) or multiple-choice (P = .699) questions. Given the major inaccuracies in the chatbots' responses, and despite their relatively good performance on open-ended questions, this technology cannot yet be trusted as a consultant for clinicians in decision-making situations.

Identifiers

DOI: 10.1016/j.oooo.2024.02.018

PMID: 38570273

PII: S2212-4403(24)00095-6

Publication types

Journal Article

Languages

eng

Conflict of interest statement

Declaration of interest: The authors have no conflict to disclose regarding this manuscript.

Authors

Ali Azadi (A)

Fatemeh Gorjinejad (F)

Hossein Mohammad-Rahimi (H)

Reza Tabrizi (R)

Mostafa Alam (M)

Mohsen Golkar (M)

MeSH classifications