Leveraging Large Language Models in the delivery of post-operative dental care: a comparison between an embedded GPT model and ChatGPT.

Journal

BDJ Open

ISSN: 2056-807X

Abbreviated title: BDJ Open

Country: England

NLM ID: 101709456

Publication information

Publication date:
12 Jun 2024

History:

received: 25 Mar 2024

revised: 01 May 2024

accepted: 07 May 2024

entrez: 12 Jun 2024

pubmed: 13 Jun 2024

medline: 13 Jun 2024

Status: epublish

Abstract

This study underscores the transformative role of Artificial Intelligence (AI) in healthcare, particularly the promising applications of Large Language Models (LLMs) in the delivery of post-operative dental care. The aim was to evaluate the performance of an embedded GPT model and compare it with ChatGPT-3.5 turbo. The assessment focused on response accuracy, clarity, relevance, and up-to-date knowledge in addressing patient concerns and facilitating informed decision-making. An embedded GPT model, employing GPT-3.5-16k, was built via GPT-trainer to answer post-operative questions in four dental specialties: Operative Dentistry & Endodontics, Periodontics, Oral & Maxillofacial Surgery, and Prosthodontics. The generated responses were validated by thirty-six dental experts, nine from each specialty, using a Likert scale, providing insight into the embedded GPT model's performance relative to ChatGPT-3.5 turbo. For content validation, a quantitative Content Validity Index (CVI) was used, calculated at both the item level (I-CVI) and the scale level (S-CVI/Ave). To adjust I-CVI for chance agreement, a modified kappa statistic (K*) was computed. The overall content validity of responses generated by the embedded GPT model and ChatGPT was 65.62% and 61.87%, respectively. The embedded GPT model outperformed ChatGPT, with an accuracy of 62.5% and clarity of 72.5%, against an accuracy of 52.5% and clarity of 67.5% for ChatGPT. Both models performed equally well in terms of relevance and up-to-date knowledge. In conclusion, the embedded GPT model showed better results than ChatGPT in providing post-operative dental care, emphasizing the benefits of embedding and prompt engineering and paving the way for future advancements in healthcare applications.
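The content-validity measures named in the abstract can be illustrated with a minimal sketch. It uses the commonly cited definitions: I-CVI as the share of experts who rate an item as acceptable, S-CVI/Ave as the mean of the I-CVIs, and K* as I-CVI corrected by the binomial probability of chance agreement. The abstract does not state the number of Likert points or the agreement threshold, so the 4-point scale, the cut-off of 3, the function names, and the ratings below are all assumptions made for illustration, not the authors' data or procedure.

```python
from math import comb
from statistics import mean


def item_cvi(ratings, relevant_min=3):
    """I-CVI: fraction of experts whose rating meets the agreement cut-off.

    relevant_min=3 assumes a 4-point scale where 3 and 4 count as agreement.
    """
    agree = sum(1 for r in ratings if r >= relevant_min)
    return agree / len(ratings)


def modified_kappa(ratings, relevant_min=3):
    """K*: I-CVI adjusted for the probability of chance agreement."""
    n = len(ratings)
    agree = sum(1 for r in ratings if r >= relevant_min)
    i_cvi = agree / n
    # Chance that exactly `agree` of n raters agree if each is a coin flip.
    p_chance = comb(n, agree) * 0.5 ** n
    return (i_cvi - p_chance) / (1 - p_chance)


def scale_cvi_ave(items, relevant_min=3):
    """S-CVI/Ave: mean of the item-level I-CVIs."""
    return mean(item_cvi(r, relevant_min) for r in items)


# Hypothetical ratings: three responses, each scored by nine experts.
panel = [
    [4, 4, 3, 4, 3, 4, 4, 3, 4],
    [4, 3, 2, 4, 3, 4, 1, 3, 4],
    [2, 3, 1, 4, 2, 3, 2, 1, 3],
]

for i, ratings in enumerate(panel, start=1):
    print(f"item {i}: I-CVI={item_cvi(ratings):.3f}, K*={modified_kappa(ratings):.3f}")
print(f"S-CVI/Ave={scale_cvi_ave(panel):.3f}")
```

With nine raters per specialty, as in the study, the chance-agreement term is small when agreement is near unanimous and largest when the panel is split close to evenly, which is where K* departs most from the raw I-CVI.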

Identifiers

DOI: 10.1038/s41405-024-00226-3

PMID: 38866751

PII: 10.1038/s41405-024-00226-3

Publication types

Journal Article

Languages

eng

Authors

Itrat Batool (I)

Nighat Naved (N)

Syed Murtaza Raza Kazmi (SMR)

Fahad Umer (F)

MeSH classifications