Large Language Model-Based Neurosurgical Evaluation Matrix: A Novel Scoring Criteria to Assess the Efficacy of ChatGPT as an Educational Tool for Neurosurgery Board Preparation.

AI evaluation matrix; Artificial intelligence; ChatGPT; Medical education technology; Neurosurgical education

Journal

World neurosurgery

ISSN: 1878-8769

Abbreviated title: World Neurosurg

Country: United States

NLM ID: 101528275

Publication information

Publication date:
14 Oct 2023

History:

received: 03 10 2023

accepted: 07 10 2023

pubmed: 16 10 2023

medline: 16 10 2023

entrez: 15 10 2023

Status: aheadofprint

Abstract

Technological advancements are reshaping medical education, with digital tools becoming essential at all levels of training. Amid this transformation, this study explores the potential of ChatGPT, an artificial intelligence model developed by OpenAI, to enhance neurosurgical board education. The focus extends beyond technology adoption to effective use, with ChatGPT's proficiency evaluated against practice questions for the Primary Neurosurgery Written Board Exam. Using the Congress of Neurological Surgeons (CNS) Self-Assessment Neurosurgery (SANS) Exam Board Review Prep questions, we conducted 3 rounds of analysis with ChatGPT. We developed a novel ChatGPT Neurosurgical Evaluation Matrix (CNEM) to assess the output quality, accuracy, concordance, and clarity of ChatGPT's answers. ChatGPT achieved spot-on accuracy for 66.7% of prompted questions, 59.4% of unprompted questions, and 63.9% of unprompted questions with a leading phrase. Stratified by topic, accuracy ranged from 50.0% (Vascular) to 78.8% (Neuropathology). Compared with SANS explanations, ChatGPT output was considered better in 19.1% of questions, equal in 51.6%, and worse in 29.3%. Concordance analysis showed that 95.5% of unprompted ChatGPT outputs and 97.4% of unprompted outputs with a leading phrase were aligned. Our study evaluated the performance of ChatGPT in neurosurgical board education by assessing its accuracy, clarity, and concordance. The findings highlight both the potential and the challenges of integrating AI technologies such as ChatGPT into medical and neurosurgical board education. Further research is needed to refine these tools and optimize their performance for enhanced medical education and patient care.

Identifiers

DOI: 10.1016/j.wneu.2023.10.043 PMID: 37839567

pubmed: 37839567

pii: S1878-8750(23)01448-1

doi: 10.1016/j.wneu.2023.10.043

Publication types

Journal Article

Languages

eng

Authors

Sneha Sai Mannam (SS)

Robert Subtirelu (R)

Daksh Chauhan (D)

Hasan S Ahmad (HS)

Irina Mihaela Matache (IM)

Kevin Bryan (K)

Siddharth V K Chitta (SVK)

Shreya C Bathula (SC)

Ryan Turlip (R)

Connor Wathen (C)

Yohannes Ghenbot (Y)

Sonia Ajmera (S)

Rachel Blue (R)

H Isaac Chen (HI)

Zarina S Ali (ZS)

Neil Malhotra (N)

Visish Srinivasan (V)

Ali K Ozturk (AK)

Jang W Yoon (JW)

MeSH classifications