ChatGPT vs. neurologists: a cross-sectional study investigating preference, satisfaction ratings and perceived empathy in responses among people living with multiple sclerosis.

Keywords: Artificial intelligence; Large language model; Machine learning; Multiple sclerosis

Journal

Journal of Neurology

ISSN: 1432-1459

Abbreviated title: J Neurol

Country: Germany

NLM ID: 0423161

Publication information

Publication date:
03 Apr 2024

History:

received: 15 Jan 2024

revised: 11 Mar 2024

accepted: 12 Mar 2024

medline: 03 Apr 2024

pubmed: 03 Apr 2024

entrez: 03 Apr 2024

Status: ahead of print

Abstract

ChatGPT is natural language processing software that replies to users' queries. We conducted a cross-sectional study to assess the preferences, satisfaction, and perceived empathy of people living with multiple sclerosis (PwMS) toward two alternate responses to four frequently asked questions, one authored by a group of neurologists, the other by ChatGPT. An online form was sent through digital communication platforms. PwMS were blinded to the author of each response and were asked to express their preference between the two alternate responses to each of the four questions. Overall satisfaction was assessed using a Likert scale (1-5); the Consultation and Relational Empathy scale was employed to assess perceived empathy. We included 1133 PwMS (age, 45.26 ± 11.50 years; females, 68.49%). ChatGPT's responses showed significantly higher empathy scores than neurologists' responses (Coeff = 1.38; 95% CI = 0.65, 2.11; p < 0.01). No association was found between ChatGPT's responses and mean satisfaction (Coeff = 0.03; 95% CI = -0.01, 0.07; p = 0.157). College graduates, compared with respondents with a high school education, were significantly less likely to prefer ChatGPT's responses (IRR = 0.87; 95% CI = 0.79, 0.95; p < 0.01). ChatGPT-authored responses were perceived as more empathetic than those of neurologists. Although AI holds potential, physicians should prepare to interact with increasingly digitized patients and guide them on responsible AI use. Future development should consider tailoring AI responses to individual characteristics. Within the progressive digitalization of the population, ChatGPT could emerge as a helpful support in healthcare management rather than an alternative.

Identifiers

DOI: 10.1007/s00415-024-12328-x PMID: 38568227

pii: 10.1007/s00415-024-12328-x

Publication types

Journal Article

Languages

eng
