ChatGPT vs. neurologists: a cross-sectional study investigating preference, satisfaction ratings and perceived empathy in responses among people living with multiple sclerosis.
Artificial intelligence
Large language model
Machine learning
Multiple sclerosis
Journal
Journal of neurology
ISSN: 1432-1459
Abbreviated title: J Neurol
Country: Germany
NLM ID: 0423161
Publication information
Publication date: 03 Apr 2024
History:
received: 15 Jan 2024
accepted: 12 Mar 2024
revised: 11 Mar 2024
medline: 3 Apr 2024
pubmed: 3 Apr 2024
entrez: 3 Apr 2024
Status: aheadofprint
Abstract
ChatGPT is a natural language processing tool that replies to users' queries. We conducted a cross-sectional study to assess the preferences, satisfaction, and perceived empathy of people living with multiple sclerosis (PwMS) toward two alternate responses to each of four frequently asked questions, one authored by a group of neurologists and the other by ChatGPT. An online form was sent through digital communication platforms. PwMS were blinded to the author of each response and were asked to express their preference between the two alternate responses to each of the four questions. Overall satisfaction was assessed using a Likert scale (1-5); the Consultation and Relational Empathy scale was used to assess perceived empathy. We included 1133 PwMS (age, 45.26 ± 11.50 years; females, 68.49%). ChatGPT's responses showed significantly higher empathy scores than neurologists' responses (Coeff = 1.38; 95% CI = 0.65, 2.11; p < 0.01). No association was found between ChatGPT's responses and mean satisfaction (Coeff = 0.03; 95% CI = -0.01, 0.07; p = 0.157). College graduates, compared with respondents with a high school education, were significantly less likely to prefer ChatGPT's responses (IRR = 0.87; 95% CI = 0.79, 0.95; p < 0.01). ChatGPT-authored responses were perceived as more empathetic than neurologists' responses. Although AI holds potential, physicians should prepare to interact with increasingly digitized patients and guide them on responsible AI use. Future development should consider tailoring AI responses to individual characteristics. Within the progressive digitalization of the population, ChatGPT could emerge as a helpful support in healthcare management rather than an alternative.
Identifiers
pubmed: 38568227
doi: 10.1007/s00415-024-12328-x
pii: 10.1007/s00415-024-12328-x
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Copyright information
© 2024. The Author(s).