Use of Large Language Models to Assess the Likelihood of Epidemics From the Content of Tweets: Infodemiology Study.

Keywords: GPT-3.5; GPT-4; Generative Pre-trained Transformers; Twitter; X formerly known as Twitter; conjunctivitis; epidemic detection; generative large language model; infectious eye disease; microblog; social media

Journal

Journal of Medical Internet Research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882

Publication information

Publication date:
01 Mar 2024
History:
received: 19 May 2023
accepted: 19 Jan 2024
revised: 20 Dec 2023
medline: 04 Mar 2024
pubmed: 01 Mar 2024
entrez: 01 Mar 2024
Status: epublish

Abstract

BACKGROUND
Previous work suggests that Google searches could be useful in identifying conjunctivitis epidemics. Content-based assessment of social media content may provide additional value in serving as early indicators of conjunctivitis and other systemic infectious diseases.
OBJECTIVE
We investigated whether large language models, specifically GPT-3.5 and GPT-4 (OpenAI), can provide probabilistic assessments of whether social media posts about conjunctivitis could indicate a regional outbreak.
METHODS
A total of 12,194 conjunctivitis-related tweets were obtained using a targeted Boolean search in multiple languages from India, Guam (United States), Martinique (France), the Philippines, American Samoa (United States), Fiji, Costa Rica, Haiti, and the Bahamas, covering the time frame from January 1, 2012, to March 13, 2023. By providing these tweets via prompts to GPT-3.5 and GPT-4, we obtained probabilistic assessments that were validated by 2 human raters. We then calculated Pearson correlations of these time series with tweet volume and the occurrence of known outbreaks in these 9 locations, with time series bootstrap used to compute CIs.
RESULTS
Probabilistic assessments derived from GPT-3.5 showed correlations of 0.60 (95% CI 0.47-0.70) and 0.53 (95% CI 0.40-0.65) with the 2 human raters, with higher results for GPT-4. The weekly averages of GPT-3.5 probabilities showed substantial correlations with weekly tweet volume for 44% (4/9) of the countries, with correlations ranging from 0.10 (95% CI 0.0-0.29) to 0.53 (95% CI 0.39-0.89), with larger correlations for GPT-4. More modest correlations were found for correlation with known epidemics, with substantial correlation only in American Samoa (0.40, 95% CI 0.16-0.81).
CONCLUSIONS
These findings suggest that GPT prompting can efficiently assess the content of social media posts and indicate possible disease outbreaks to a degree of accuracy comparable to that of humans. Furthermore, we found that automated content analysis of tweets is related to tweet volume for conjunctivitis-related posts in some locations and to the occurrence of actual epidemics. Future work may improve the sensitivity and specificity of these methods for disease outbreak detection.
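
The correlation step described in the abstract (weekly averages of model-assigned probabilities, Pearson correlation with weekly tweet volume, and a time series bootstrap for CIs) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not say which time series bootstrap was used, so a moving block bootstrap with a percentile interval is assumed here. The function names, the block length of 4 weeks, the number of resamples, and the synthetic data are all hypothetical. Weeks with no tweets are simply absent from the series, which is a simplification.

```python
import math
import random
from collections import defaultdict
from datetime import date


def weekly_series(records):
    """Group (date, probability) pairs by ISO week.

    Returns the sorted week keys, the mean probability per week,
    and the number of tweets per week (the weekly tweet volume).
    """
    buckets = defaultdict(list)
    for day, prob in records:
        iso = day.isocalendar()
        buckets[(iso[0], iso[1])].append(prob)
    weeks = sorted(buckets)
    means = [sum(buckets[w]) / len(buckets[w]) for w in weeks]
    counts = [len(buckets[w]) for w in weeks]
    return weeks, means, counts


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def block_bootstrap_ci(x, y, block_len=4, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for Pearson r from a moving block bootstrap.

    Assumed variant: blocks of consecutive paired weeks are resampled with
    replacement so that short-range autocorrelation is kept within blocks.
    """
    n = len(x)
    if n < block_len:
        raise ValueError("series is shorter than one block")
    rng = random.Random(seed)
    starts = range(n - block_len + 1)
    n_blocks = math.ceil(n / block_len)
    stats = []
    for _ in range(n_boot):
        idx = []
        for _ in range(n_blocks):
            s = rng.choice(starts)
            idx.extend(range(s, s + block_len))
        idx = idx[:n]  # trim the resample to the original length
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if not math.isnan(r):
            stats.append(r)
    stats.sort()
    lo = stats[int((alpha / 2) * (len(stats) - 1))]
    hi = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lo, hi


if __name__ == "__main__":
    # Synthetic stand-in data, NOT the study's tweets or GPT outputs.
    rng = random.Random(1)
    first_monday = date(2022, 1, 3).toordinal()
    records = []
    for week in range(30):
        n_tweets = 1 + week % 5
        for _ in range(n_tweets):
            day = date.fromordinal(first_monday + 7 * week)
            prob = min(1.0, 0.1 * n_tweets + 0.2 * rng.random())
            records.append((day, prob))
    _, means, counts = weekly_series(records)
    print(pearson(means, counts), block_bootstrap_ci(means, counts))
```

Resampling blocks rather than individual weeks is the design choice that matters here: an ordinary bootstrap treats weeks as independent and tends to produce intervals that are too narrow for autocorrelated series.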

Identifiers

pubmed: 38427404
pii: v26i1e49139
doi: 10.2196/49139
pmc: PMC10943433

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

e49139

Grants

Agency: NEI NIH HHS
ID: P30 EY002162
Country: United States
Agency: NEI NIH HHS
ID: R01 EY024608
Country: United States

Copyright information

©Michael S Deiner, Natalie A Deiner, Vagelis Hristidis, Stephen D McLeod, Thuy Doan, Thomas M Lietman, Travis C Porco. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 01.03.2024.

Authors

Michael S Deiner (MS)

Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.

Natalie A Deiner (NA)

College of Letters and Science, University of California, Santa Barbara, Santa Barbara, CA, United States.

Vagelis Hristidis (V)

Department of Computer Science and Engineering, University of California, Riverside, Riverside, CA, United States.

Stephen D McLeod (SD)

Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
American Academy of Ophthalmology, San Francisco, CA, United States.

Thuy Doan (T)

Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States.

Thomas M Lietman (TM)

Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States.

Travis C Porco (TC)

Department of Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
Francis I. Proctor Foundation for Research in Ophthalmology, University of California, San Francisco, San Francisco, CA, United States.
Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States.
