Tracking Self-reported Symptoms and Medical Conditions on Social Media During the COVID-19 Pandemic: Infodemiological Study.
COVID-19
health conditions
infoveillance
mental health
natural language processing
pandemic
public health surveillance
social media
symptoms
Journal
JMIR Public Health and Surveillance
ISSN: 2369-2960
Abbreviated title: JMIR Public Health Surveill
Country: Canada
NLM ID: 101669345
Publication information
Publication date: 2021-09-28
History:
received: 2021-04-06
accepted: 2021-08-26
revised: 2021-07-06
pubmed: 2021-09-14
medline: 2021-10-02
entrez: 2021-09-13
Status:
epublish
Abstract
Abstract sections
BACKGROUND
Harnessing health-related data posted on social media in real time can offer insights into how the pandemic impacts the mental health and general well-being of individuals and populations over time.
OBJECTIVE
This study aimed to obtain information on symptoms and medical conditions self-reported by non-Twitter social media users during the COVID-19 pandemic, to determine how discussion of these symptoms and medical conditions changed over time, and to identify correlations between the frequency of posts mentioning the 5 most commonly mentioned symptoms and daily COVID-19 statistics (new cases, new deaths, new active cases, and new recovered cases) in the United States.
METHODS
We used natural language processing (NLP) algorithms to identify symptom- and medical condition-related topics being discussed on social media between June 14 and December 13, 2020. The sample posts were geotagged by NetBase, a third-party data provider. We calculated the positive predictive value and sensitivity to validate the classification of posts. We also assessed the frequency of health-related discussions on social media over the study period, and used Pearson correlation coefficients to identify statistically significant correlations between the frequency of the 5 most commonly mentioned symptoms and fluctuations in daily US COVID-19 statistics.
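The two validation metrics named above can be illustrated with a minimal Python sketch. It assumes a manually reviewed sample of posts has already been tallied into true positives, false positives, and false negatives. The function names and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of posts flagged by the classifier that are truly on topic."""
    flagged = true_pos + false_pos
    if flagged == 0:
        raise ValueError("no posts were flagged by the classifier")
    return true_pos / flagged


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of truly on-topic posts that the classifier managed to flag."""
    relevant = true_pos + false_neg
    if relevant == 0:
        raise ValueError("no on-topic posts in the reviewed sample")
    return true_pos / relevant


# Hypothetical review counts, for illustration only (not study data).
tp, fp, fn = 80, 20, 10
print(f"PPV: {positive_predictive_value(tp, fp):.3f}")
print(f"Sensitivity: {sensitivity(tp, fn):.3f}")
```

Positive predictive value answers "of the posts the classifier flagged, how many were correct?", while sensitivity answers "of the posts that should have been flagged, how many were?". Reporting both guards against a classifier that scores well on one at the expense of the other.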
RESULTS
Within a total of 9,807,813 posts (nearly 70% were sourced from the United States), we identified discussion of 120 symptom-related topics and 1542 medical condition-related topics. Our classification of the health-related posts had a positive predictive value of over 80% and an average sensitivity of 92%. The 5 most commonly mentioned symptoms on social media during the study period were anxiety (in 201,303 posts or 12.2% of the total posts mentioning symptoms), generalized pain (189,673, 11.5%), weight loss (95,793, 5.8%), fatigue (91,252, 5.5%), and coughing (86,235, 5.2%). The 5 most discussed medical conditions were COVID-19 (in 5,420,276 posts or 66.4% of the total posts mentioning medical conditions), unspecified infectious disease (469,356, 5.8%), influenza (270,166, 3.3%), unspecified disorders of the central nervous system (253,407, 3.1%), and depression (151,752, 1.9%). Changes in the frequency of posts mentioning anxiety, generalized pain, and weight loss were significantly and negatively correlated with daily new COVID-19 cases in the United States (r=-0.49, r=-0.46, and r=-0.39, respectively; P<.05). The frequency of posts mentioning anxiety, generalized pain, weight loss, and fatigue, as well as the changes in fatigue-related posts, correlated positively and significantly with daily changes in both new deaths and new active cases in the United States (r range 0.39-0.48; P<.05).
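For readers unfamiliar with the statistic behind the r values reported above, the following Python sketch computes a Pearson correlation coefficient and the t statistic commonly used to test it against zero. It is a generic illustration, not the authors' analysis code. The two short series are hypothetical, and the time granularity the study used is not stated in this record.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two series of equal length with at least 3 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        raise ValueError("t statistic is undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical series, for illustration only (not study data).
symptom_posts = [120.0, 135.0, 128.0, 150.0, 160.0, 155.0]
new_deaths = [900.0, 950.0, 940.0, 1100.0, 1200.0, 1150.0]
r = pearson_r(symptom_posts, new_deaths)
print(f"r = {r:.2f}, t = {t_statistic(r, len(symptom_posts)):.2f}")
```

The P value is obtained by comparing the t statistic with a Student t distribution on n - 2 degrees of freedom. That step is omitted here because the Python standard library does not provide that distribution.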
CONCLUSIONS
COVID-19 and symptoms of anxiety were the 2 most commonly discussed health-related topics on social media from June 14 to December 13, 2020. Real-time monitoring of social media posts on symptoms and medical conditions may help assess the population's mental health status and enhance public health surveillance for infectious disease.
Identifiers
pubmed: 34517338
pii: v7i9e29413
doi: 10.2196/29413
pmc: PMC8480398
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e29413
Grants
Agency: NHLBI NIH HHS
ID: K12 HL138037
Country: United States
Agency: NCATS NIH HHS
ID: UL1 TR001863
Country: United States
Copyright information
©Qinglan Ding, Daisy Massey, Chenxi Huang, Connor B Grady, Yuan Lu, Alina Cohen, Pini Matzner, Shiwani Mahajan, César Caraballo, Navin Kumar, Yuchen Xue, Rachel Dreyer, Brita Roy, Harlan M Krumholz. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 28.09.2021.