Enhancing the Predictive Power of Google Trends Data Through Network Analysis: Infodemiology Study of COVID-19.

Keywords: COVID-19; health care analytics; infodemiology; infoveillance; internet search volumes; mobile phone; network analysis; network connectedness; pandemic risk

Journal

JMIR public health and surveillance
ISSN: 2369-2960
Abbreviated title: JMIR Public Health Surveill
Country: Canada
NLM ID: 101669345

Publication information

Publication date:
2023-09-07
History:
received: 2022-09-05
accepted: 2023-06-29
revised: 2023-06-01
medline: 2023-09-08
pubmed: 2023-09-07
entrez: 2023-09-07
Status: epublish

Abstract

BACKGROUND
The COVID-19 outbreak has revealed a high demand for timely surveillance of pandemic developments. Google Trends (GT), which provides freely available search volume data, has been shown to be a reliable forecasting and nowcasting measure for public health issues. Previous studies have tended to use relative search volumes from GT directly to analyze associations and predict the progression of the pandemic. However, GT's normalization of search volume data and its data retrieval restrictions reduce the resolution with which the data reflect actual search behavior, thus limiting the potential for using GT data to predict disease outbreaks.
OBJECTIVE
This study aimed to introduce a merged algorithm that helps recover the resolution and accuracy of search volume data extracted from GT over long observation periods. It also aimed to demonstrate the extended application of merged search volumes (MSVs) in combination with network analysis by tracking COVID-19 pandemic risk.
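To illustrate the rescaling problem behind this objective, the sketch below chains overlapping daily windows onto a common scale by matching them on their shared days. This is a generic overlap-rescaling idea and an assumption made here for illustration only; it is not the study's merged algorithm, whose details are not given in this abstract. The function name `merge_windows` and the toy numbers are hypothetical.

```python
def merge_windows(windows):
    """Chain overlapping daily windows onto the scale of the first window.

    windows: list of dicts mapping a day index to a relative volume that was
    normalized separately inside each window (hypothetical input layout).
    Consecutive windows must share at least one day with a nonzero value.
    """
    merged = dict(windows[0])
    for window in windows[1:]:
        overlap = [d for d in window if d in merged and window[d] > 0]
        if not overlap:
            raise ValueError("consecutive windows need a nonzero overlap")
        # Ratio between the already-merged scale and this window's own scale.
        scale = sum(merged[d] for d in overlap) / sum(window[d] for d in overlap)
        for day, value in window.items():
            # Keep values already merged; rescale only the new days.
            merged.setdefault(day, value * scale)
    return merged


# Toy data: underlying volumes 10, 20, 40, 20, 80, 40 (days 0-5), retrieved as
# two windows that were each normalized so that their own maximum equals 100.
window_a = {0: 25.0, 1: 50.0, 2: 100.0, 3: 50.0}
window_b = {2: 50.0, 3: 25.0, 4: 100.0, 5: 50.0}
merged = merge_windows([window_a, window_b])
```

In this toy case the merged values are proportional to the underlying volumes across all six days, which neither window achieves on its own.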
METHODS
We collected relative search volumes from GT and transformed them into MSVs using our proposed merged algorithm. The MSVs of the selected coronavirus-related keywords were compiled using the rolling window method. The correlations between the MSVs were calculated to form a dynamic network. The network statistics, including network density and the global clustering coefficients between the MSVs, were also calculated.
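The rolling-window network construction described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions: Pearson correlation, an edge whenever the correlation exceeds a fixed threshold, and the transitivity definition of the global clustering coefficient. The abstract does not specify the correlation measure, the threshold, or the window width, so those choices, the function names, and the toy series are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def window_edges(series, start, width, threshold):
    """Link two keywords if their correlation inside the window exceeds threshold."""
    edges = set()
    for a, b in combinations(sorted(series), 2):
        r = pearson(series[a][start:start + width], series[b][start:start + width])
        if r > threshold:
            edges.add((a, b))
    return edges


def density(nodes, edges):
    """Share of possible keyword pairs that are linked."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0


def global_clustering(nodes, edges):
    """Transitivity: closed connected triples divided by all connected triples."""
    nbrs = {v: set() for v in nodes}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    triples = closed = 0
    for v in nodes:
        for a, b in combinations(sorted(nbrs[v]), 2):
            triples += 1
            if b in nbrs[a]:
                closed += 1
    return closed / triples if triples else 0.0


def dynamic_network_stats(series, width, threshold):
    """(density, clustering) for each rolling window over the series."""
    length = len(next(iter(series.values())))
    nodes = sorted(series)
    stats = []
    for start in range(length - width + 1):
        edges = window_edges(series, start, width, threshold)
        stats.append((density(nodes, edges), global_clustering(nodes, edges)))
    return stats


# Hypothetical merged search volumes for four keywords over six days.
msv = {
    "covid": [1, 2, 3, 4, 5, 6],
    "mask": [2, 4, 6, 8, 10, 12],
    "vaccine": [1, 3, 2, 5, 4, 6],
    "lockdown": [6, 5, 4, 3, 2, 1],
}
full_window = dynamic_network_stats(msv, width=6, threshold=0.5)
rolling = dynamic_network_stats(msv, width=4, threshold=0.5)
```

With a single six-day window, three of the four keywords move together and form a triangle, so half of the possible links are present and every connected triple is closed.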
RESULTS
Our findings suggested that, although GT restricts data retrieval to weekly data points over long periods, the proposed approach could recover daily search volumes over the same investigation period to facilitate subsequent analyses. In addition, the dynamic time warping diagrams showed that the dynamic networks were capable of predicting COVID-19 pandemic trends in terms of the number of confirmed COVID-19 cases and severity risk scores.
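For readers unfamiliar with dynamic time warping (DTW), the sketch below gives the textbook dynamic-programming distance between two series, which tolerates time shifts between, for example, a network statistic and a case count. The abstract does not state which DTW variant or software was used, so this is an illustrative assumption, and the function name and toy series are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance with absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A series and a copy delayed by one step align perfectly under DTW.
signal = [0, 1, 2, 1, 0]
lagged = [0, 0, 1, 2, 1, 0]
lag_distance = dtw_distance(signal, lagged)
shift_distance = dtw_distance([1, 2, 3], [2, 3, 4])
```

A small distance between a lagged pair is what makes DTW suitable for comparing a leading indicator with the trend it is meant to anticipate.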
CONCLUSIONS
The innovative method for handling GT search data, together with the application of MSVs and network analysis, broadens the potential of GT data and is useful for predicting pandemic risk. Further investigation of the GT dynamic network can focus on noncommunicable diseases, health-related behaviors, and misinformation on the internet.

Identifiers

pubmed: 37676701
pii: v9i1e42446
doi: 10.2196/42446
pmc: PMC10488898

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e42446

Copyright information

©Amanda MY Chu, Andy C Y Chong, Nick H T Lai, Agnes Tiwari, Mike K P So. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 07.09.2023.

Authors

Amanda MY Chu (AM)

Department of Social Sciences and Policy Studies, The Education University of Hong Kong, Hong Kong, Hong Kong.

Andy C Y Chong (ACY)

School of Nursing, Tung Wah College, Hong Kong, Hong Kong.

Nick H T Lai (NHT)

Department of Information Systems, Business Statistics and Operations Management, The Hong Kong University of Science and Technology, Hong Kong, Hong Kong.

Agnes Tiwari (A)

School of Nursing, Hong Kong Sanatorium & Hospital, Hong Kong, Hong Kong.
Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong.

Mike K P So (MKP)

Department of Information Systems, Business Statistics and Operations Management, The Hong Kong University of Science and Technology, Hong Kong, Hong Kong.
