Deep learning models for hepatitis E incidence prediction leveraging Baidu index.
Baidu index
Hepatitis E
KAN
LSTM
Prediction
Journal
BMC public health
ISSN: 1471-2458
Abbreviated title: BMC Public Health
Country: England
NLM ID: 100968562
Publication information
Publication date:
31 Oct 2024
History:
received: 28 May 2024
accepted: 28 Oct 2024
medline: 31 Oct 2024
pubmed: 31 Oct 2024
entrez: 31 Oct 2024
Status:
epublish
Abstract
BACKGROUND
Infectious diseases are major medical and social challenges of the 21
METHODS
We extracted data on hepatitis E incidence and case counts in Shandong Province from January 2009 to December 2022; Baidu index data were available for the same period. Using Pearson correlation analysis, we validated the relationship between the Baidu index and hepatitis E incidence. We used several long short-term memory (LSTM) architectures (LSTM, stacked LSTM, attention-based LSTM, and attention-based stacked LSTM) to forecast hepatitis E incidence both with and without the Baidu index. We also introduced Kolmogorov-Arnold networks (KAN) into the LSTM models to improve their nonlinear learning capability. Model performance was evaluated with three standard metrics: root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE).
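The correlation analysis and the three error metrics named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, the toy numbers are made up and are not data from the study, and the formulas are the textbook definitions rather than the authors' code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(actual, predicted):
    """Root mean square error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (the ratio would be undefined).
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up toy values for illustration only (NOT data from the study).
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 44.0]

print("Pearson r:", pearson_r(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
print("MAPE (%):", mape(actual, predicted))
```

MAPE is scale-free, which is presumably why the abstract reports its headline results in that metric, whereas RMSE and MAE are expressed in the units of the incidence series.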
RESULTS
Adjusting for the Baidu index altered the correlation between hepatitis E incidence and the Baidu index from -0.1654 to 0.1733. Without the Baidu index, LSTM and attention-based stacked LSTM achieved MAPE values of 17.04±0.13% and 17.19±0.57%, respectively. With the Baidu index, the same methods achieved 15.36±0.16% and 15.15±0.07%, a reduction in MAPE of about 2 percentage points. Adding KAN improved performance by a further 0.3%. More detailed results are given in the Results section of the paper.
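The reported gain of about 2% appears to be an absolute reduction in MAPE rather than a relative one. The short check below recomputes both readings from the mean MAPE values quoted above. It ignores the ± spreads, and the helper name `improvement` is hypothetical.

```python
# Mean MAPE values (percent) quoted in the Results; the ± spreads are ignored.
mape_without = {"LSTM": 17.04, "attention-based stacked LSTM": 17.19}
mape_with = {"LSTM": 15.36, "attention-based stacked LSTM": 15.15}


def improvement(before, after):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return absolute, relative


gains = {
    name: improvement(mape_without[name], mape_with[name])
    for name in mape_without
}

for name, (abs_gain, rel_gain) in gains.items():
    print(f"{name}: {abs_gain:.2f} points absolute, {rel_gain:.1f}% relative")
```

Under this reading, the absolute drops are roughly 1.7 and 2.0 percentage points, which is consistent with the "2%" stated in the abstract; the relative reductions are noticeably larger.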
CONCLUSIONS
Our experiments reveal a weak correlation and similar trends between the Baidu index and hepatitis E incidence. The Baidu index proves to be valuable for predicting hepatitis E incidence. Furthermore, stacked layers and KAN can also improve the representational ability of LSTM models.
Identifiers
pubmed: 39478514
doi: 10.1186/s12889-024-20532-7
pii: 10.1186/s12889-024-20532-7
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
3014
Grants
Agency: Shandong Provincial Natural Science Foundation
ID: ZR2023MF110
Agency: Taishan Scholar Program of Shandong Province
ID: ts201511105
Agency: ZhiFei Disease Prevention and Control Technology Research Fund Project
ID: LYH2017-08
Copyright information
© 2024. The Author(s).