Utilizing a novel high-resolution malaria dataset for climate-informed predictions with a deep learning transformer model.


Journal

Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
28 Dec 2023
History:
received: 25 07 2023
accepted: 15 12 2023
medline: 29 12 2023
pubmed: 29 12 2023
entrez: 28 12 2023
Status: epublish

Abstract

Climatic factors influence malaria transmission through their effects on the Anopheles vector and the Plasmodium parasite. Modelling and understanding the complex effects of climate on malaria incidence can enable important early warning capabilities. Deep learning is proving valuable across many fields; however, its use in epidemiological forecasting is still in its infancy, and few applied deep learning studies of malaria in southern Africa leverage quality datasets. Using a novel high-resolution malaria incidence dataset containing 23 years of daily data from 1998 to 2021, a statistical model and an XGBoost machine learning model were compared with a deep learning Transformer model by assessing the accuracy of their numerical predictions. A novel loss function, used to account for the variable nature of the data, improved performance by around 20% compared with the standard mean squared error (MSE) loss. When numerical predictions were converted to alert thresholds to mimic use in a real-world setting, the Transformer's AUROC of 80% was 20-40% higher than that of the statistical and XGBoost models, and it had the highest overall accuracy, at 98%. The Transformer performed consistently, with accuracy increasing as more climate variables were used, indicating further potential for this prediction framework to predict malaria incidence at a daily level using climate data for southern Africa.
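The evaluation step described in the abstract, in which numerical predictions are converted to alerts and scored by AUROC and accuracy, can be illustrated with a minimal sketch. This is not the authors' code: the threshold value, the toy case counts, and the helper names `to_alerts`, `auroc` and `accuracy` are all assumptions made for illustration, and the abstract does not state how the alert threshold was defined.

```python
from typing import List, Sequence


def to_alerts(cases: Sequence[float], threshold: float) -> List[int]:
    """Label each day 1 (alert) if the case count exceeds the threshold, else 0."""
    return [1 if c > threshold else 0 for c in cases]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen alert day receives a
    higher score than a randomly chosen non-alert day (ties count as half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one alert and one non-alert day")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of days on which the predicted alert matches the observed alert."""
    return sum(int(a == b) for a, b in zip(labels, predicted)) / len(labels)


# Hypothetical toy data: observed and predicted daily case counts.
observed = [0, 1, 0, 5, 7, 2, 0, 9]
predicted = [0.2, 0.8, 0.1, 3.5, 6.0, 4.2, 0.3, 7.5]
ALERT_THRESHOLD = 4  # assumed value, not taken from the paper

true_alerts = to_alerts(observed, ALERT_THRESHOLD)
pred_alerts = to_alerts(predicted, ALERT_THRESHOLD)

print("AUROC:", auroc(true_alerts, predicted))
print("Accuracy:", accuracy(true_alerts, pred_alerts))
```

AUROC is computed here from the raw predicted counts, so it measures how well the predictions rank alert days above non-alert days, whereas accuracy depends on the single chosen threshold. On data where alert days are rare, accuracy can be high even when ranking is poor, which is one reason to report both measures.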

Identifiers

pubmed: 38155182
doi: 10.1038/s41598-023-50176-3
pii: 10.1038/s41598-023-50176-3

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

23091

Copyright information

© 2023. The Author(s).

Authors

Micheal T Pillay (MT)

Department of Vector Ecology and Environment, Institute of Tropical Medicine (NEKKEN), Nagasaki University, 1-12-4, Sakamoto, Nagasaki City, 852-8523, Japan. michaelteron@gmail.com.
Graduate School of Biomedical Sciences, Nagasaki University, Nagasaki City, Japan. michaelteron@gmail.com.

Noboru Minakawa (N)

Department of Vector Ecology and Environment, Institute of Tropical Medicine (NEKKEN), Nagasaki University, 1-12-4, Sakamoto, Nagasaki City, 852-8523, Japan.

Yoonhee Kim (Y)

Department of Global Environmental Health, Graduate School of Medicine, The University of Tokyo: The University of Tokyo, 7-3-1 Hongo, Bunkyo Ward, Tokyo, 113-8654, Japan.

Nyakallo Kgalane (N)

Limpopo Department of Health, Malaria Control: 18 College Street, Polokwane, 0700, South Africa.

Jayanthi V Ratnam (JV)

Application Laboratory, Japan Agency for Marine-Earth Science and Technology (JAMSTEC), 3173-25, Showa-Machi, Kanazawa-Ku, Yokohama-City, Kanagawa, 236-0001, Japan.

Swadhin K Behera (SK)

Application Laboratory, Japan Agency for Marine-Earth Science and Technology (JAMSTEC), 3173-25, Showa-Machi, Kanazawa-Ku, Yokohama-City, Kanagawa, 236-0001, Japan.

Masahiro Hashizume (M)

Graduate School of Medicine Department of Global Health Policy, The University of Tokyo: The University of Tokyo, 7-3-1 Hongo, Bunkyo Ward, Tokyo, 113-8654, Japan.

Neville Sweijd (N)

Alliance for Collaboration on Climate & Earth Systems Science (ACCESS), CSIR, Lower Hope Road, Rosebank, 770, Cape Town, South Africa.

MeSH classifications