A reproducible ensemble machine learning approach to forecast dengue outbreaks.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
15 Feb 2024
History:
received: 15 Sep 2023
accepted: 23 Jan 2024
medline: 16 Feb 2024
pubmed: 16 Feb 2024
entrez: 15 Feb 2024
Status: epublish

Abstract

Dengue fever, a prevalent and rapidly spreading arboviral disease, poses substantial public health and economic challenges in tropical and sub-tropical regions worldwide. Predicting infectious disease outbreaks on a countrywide scale is complex due to spatiotemporal variations in dengue incidence across administrative areas. To address this, we propose a machine learning ensemble model for forecasting the dengue incidence rate (DIR) in Brazil, with a focus on the population under 19 years old. The model integrates spatial and temporal information, providing one-month-ahead DIR estimates at the state level. Comparative analyses with a dummy model and ablation studies demonstrate the ensemble model's qualitative and quantitative efficacy across the 27 Brazilian Federal Units. Furthermore, we showcase the transferability of this approach to Peru, another Latin American country with differing epidemiological characteristics. This timely forecast system can aid local governments in implementing targeted control measures. The study advances climate services for health by identifying factors triggering dengue outbreaks in Brazil and Peru, emphasizing collaborative efforts with intergovernmental organizations and public health institutions. The innovation lies not only in the algorithms themselves but in their application to a domain marked by data scarcity and operational scalability challenges. We bridge the gap by integrating well-curated ground data with advanced analytical methods, addressing a significant deficiency in current practices. The successful transfer of the model to Peru and its consistent performance during the 2019 outbreak in Brazil showcase its scalability and practical application. While acknowledging limitations in handling extreme values, especially in regions with low DIR, our approach excels where accurate predictions are critical. 
The study not only contributes to advancing DIR forecasting but also represents a paradigm shift in integrating advanced analytics into public health operational frameworks. This work, driven by a collaborative spirit involving intergovernmental organizations and public health institutions, sets a precedent for interdisciplinary collaboration in addressing global health challenges. It not only enhances our understanding of factors triggering dengue outbreaks but also serves as a template for the effective implementation of advanced analytical methods in public health.
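
The evaluation idea in the abstract (one-month-ahead DIR forecasts from an ensemble, compared against a dummy model) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the ensemble members or define the dummy model, so the sketch uses three simple stand-in forecasters, treats "last observed month" as the dummy baseline, expresses DIR as cases per 100,000 inhabitants, and runs on synthetic numbers for a hypothetical state. It is not the authors' implementation.

```python
from statistics import mean


def incidence_rate(cases, population, per=100_000):
    """Incidence rate as cases per `per` inhabitants (assumed DIR definition)."""
    return cases * per / population


def persistence(history):
    """Assumed dummy baseline: next month equals the last observed month."""
    return history[-1]


def moving_average(history, window=3):
    """Stand-in member: mean of the last `window` months."""
    return mean(history[-window:])


def seasonal_naive(history, period=12):
    """Stand-in member: same month one year earlier, if available."""
    return history[-period] if len(history) >= period else history[-1]


def ensemble_forecast(history, members):
    """Unweighted average of the member forecasts."""
    return mean(member(history) for member in members)


def backtest(series, forecaster, start):
    """Rolling one-month-ahead forecasts for series[start:]."""
    return [forecaster(series[:t]) for t in range(start, len(series))]


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic monthly case counts for a hypothetical state (not real data).
base_year = [120, 300, 650, 900, 500, 200, 80, 40, 30, 35, 50, 90]
monthly_cases = [round(c * f) for f in (1.0, 1.3, 0.8) for c in base_year]
population = 3_000_000
series = [incidence_rate(c, population) for c in monthly_cases]

members = [persistence, moving_average, seasonal_naive]
start = 24  # hold out the final 12 months for evaluation
actual = series[start:]

ensemble_preds = backtest(series, lambda h: ensemble_forecast(h, members), start)
dummy_preds = backtest(series, persistence, start)

print("ensemble MAE:", mae(actual, ensemble_preds))
print("dummy MAE:   ", mae(actual, dummy_preds))
```

Each forecast uses only months before the target month, so the comparison reflects a genuine one-month-ahead setting. Replacing the stand-in members with trained regressors would keep the same backtest and scoring structure.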

Identifiers

pubmed: 38360915
doi: 10.1038/s41598-024-52796-9
pii: 10.1038/s41598-024-52796-9

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

3807

Copyright information

© 2024. The Author(s).

Authors

Alessandro Sebastianelli (A)

Engineering Department, University of Sannio, Benevento, Italy. alessandro.sebastianelli@esa.int.
European Space Agency, Φ-lab, Frascati, Italy. alessandro.sebastianelli@esa.int.

Dario Spiller (D)

School of Aerospace Engineering, Sapienza University of Rome, Rome, Italy.

Raquel Carmo (R)

European Space Agency, Φ-lab, Frascati, Italy.

James Wheeler (J)

European Space Agency, Φ-lab, Frascati, Italy.

Artur Nowakowski (A)

Faculty of Geodesy and Cartography, Warsaw University of Technology, Warsaw, Poland.

Ludmilla Viana Jacobson (LV)

Statistics Department, Fluminense Federal University, Niterói, Brazil.

Dohyung Kim (D)

UNICEF, New York, NY, USA.

Hanoch Barlevi (H)

UNICEF, New York, NY, USA.

Zoraya El Raiss Cordero (ZER)

UNICEF, New York, NY, USA.

Felipe J Colón-González (FJ)

Wellcome Trust, Data for Science and Health, London, UK.
Centre on Climate Change and Planetary Health and Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, UK.
Tyndall Centre for Climate Change Research, School of Environmental Sciences, University of East Anglia, Norwich, UK.

Rachel Lowe (R)

Centre on Climate Change and Planetary Health and Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, UK.
Barcelona Supercomputing Center (BSC), Barcelona, Spain.
Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain.

Silvia Liberata Ullo (SL)

Engineering Department, University of Sannio, Benevento, Italy.

Rochelle Schneider (R)

European Space Agency, Φ-lab, Frascati, Italy. rochelle.schneider@esa.int.

MeSH classifications