Machine learning and explainable artificial intelligence for the prevention of waterborne cryptosporidiosis and giardiosis.
Cryptosporidium
Explainable artificial intelligence
Giardia
Machine learning
Monitoring system
Waterborne outbreak
Journal
Water research
ISSN: 1879-2448
Abbreviated title: Water Res
Country: England
NLM ID: 0105072
Publication information
Publication date:
22 Jul 2024
History:
received: 22 Mar 2024
revised: 21 Jun 2024
accepted: 15 Jul 2024
medline: 23 Jul 2024
pubmed: 23 Jul 2024
entrez: 23 Jul 2024
Status:
aheadofprint
Abstract
Cryptosporidium and Giardia are important parasitic protozoa due to their zoonotic potential and impact on human health, and have often caused waterborne outbreaks of disease. Detection of (oo)cysts in water matrices is challenging and extremely costly, so only a few countries have legislated for regular monitoring of drinking water for their presence. Several attempts have been made to investigate the association between the presence of such (oo)cysts in water and other biotic or abiotic factors, with inconclusive findings. The aim of this study was therefore to develop a holistic approach leveraging Machine Learning (ML) and eXplainable Artificial Intelligence (XAI) techniques to provide empirical evidence related to the presence and prediction of Cryptosporidium oocysts and Giardia cysts in water samples. To meet this objective, we first modelled the complex relationship between Cryptosporidium and Giardia (oo)cysts and a set of parasitological, microbiological, physicochemical and meteorological parameters via a model-agnostic meta-learner algorithm that provides flexibility regarding the selection of the ML model executing the fitting task. Based on this generic approach, a set of four well-known ML candidates were evaluated empirically in terms of their predictive capabilities. The best-performing algorithms were then examined further through XAI techniques to gain meaningful insights into the explainability and interpretability of the derived solutions. The findings reveal that the Random Forest achieves the highest prediction performance when the objective is the prediction of both contamination and contamination intensity with Cryptosporidium oocysts in a given water sample, with meteorological/physicochemical and microbiological markers being informative, respectively.
For the prediction of contamination with Giardia, the eXtreme Gradient Boosting with physicochemical parameters was the most efficient algorithm, while the Support Vector Regression that takes into consideration both microbiological and meteorological markers was more efficient for evaluating the contamination intensity with cysts. The results of the study indicate that ML and XAI approaches can be a valuable tool for unveiling the complicated correlation of the presence and contamination intensity with these zoonotic parasites, which could in turn constitute a basis for the development of monitoring platforms and early warning systems for the prevention of waterborne disease outbreaks.
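The abstract gives no implementation details, so the following is only a minimal sketch of the general workflow it describes: several interchangeable candidate models are scored on held-out samples, and the best one is then inspected with a model-agnostic explanation technique. Everything in it is an assumption made for illustration: the data are synthetic, the feature names (turbidity, rainfall, noise) are hypothetical, the two toy models stand in for the Random Forest, eXtreme Gradient Boosting and Support Vector Regression candidates named above, and permutation importance is one common XAI technique that the study may or may not have used.

```python
import random
from typing import Callable, Dict, List

Row = List[float]
Model = Callable[[Row], int]  # any fitted predictor: feature row -> 0/1 label


def accuracy(model: Model, X: List[Row], y: List[int]) -> float:
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def fit_threshold_rule(X: List[Row], y: List[int]) -> Model:
    """Toy candidate: threshold on the one feature that fits the training data best."""
    best_model, best_acc = None, -1.0
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        thr = (mean_pos + mean_neg) / 2  # midpoint between the class means
        sign = 1.0 if mean_pos >= mean_neg else -1.0
        model = lambda row, j=j, thr=thr, sign=sign: int(sign * (row[j] - thr) > 0)
        acc = accuracy(model, X, y)
        if acc > best_acc:
            best_model, best_acc = model, acc
    return best_model


def fit_nearest_neighbour(X: List[Row], y: List[int]) -> Model:
    """Toy candidate: label of the closest training row (squared Euclidean distance)."""
    def predict(row: Row) -> int:
        dists = [sum((a - b) ** 2 for a, b in zip(row, ref)) for ref in X]
        return y[dists.index(min(dists))]
    return predict


def permutation_importance(model: Model, X: List[Row], y: List[int],
                           n_repeats: int, rng: random.Random) -> List[float]:
    """Mean drop in accuracy when one feature column is shuffled; works for any model."""
    base = accuracy(model, X, y)
    result = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(base - accuracy(model, X_perm, y))
        result.append(sum(drops) / n_repeats)
    return result


# Synthetic data (assumption): contamination depends strongly on the first
# feature, weakly on the second, and not at all on the third.
rng = random.Random(0)
FEATURES = ["turbidity", "rainfall", "noise"]  # hypothetical names
X = [[rng.gauss(0, 1) for _ in FEATURES] for _ in range(300)]
y = [int(row[0] + 0.5 * row[1] + rng.gauss(0, 0.3) > 0) for row in X]
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

# Model-agnostic selection: every candidate is just a fit function returning a predictor.
candidates: Dict[str, Callable[[List[Row], List[int]], Model]] = {
    "threshold_rule": fit_threshold_rule,
    "nearest_neighbour": fit_nearest_neighbour,
}
fitted = {name: fit(X_train, y_train) for name, fit in candidates.items()}
scores = {name: accuracy(model, X_test, y_test) for name, model in fitted.items()}
best_name = max(scores, key=scores.get)

# Explain the selected model on the held-out samples.
importances = permutation_importance(fitted[best_name], X_test, y_test, 5, rng)
print("hold-out accuracy:", scores)
print("selected model:", best_name)
for name, value in zip(FEATURES, importances):
    print(f"importance of {name}: {value:.3f}")
```

Because the selection loop and the importance function only ever call the fitted predictor, any other model could be dropped into `candidates` without changing the rest of the sketch, which is the sense in which the approach is model-agnostic.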
Identifiers
pubmed: 39042970
pii: S0043-1354(24)01010-8
doi: 10.1016/j.watres.2024.122110
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
122110
Copyright information
Copyright © 2024. Published by Elsevier Ltd.
Conflict of interest statement
Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.