Generalizability of machine learning in predicting antimicrobial resistance in E. coli: a multi-country case study in Africa.

Keywords: E. coli; Africa; Antimicrobial resistance; Machine learning; Whole-genome sequencing

Journal

BMC Genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
18 Mar 2024
History:
received: 28 Sep 2023
accepted: 11 Mar 2024
medline: 19 Mar 2024
pubmed: 19 Mar 2024
entrez: 19 Mar 2024
Status: epublish

Abstract

BACKGROUND
Antimicrobial resistance (AMR) remains a significant global health threat, particularly impacting low- and middle-income countries (LMICs). These regions often grapple with limited healthcare resources and limited access to advanced diagnostic tools. Consequently, there is a pressing need for innovative approaches that can enhance AMR surveillance and management. Machine learning (ML), though underutilized in these settings, presents a promising avenue. This study leverages ML models trained on whole-genome sequencing data from England, where such data is more readily available, to predict AMR in E. coli, targeting key antibiotics such as ciprofloxacin, ampicillin, and cefotaxime. A crucial part of our work involved the validation of these models using an independent dataset from Africa, specifically from Uganda, Nigeria, and Tanzania, to ascertain their applicability and effectiveness in LMICs.
RESULTS
Model performance varied across antibiotics. The Support Vector Machine excelled in predicting ciprofloxacin resistance (87% accuracy, F1 Score: 0.57), Light Gradient Boosting Machine for cefotaxime (92% accuracy, F1 Score: 0.42), and Gradient Boosting for ampicillin (58% accuracy, F1 Score: 0.66). In validation with data from Africa, Logistic Regression showed high accuracy for ampicillin (94%, F1 Score: 0.97), while Random Forest and Light Gradient Boosting Machine were effective for ciprofloxacin (50% accuracy, F1 Score: 0.56) and cefotaxime (45% accuracy, F1 Score: 0.54), respectively. Key mutations associated with AMR were identified for these antibiotics.
CONCLUSIONS
As the threat of AMR continues to rise, the successful application of these models, particularly on genomic datasets from LMICs, signals a promising avenue for improving AMR prediction to support large AMR surveillance programs. This work thus not only expands our current understanding of the genetic underpinnings of AMR but also provides a robust methodological framework that can guide future research and applications in the fight against AMR.
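
The abstract reports accuracy alongside F1 score, and the two can diverge sharply when resistant isolates are either rare or very common in a dataset. The following is a minimal Python sketch, not the authors' code, of how both metrics are derived from a confusion matrix. It assumes "resistant" is the positive class. The counts are hypothetical, chosen only so that the resulting values land near two of the accuracy/F1 pairs quoted above; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts, with 'resistant' as the positive class."""
    tp: int  # resistant isolates predicted resistant
    fp: int  # susceptible isolates predicted resistant
    fn: int  # resistant isolates predicted susceptible
    tn: int  # susceptible isolates predicted susceptible


def accuracy(c: Confusion) -> float:
    """Fraction of all isolates that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def f1_score(c: Confusion) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# Hypothetical counts (NOT from the study), 100 isolates each.
# Few resistant isolates: true negatives dominate accuracy,
# while F1 ignores them and stays low.
rare = Confusion(tp=3, fp=4, fn=4, tn=89)
# Mostly resistant isolates: both metrics are high.
common = Confusion(tp=91, fp=5, fn=1, tn=3)

for name, c in (("rare resistance", rare), ("common resistance", common)):
    print(f"{name}: accuracy={accuracy(c):.2f}, F1={f1_score(c):.2f}")
```

Because F1 does not count true negatives, comparing it with accuracy gives a quick indication of how much a score depends on the class balance of the dataset being evaluated.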

Identifiers

pubmed: 38500034
doi: 10.1186/s12864-024-10214-4
pii: 10.1186/s12864-024-10214-4

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

287

Copyright information

© 2024. The Author(s).

Authors

Mike Nsubuga (M)

Department of Immunology and Molecular Biology, School of Biomedical Sciences, College of Health Sciences, Makerere University, P.O Box 7072, Kampala, Uganda.
The African Center of Excellence in Bioinformatics and Data-Intensive Sciences, Infectious Diseases Institute, College of Health Sciences, Makerere University, P.O Box 22418, Kampala, Uganda.
Faculty of Health Sciences, University of Bristol, Bristol, BS40 5DU, UK.
Jean Golding Institute, University of Bristol, Bristol, BS8 1UH, UK.

Ronald Galiwango (R)

Department of Immunology and Molecular Biology, School of Biomedical Sciences, College of Health Sciences, Makerere University, P.O Box 7072, Kampala, Uganda.
The African Center of Excellence in Bioinformatics and Data-Intensive Sciences, Infectious Diseases Institute, College of Health Sciences, Makerere University, P.O Box 22418, Kampala, Uganda.

Daudi Jjingo (D)

Department of Computer Science, College of Computing and Information Sciences, Makerere University, P.O Box 7062, Kampala, Uganda.
The African Center of Excellence in Bioinformatics and Data-Intensive Sciences, Infectious Diseases Institute, College of Health Sciences, Makerere University, P.O Box 22418, Kampala, Uganda.

Gerald Mboowa (G)

Department of Immunology and Molecular Biology, School of Biomedical Sciences, College of Health Sciences, Makerere University, P.O Box 7072, Kampala, Uganda. gmboowa@gmail.com.
The African Center of Excellence in Bioinformatics and Data-Intensive Sciences, Infectious Diseases Institute, College of Health Sciences, Makerere University, P.O Box 22418, Kampala, Uganda. gmboowa@gmail.com.
Africa Centres for Disease Control and Prevention, African Union Commission, P.O Box 3243, Roosevelt Street, Addis Ababa, W21 K19, Ethiopia. gmboowa@gmail.com.

MeSH classifications