Generalizability of machine learning in predicting antimicrobial resistance in E. coli: a multi-country case study in Africa.
E. coli
Africa
Antimicrobial resistance
Machine learning
Whole-genome sequencing
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
18 Mar 2024
History:
received: 28 Sep 2023
accepted: 11 Mar 2024
medline: 19 Mar 2024
pubmed: 19 Mar 2024
entrez: 19 Mar 2024
Status:
epublish
Abstract
BACKGROUND
Antimicrobial resistance (AMR) remains a significant global health threat, particularly in low- and middle-income countries (LMICs). These regions often grapple with limited healthcare resources and limited access to advanced diagnostic tools. Consequently, there is a pressing need for innovative approaches that can enhance AMR surveillance and management. Machine learning (ML), though underutilized in these settings, presents a promising avenue. This study leverages ML models trained on whole-genome sequencing data from England, where such data are more readily available, to predict AMR in E. coli, targeting key antibiotics such as ciprofloxacin, ampicillin, and cefotaxime. A crucial part of the work was validating these models on an independent dataset from Africa (Uganda, Nigeria, and Tanzania) to assess their applicability and effectiveness in LMICs.
RESULTS
Model performance varied across antibiotics. The Support Vector Machine excelled in predicting ciprofloxacin resistance (87% accuracy, F1 score: 0.57), the Light Gradient Boosting Machine in predicting cefotaxime resistance (92% accuracy, F1 score: 0.42), and Gradient Boosting in predicting ampicillin resistance (58% accuracy, F1 score: 0.66). In validation with the data from Africa, Logistic Regression showed high accuracy for ampicillin (94% accuracy, F1 score: 0.97), while Random Forest and the Light Gradient Boosting Machine were effective for ciprofloxacin (50% accuracy, F1 score: 0.56) and cefotaxime (45% accuracy, F1 score: 0.54), respectively. Key mutations associated with AMR were identified for these antibiotics.
CONCLUSIONS
As the threat of AMR continues to rise, the successful application of these models, particularly to genomic datasets from LMICs, signals a promising avenue for improving AMR prediction in support of large AMR surveillance programs. This work thus not only expands the current understanding of the genetic underpinnings of AMR but also provides a robust methodological framework that can guide future research and applications in the fight against AMR.
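The results above report accuracy and F1 score side by side, and the two can diverge sharply when resistant and susceptible isolates are unevenly represented. The following is a minimal sketch of how both metrics follow from confusion-matrix counts, with "resistant" treated as the positive class. The function name and all counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary classifier.

    tp/fp/fn/tn are confusion-matrix counts with "resistant" as the
    positive class. All inputs used below are invented for illustration.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision and recall fall back to 0.0 when their denominators are zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical cohort A: resistance is rare (16 of 100 isolates).
acc_a, f1_a = accuracy_and_f1(tp=8, fp=4, fn=8, tn=80)

# Hypothetical cohort B: resistance is common (94 of 100 isolates) and
# the classifier labels every isolate as resistant.
acc_b, f1_b = accuracy_and_f1(tp=94, fp=6, fn=0, tn=0)

print(f"cohort A: accuracy={acc_a:.2f}, F1={f1_a:.2f}")
print(f"cohort B: accuracy={acc_b:.2f}, F1={f1_b:.2f}")
```

In cohort A, high accuracy coexists with a modest F1 because most correct calls are on susceptible isolates. In cohort B, both metrics are high even though the classifier never identifies a susceptible isolate. This is one reason to read the two metrics together with the class balance of each dataset.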
Identifiers
pubmed: 38500034
doi: 10.1186/s12864-024-10214-4
pii: 10.1186/s12864-024-10214-4
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
287
Copyright information
© 2024. The Author(s).