Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study.

Keywords: foodborne disease; machine learning; pathogens; prediction

Journal

JMIR Medical Informatics
ISSN: 2291-9694
Abbreviated title: JMIR Med Inform
Country: Canada
NLM ID: 101645109

Publication information

Publication date:
26 Jan 2021
History:
received: 11 Oct 2020
revised: 18 Dec 2020
accepted: 28 Dec 2020
entrez: 26 Jan 2021
pubmed: 27 Jan 2021
medline: 27 Jan 2021
Status: epublish

Abstract

BACKGROUND
Foodborne diseases have a high global incidence and place a heavy burden on public health and the economy. Foodborne pathogens are the main cause of these diseases, and identifying them is important for treatment and prevention; however, diseases caused by different pathogens lack specific clinical features, and in clinical practice the pathogen is actually identified in only a small proportion of cases.
OBJECTIVE
We aimed to analyze foodborne disease case data, select appropriate features based on the analysis, and use machine learning methods to classify foodborne disease pathogens, so that the pathogen can be predicted for cases that have not been tested.
METHODS
We extracted features such as spatial location, time, and exposed food from foodborne disease case data, analyzed the relationship between these features and the pathogens, and applied several machine learning methods to classify the pathogens. We compared the results of 4 models to identify the pathogen prediction model with the highest accuracy.
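The comparison step described above can be sketched as follows. This is a minimal illustration only: the abstract names just the gradient boosting decision tree, so the other model names (`model_b`, `model_c`, `model_d`), the held-out labels, and all predictions are hypothetical placeholders, not data from the study.

```python
from typing import Dict, List, Tuple


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of cases whose predicted pathogen matches the tested one."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def best_model(
    y_true: List[str], predictions: Dict[str, List[str]]
) -> Tuple[str, float]:
    """Score every candidate model and return the most accurate one."""
    scores = {name: accuracy(y_true, pred) for name, pred in predictions.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical held-out cases with laboratory-confirmed pathogens.
S, N, E, V = "Salmonella", "Norovirus", "Escherichia coli", "Vibrio parahaemolyticus"
y_true = [S, N, E, V, S, N]

# Hypothetical predictions from four candidate models (made up for illustration).
predictions = {
    "gbdt": [S, N, E, V, N, N],
    "model_b": [S, N, S, V, N, N],
    "model_c": [S, S, S, V, S, S],
    "model_d": [N, N, E, S, S, S],
}

winner, score = best_model(y_true, predictions)
print(f"{winner}: {score:.2f}")
```

Accuracy is used here because it is the metric the abstract reports; with imbalanced pathogen classes, a per-class metric might be worth computing alongside it.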
RESULTS
The gradient boosting decision tree model achieved the highest accuracy, approaching 69%, in identifying 4 pathogens (Salmonella, norovirus, Escherichia coli, and Vibrio parahaemolyticus). Feature importance evaluation showed that time of illness, geographical longitude and latitude, and diarrhea frequency play important roles in classifying the pathogens.
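One common way to evaluate feature importance is permutation importance: shuffle one feature at a time and measure how much accuracy drops. The abstract does not say which importance measure the authors used, so the sketch below is an assumed stand-in, not their method. The toy classifier, the feature names (`month`, `latitude`, `diarrhea_per_day`), and the data are all hypothetical.

```python
import random
from typing import Callable, Dict, List

Row = Dict[str, float]


def accuracy(model: Callable[[Row], str], rows: List[Row], labels: List[str]) -> float:
    """Fraction of rows for which the model predicts the recorded label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(
    model: Callable[[Row], str],
    rows: List[Row],
    labels: List[str],
    features: List[str],
    n_repeats: int = 20,
    seed: int = 0,
) -> Dict[str, float]:
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = {}
    for feat in features:
        drops = []
        for _ in range(n_repeats):
            column = [r[feat] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only this one feature permuted.
            shuffled = [{**r, feat: v} for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances[feat] = sum(drops) / n_repeats
    return importances


def toy_model(row: Row) -> str:
    """Hypothetical rule that looks only at the month of illness."""
    return "Vibrio parahaemolyticus" if 6 <= row["month"] <= 9 else "Norovirus"


# Hypothetical cases: one per calendar month, labeled by the toy rule itself.
rows = [
    {"month": float(m), "latitude": 30.0 + m, "diarrhea_per_day": 3.0 + (m % 4)}
    for m in range(1, 13)
]
labels = [toy_model(r) for r in rows]

importances = permutation_importance(
    toy_model, rows, labels, ["month", "latitude", "diarrhea_per_day"]
)
for name, value in importances.items():
    print(f"{name}: {value:.3f}")
```

Because the toy rule ignores latitude and diarrhea frequency, shuffling them leaves accuracy unchanged, while shuffling the month lowers it. With a real fitted model, the same procedure would rank whichever features the model actually relies on.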
CONCLUSIONS
Data analysis can reveal the distribution of some features of foodborne diseases and the relationships among them. Classifying pathogens with machine learning methods informed by this analysis can support clinical auxiliary diagnosis and treatment of foodborne diseases.

Identifiers

pubmed: 33496675
pii: v9i1e24924
doi: 10.2196/24924
pmc: PMC7872834

Publication types

Journal Article

Languages

eng

Pagination

e24924

Copyright information

©Hanxue Wang, Wenjuan Cui, Yunchang Guo, Yi Du, Yuanchun Zhou. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 26.01.2021.

Authors

Hanxue Wang

Computer Network Information Center, Chinese Academy of Sciences, Beijing, China.
Chinese Academy of Sciences University, Beijing, China.

Wenjuan Cui

Computer Network Information Center, Chinese Academy of Sciences, Beijing, China.

Yunchang Guo

China National Center for Food Safety Risk Assessment, Beijing, China.

Yi Du

Computer Network Information Center, Chinese Academy of Sciences, Beijing, China.
Chinese Academy of Sciences University, Beijing, China.

Yuanchun Zhou

Computer Network Information Center, Chinese Academy of Sciences, Beijing, China.
Chinese Academy of Sciences University, Beijing, China.

MeSH classifications