Machine Learning Approach to Predicting COVID-19 Disease Severity Based on Clinical Blood Test Data: Statistical Analysis and Model Development.

COVID-19; blood; blood samples; data set; machine learning; morbidity; mortality; outcome; prediction; risk; severity; statistical analysis; testing

Journal

JMIR medical informatics
ISSN: 2291-9694
Abbreviated title: JMIR Med Inform
Country: Canada
NLM ID: 101645109

Publication information

Publication date:
13 Apr 2021
History:
received: 20 Nov 2020
accepted: 21 Mar 2021
revised: 21 Jan 2021
pubmed: 30 Mar 2021
medline: 30 Mar 2021
entrez: 29 Mar 2021
Status: epublish

Abstract

BACKGROUND
Accurate prediction of the disease severity of patients with COVID-19 would greatly improve care delivery and resource allocation and thereby reduce mortality risks, especially in less developed countries. Many patient-related factors, such as pre-existing comorbidities, affect disease severity and can be used to aid this prediction.
OBJECTIVE
Because rapid automated profiling of peripheral blood samples is widely available, we aimed to investigate how data from the peripheral blood of patients with COVID-19 can be used to predict clinical outcomes.
METHODS
We investigated clinical data sets of patients with COVID-19 with known outcomes by combining statistical comparison and correlation methods with machine learning algorithms; the latter included decision tree, random forest, variants of gradient boosting machine, support vector machine, k-nearest neighbor, and deep learning methods.
RESULTS
Our work revealed that several clinical parameters that are measurable in blood samples are factors that can discriminate between healthy people and COVID-19-positive patients, and we showed the value of these parameters in predicting later severity of COVID-19 symptoms. We developed a number of analytical methods that showed accuracy and precision scores >90% for disease severity prediction.
CONCLUSIONS
We developed methodologies to analyze routine patient clinical data that enable more accurate prediction of COVID-19 patient outcomes. With this approach, data from standard hospital laboratory analyses of patient blood could be used to identify patients with COVID-19 who are at high risk of mortality, thus enabling optimization of hospital facilities for COVID-19 treatment.
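
The abstract names k-nearest neighbor among the classifiers and reports accuracy and precision as evaluation scores, but this record does not include the authors' code or data. The following is only a minimal sketch of that kind of classifier and those two metrics, not the authors' pipeline. Every value in it is invented: the two features stand for hypothetical blood markers already scaled to the range 0 to 1, label 1 means "severe" and label 0 means "non-severe", and the function names are illustrative.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points nearest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    true_labels_of_predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not true_labels_of_predicted_pos:
        return 0.0
    hits = sum(t == positive for t in true_labels_of_predicted_pos)
    return hits / len(true_labels_of_predicted_pos)

# Invented toy data: (marker_a, marker_b), both hypothetical and pre-scaled to [0, 1].
train_X = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
train_y = [0, 0, 0, 1, 1, 1]  # 0 = non-severe, 1 = severe (assumed encoding)

test_X = [(0.2, 0.2), (0.85, 0.8), (0.45, 0.4), (0.6, 0.7)]
test_y = [0, 1, 1, 1]

y_pred = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
print("predictions:", y_pred)
print("accuracy:", accuracy(test_y, y_pred))
print("precision:", precision(test_y, y_pred))
```

On this toy data the third test case is a severe patient classified as non-severe, so accuracy falls below precision. That gap is one reason the two scores are reported separately: precision ignores missed severe cases, which a triage setting would also want to measure.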

Identifiers

pubmed: 33779565
pii: v9i4e25884
doi: 10.2196/25884
pmc: PMC8045777

Publication types

Journal Article

Languages

eng

Pagination

e25884

Grants

Agency: EPA
ID: EP-C-18-008
Country: United States

Copyright information

©Sakifa Aktar, Md Martuza Ahamad, Md Rashed-Al-Mahfuz, AKM Azad, Shahadat Uddin, AHM Kamal, Salem A Alyami, Ping-I Lin, Sheikh Mohammed Shariful Islam, Julian MW Quinn, Valsamma Eapen, Mohammad Ali Moni. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 13.04.2021.

Authors

Sakifa Aktar (S)

Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science & Technology University, Gopalganj, Bangladesh.

Md Martuza Ahamad (MM)

Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science & Technology University, Gopalganj, Bangladesh.

Md Rashed-Al-Mahfuz (M)

Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh.

AKM Azad (A)

iThree Institute, Faculty of Science, University of Technology Sydney, Sydney, Australia.

Shahadat Uddin (S)

Complex Systems Research Group, Faculty of Engineering, The University of Sydney, Darlington, Sydney, Australia.

AHM Kamal (A)

Department of Computer Science and Engineering, Jatiya Kabi Kazi Nazrul Islam University, Mymensingh, Bangladesh.

Salem A Alyami (SA)

Department of Mathematics and Statistics, Faculty of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia.

Ping-I Lin (PI)

School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, Australia.

Sheikh Mohammed Shariful Islam (SMS)

Institute for Physical Activity and Nutrition, Faculty of Health, Deakin University, Victoria, Australia.

Julian MW Quinn (JM)

Healthy Ageing Theme, The Garvan Institute of Medical Research, Darlington, Australia.

Valsamma Eapen (V)

School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, Australia.

Mohammad Ali Moni (MA)

School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, Australia.
Healthy Ageing Theme, The Garvan Institute of Medical Research, Darlington, Australia.
WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, University of New South Wales, Sydney, Australia.

MeSH classifications