A machine learning approach for identification of gastrointestinal predictors for the risk of COVID-19 related hospitalization.
Artificial intelligence
COVID-19
Hospitalization
Liver
Machine learning
Predictors
Random forest
SARS-CoV-2
Symptoms
Journal
PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425
Publication information
Publication date: 2022
History:
received: 2021-09-01
accepted: 2022-02-24
entrez: 2022-03-28
pubmed: 2022-03-29
medline: 2022-03-29
Status: epublish
Abstract
Abstract sections
Background and aim
COVID-19 can present with various gastrointestinal symptoms. Shortly after the pandemic outbreak, several machine learning algorithms were implemented to assess new diagnostic and therapeutic methods for this disease. The aim of this study is to assess gastrointestinal and liver-related predictive factors for the SARS-CoV-2-associated risk of hospitalization.
Methods
Data were collected by questionnaire at the COVID-19 outpatient test center and the emergency department of the University Hospital, combined with data from the internal hospital information system and from a mobile application used for telemedicine follow-up of patients. For statistical analysis, SARS-CoV-2 negative patients served as controls for three SARS-CoV-2 positive patient groups (divided by disease severity). The data were visualized and analyzed in R version 4.0.5. The Chi-squared or Fisher test was applied to test the null hypothesis of independence between factors, followed, where appropriate, by multiple comparisons with the Benjamini-Hochberg adjustment. The null hypothesis of equal population medians of a continuous variable was tested by the Kruskal-Wallis test, followed by the Dunn multiple comparisons test. To assess how well the gastrointestinal parameters and other measured variables predict the patient group, the Random Forest machine learning algorithm was trained on the data. Predictive ability was quantified by the ROC curve constructed from the Out-of-Bag data. The Matthews correlation coefficient was used as a one-number summary of the quality of binary classification. The importance of the predictors was measured using the Variable Importance. A 2D representation of the data was obtained by Principal Component Analysis for mixed-type data. Findings with the
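Two of the quantities named in the Methods, the Benjamini-Hochberg adjustment and the Matthews correlation coefficient, can be illustrated with a minimal sketch. The study reports using R version 4.0.5; the Python below is not the authors' code, the function names `benjamini_hochberg` and `matthews_corrcoef` are hypothetical helpers written for this illustration, and the example p-values and confusion-matrix counts are invented rather than taken from the study.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this follows the standard step-up definition, in which
    the p-value of rank k (1 = smallest) out of m is scaled by m / k and
    the results are made monotone by a running minimum from the largest
    p-value downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Invented inputs, for illustration only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(matthews_corrcoef(tp=40, fp=10, fn=5, tn=45))
```

The coefficient ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (perfect agreement), which is why it serves as a one-number summary of binary classification quality.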
Results
A total of 710 patients were enrolled in the study. The presence of diarrhea and nausea was significantly higher in the emergency department group than in the COVID-19 outpatient test center group. Among liver enzymes, only aspartate transaminase (AST) was significantly elevated in the hospitalized group compared to patients discharged home. Based on the Random Forest algorithm, AST was identified as the most important predictor, followed by age or diabetes mellitus. Diarrhea and bloating also have predictive importance, although much lower than that of AST.
Conclusion
SARS-CoV-2 positivity is associated with isolated AST elevation, and the AST level is linked to the severity of the disease. Furthermore, using the Random Forest machine learning algorithm, we identified elevated AST as the most important predictor of COVID-19 related hospitalization.
Identifiers
pubmed: 35341062
doi: 10.7717/peerj.13124
pii: 13124
pmc: PMC8944335
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Pagination
e13124
Copyright information
© 2022 Lipták et al.
Conflict of interest statement
The authors declare that they have no competing interests.