A hybrid machine learning/deep learning COVID-19 severity predictive model from CT images and clinical data.


Journal

Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
14 March 2022
History:
received: 25 May 2021
accepted: 22 February 2022
entrez: 15 March 2022
pubmed: 16 March 2022
medline: 5 April 2022
Status: epublish

Abstract

COVID-19 clinical presentation and prognosis are highly variable, ranging from asymptomatic and paucisymptomatic cases to acute respiratory distress syndrome and multi-organ involvement. We developed a hybrid machine learning/deep learning model to classify patients into two outcome categories, non-ICU and ICU (intensive care admission or death), using data from 558 patients admitted to a hospital in northern Italy between February and May 2020. A fully 3D patient-level CNN classifier on baseline CT images is used as a feature extractor. The extracted features, together with laboratory and clinical data, are passed to a Boruta feature-selection algorithm that uses SHAP game-theoretic values. A classifier is then built on the reduced feature space using the CatBoost gradient boosting algorithm, reaching a probabilistic AUC of 0.949 on the holdout test set. The model aims to provide clinical decision support to medical doctors by supplying the probability score of belonging to an outcome class together with a case-based SHAP interpretation of feature importance.
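To make the reported metric concrete, the following minimal sketch computes an AUC from predicted probabilities using its pairwise definition: the fraction of (ICU, non-ICU) patient pairs in which the ICU case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration only; they are not the study's code or data.

```python
from itertools import product


def roc_auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (here 1 = ICU, 0 = non-ICU).
    scores: predicted probabilities of the positive class.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example (not patient data): 3 ICU and 4 non-ICU cases.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.91, 0.78, 0.40, 0.55, 0.30, 0.22, 0.10]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy case 11 of the 12 ICU/non-ICU pairs are ordered correctly. The quadratic pair loop is chosen for readability; a rank-based computation would be preferable for large cohorts.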

Identifiers

pubmed: 35288579
doi: 10.1038/s41598-022-07890-1
pii: 10.1038/s41598-022-07890-1
pmc: PMC8919158

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

4329

Copyright information

© 2022. The Author(s).

Authors

Matteo Chieregato (M)

Unit of Medical Physics, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy. matteo.chieregato@poliambulanza.it.

Fabio Frangiamore (F)

Unit of Medical Physics, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy.
Tattile s.r.l, 25030, Mairano, BS, Italy.

Mauro Morassi (M)

Department of Diagnostic Imaging, Unit of Radiology, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy.

Claudia Baresi (C)

Unit of Lean Managing, Fondazione Poliambulanza Istituto Ospedaliero, Information and Communications Technology, 25124, Brescia, Italy.

Stefania Nici (S)

Unit of Medical Physics, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy.
Unit of Medical Physics, Spedali Civili, 25124, Brescia, Italy.

Chiara Bassetti (C)

Unit of Medical Physics, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy.

Claudio Bnà (C)

Department of Diagnostic Imaging, Unit of Radiology, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy.

Marco Galelli (M)

Unit of Medical Physics, Fondazione Poliambulanza Istituto Ospedaliero, 25124, Brescia, Italy.
