Hematoma expansion prediction based on SMOTE and XGBoost algorithm.
Hematoma expansion
Machine learning prediction
SMOTE
Unbalanced dataset
XGBoost
Journal
BMC medical informatics and decision making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682
Publication information
Publication date:
19 Jun 2024
History:
received: 11 Mar 2023
accepted: 30 May 2024
medline: 20 Jun 2024
pubmed: 20 Jun 2024
entrez: 19 Jun 2024
Status:
epublish
Abstract
Hematoma expansion (HE) is a high-risk symptom with a high rate of occurrence in patients who have suffered spontaneous intracerebral hemorrhage (ICH) after a major accident or illness. Accurately predicting HE in advance is critical for helping doctors decide on the next step of treatment. Most existing studies focus only on HE occurring within 6 h of ICH, whereas in reality a considerable number of patients develop HE after the first 6 h but within 24 h. Following the recommendation of medical doctors, this study focuses on predicting the occurrence of HE within 24 h, as well as at each 6 h interval within 24 h. Using demographic data and information extracted from computed tomography (CT) images, we applied the XGBoost method to predict the occurrence of HE within 24 h. To address the highly imbalanced data set, a frequent problem in medical data analysis, we used the SMOTE algorithm for data augmentation. To evaluate our method, we used a data set of 582 patient records and compared the proposed method with several other machine learning methods. Our experiments show that XGBoost achieved the best prediction performance on the balanced data set produced by the SMOTE algorithm, with an accuracy of 0.82 and an F1-score of 0.82. Moreover, the proposed method predicts the occurrence of HE within 6, 12, 18 and 24 h with accuracies of 0.89, 0.82, 0.87 and 0.94, respectively, indicating that HE occurrence within 24 h can be predicted accurately by the proposed method.
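The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and only the Python standard library is used. In practice the balanced data would then be passed to a classifier such as XGBoost, with oversampling applied to the training split only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority-class neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (made up, not from the study): 3 HE cases vs 9 non-HE cases.
he = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
no_he = [[float(i), float(i % 3)] for i in range(9)]

# Generate enough synthetic HE cases to match the majority class size.
new_points = smote(he, n_synthetic=len(no_he) - len(he), k=2)
balanced_he = he + new_points
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already covered by the minority class rather than duplicating existing records.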
Identifiers
pubmed: 38898499
doi: 10.1186/s12911-024-02561-9
pii: 10.1186/s12911-024-02561-9
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
172
Copyright information
© 2024. The Author(s).