Machine learning-based evaluation of prognostic factors for mortality and relapse in patients with acute lymphoblastic leukemia: a comparative simulation study.
Acute lymphoblastic leukemia
Classification
Data mining
Machine learning
Mortality
Relapse
Journal
BMC medical informatics and decision making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682
Publication information
Publication date:
16 Sep 2024
History:
received: 06 Jan 2024
accepted: 21 Aug 2024
medline: 17 Sep 2024
pubmed: 17 Sep 2024
entrez: 16 Sep 2024
Status:
epublish
Abstract
BACKGROUND
Predicting mortality and relapse in children with acute lymphoblastic leukemia (ALL) is crucial for effective treatment and follow-up management. ALL is a common and deadly childhood cancer that often relapses after remission. In this study, we aimed to apply and evaluate machine learning-based models for predicting mortality and relapse in pediatric ALL patients.
METHODS
This retrospective cohort study was conducted on 161 children aged less than 16 years with ALL. Survival status (dead/alive) and patient experience of relapse (yes/no) were considered as the outcome variables. Ten machine learning (ML) algorithms were used to predict mortality and relapse. The performance of the algorithms was evaluated by cross-validation and reported as mean sensitivity, specificity, accuracy and area under the curve (AUC). Finally, prognostic factors were identified based on the best algorithms.
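The evaluation described above (per-fold sensitivity, specificity, accuracy and AUC, then averaged across folds) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the 1/0 coding of the outcome and the toy fold data are all assumptions made for illustration, and the study's actual software and fold count are not stated in this record.

```python
from statistics import mean


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 1 (event) and 0 (no event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fold_metrics(y_true, scores, threshold=0.5):
    """Metrics for one held-out fold; the 0.5 threshold is an assumed default.

    Each fold must contain both outcome classes, otherwise a division by zero occurs.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
        "auc": auc(y_true, scores),
    }


def mean_metrics(folds):
    """Average each metric over the held-out folds of a cross-validation run."""
    per_fold = [fold_metrics(y, s) for y, s in folds]
    return {name: mean(m[name] for m in per_fold) for name in per_fold[0]}


# Toy held-out folds: (true outcomes, predicted probabilities).
# These values are made up for illustration and are not study data.
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]),
    ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.1]),
]
summary = mean_metrics(folds)
print(summary)
```

In practice the predicted probabilities would come from a model fitted on the remaining folds; with small cohorts, stratified folds help ensure that every fold contains both outcome classes.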
RESULTS
The mean accuracy of the ML algorithms for prediction of patient mortality ranged from 64 to 74% and for prediction of relapse, it varied from 64 to 84% on test data sets. The mean AUC of the ML algorithms for mortality and relapse was above 64%. The most important prognostic factors for predicting both mortality and relapse were identified as age at diagnosis, hemoglobin and platelets. In addition, significant prognostic factors for predicting mortality included clinical side effects such as splenomegaly, hepatomegaly and lymphadenopathy.
CONCLUSIONS
Our results showed that artificial neural networks and bagging algorithms outperformed other algorithms in predicting mortality, while boosting and random forest algorithms excelled in predicting relapse in ALL patients across all criteria. These results offer significant clinical insights into the prognostic factors for children with ALL, which can inform treatment decisions and improve patient outcomes.
Identifiers
pubmed: 39285373
doi: 10.1186/s12911-024-02645-6
pii: 10.1186/s12911-024-02645-6
Publication types
Journal Article
Comparative Study
Languages
eng
Citation subsets
IM
Pagination
261
Copyright information
© 2024. The Author(s).