Machine learning-based evaluation of prognostic factors for mortality and relapse in patients with acute lymphoblastic leukemia: a comparative simulation study.


Journal

BMC medical informatics and decision making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682

Publication information

Publication date:
16 Sep 2024
History:
received: 06 Jan 2024
accepted: 21 Aug 2024
medline: 17 Sep 2024
pubmed: 17 Sep 2024
entrez: 16 Sep 2024
Status: epublish

Abstract

Predicting mortality and relapse in children with acute lymphoblastic leukemia (ALL) is crucial for effective treatment and follow-up management. ALL is a common and deadly childhood cancer that often relapses after remission. In this study, we aimed to apply and evaluate machine learning-based models for predicting mortality and relapse in pediatric ALL patients. This retrospective cohort study was conducted on 161 children aged less than 16 years with ALL. Survival status (dead/alive) and patient experience of relapse (yes/no) were considered as the outcome variables. Ten machine learning (ML) algorithms were used to predict mortality and relapse. The performance of the algorithms was evaluated by cross-validation and reported as mean sensitivity, specificity, accuracy and area under the curve (AUC). Finally, prognostic factors were identified based on the best algorithms. The mean accuracy of the ML algorithms for prediction of patient mortality ranged from 64 to 74% and for prediction of relapse, it varied from 64 to 84% on test data sets. The mean AUC of the ML algorithms for mortality and relapse was above 64%. The most important prognostic factors for predicting both mortality and relapse were identified as age at diagnosis, hemoglobin and platelets. In addition, significant prognostic factors for predicting mortality included clinical side effects such as splenomegaly, hepatomegaly and lymphadenopathy. Our results showed that artificial neural networks and bagging algorithms outperformed other algorithms in predicting mortality, while boosting and random forest algorithms excelled in predicting relapse in ALL patients across all criteria. These results offer significant clinical insights into the prognostic factors for children with ALL, which can inform treatment decisions and improve patient outcomes.
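The evaluation scheme described in the abstract (cross-validation, reported as mean sensitivity, specificity, accuracy and AUC) can be illustrated with a minimal, standard-library sketch. The abstract does not state the number of folds, the software, or the decision threshold, so the 5 folds, the 0.5 cut-off, the synthetic single-feature data and the `fit_threshold` toy model below are all assumptions made for illustration; none of them is the authors' method or data.

```python
import math
import random
from statistics import mean


def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels, 1 = event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


def auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, fit, k=5, seed=0):
    """Average the four metrics over k held-out folds.

    `fit(X_train, y_train)` must return a function mapping one row to a score in [0, 1].
    """
    per_fold = []
    for test_idx in kfold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_test = [y[i] for i in test_idx]
        s_test = [score(X[i]) for i in test_idx]
        y_hat = [1 if s >= 0.5 else 0 for s in s_test]  # assumed 0.5 cut-off
        per_fold.append(confusion_metrics(y_test, y_hat) + (auc(y_test, s_test),))
    names = ("sensitivity", "specificity", "accuracy", "auc")
    return {name: mean([fold[j] for fold in per_fold]) for j, name in enumerate(names)}


def fit_threshold(X_train, y_train):
    """Hypothetical toy model: lower values of feature 0 mean higher risk."""
    pos = [row[0] for row, label in zip(X_train, y_train) if label == 1]
    neg = [row[0] for row, label in zip(X_train, y_train) if label == 0]
    cutoff = (mean(pos) + mean(neg)) / 2
    return lambda row: 1 / (1 + math.exp(row[0] - cutoff))


# Synthetic stand-in data (NOT the study cohort): 161 rows, one numeric feature.
rng = random.Random(1)
y = [1 if rng.random() < 0.4 else 0 for _ in range(161)]
X = [[rng.gauss(0.0 if label else 1.5, 1.0)] for label in y]

results = cross_validate(X, y, fit_threshold, k=5)
print(results)
```

Any of the ten algorithms compared in the study could be swapped in for `fit_threshold`, as long as the replacement returns a per-row score; the averaging over folds stays the same.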

Identifiers

pubmed: 39285373
doi: 10.1186/s12911-024-02645-6
pii: 10.1186/s12911-024-02645-6

Publication types

Journal Article; Comparative Study

Languages

eng

Citation subsets

IM

Pagination

261

Copyright information

© 2024. The Author(s).

Authors

Zahra Mehrbakhsh (Z)

Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran.
Student Research Committee, Hamadan University of Medical Sciences, Hamadan, Iran.

Roghayyeh Hassanzadeh (R)

Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran.
Student Research Committee, Hamadan University of Medical Sciences, Hamadan, Iran.

Nasser Behnampour (N)

Department of Biostatistics and Epidemiology, School of Health, Golestan University of Medical Sciences, Gorgan, Iran.

Leili Tapak (L)

Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran.
Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran.
Email: l.tapak@umsha.ac.ir

Ziba Zarrin (Z)

Department of Photogrammetry and Remote Sensing, K.N. Toosi University of Technology, Tehran, Iran.

Salman Khazaei (S)

Health Sciences Research Center, Health Sciences & Technology Research Institute, Hamadan University of Medical Sciences, Hamadan, Iran.

Irina Dinu (I)

School of Public Health, University of Alberta, Edmonton, Canada.
