Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research.

SHAP; breast cancer metastasis; eXplainable artificial intelligence; genomic biomarkers; machine learning algorithms

Journal

Diagnostics (Basel, Switzerland)
ISSN: 2075-4418
Abbreviated title: Diagnostics (Basel)
Country: Switzerland
NLM ID: 101658402

Publication information

Publication date:
26 Oct 2023
History:
received: 04 Oct 2023
revised: 17 Oct 2023
accepted: 25 Oct 2023
medline: 14 Nov 2023
pubmed: 14 Nov 2023
entrez: 14 Nov 2023
Status: epublish

Abstract

Method: This research presents a model combining machine learning (ML) techniques and eXplainable artificial intelligence (XAI) to predict breast cancer (BC) metastasis and reveal important genomic biomarkers in patients with metastasis. A total of 98 primary BC samples were analyzed, including 34 samples from patients who developed distant metastases within a 5-year follow-up period and 44 samples from patients who remained disease-free for at least 5 years after diagnosis. Genomic data were subjected to biostatistical analysis, followed by elastic net feature selection, which identified a restricted number of genomic biomarkers associated with BC metastasis. Light gradient boosting machine (LightGBM), categorical boosting (CatBoost), extreme gradient boosting (XGBoost), gradient boosting trees (GBT), and adaptive boosting (AdaBoost) algorithms were used for prediction. To assess the models' predictive abilities, accuracy, F1 score, precision, recall, area under the ROC curve (AUC), and Brier score were calculated as performance evaluation metrics. To promote interpretability and overcome the "black box" problem of ML models, the SHapley Additive exPlanations (SHAP) method was employed. The LightGBM model outperformed the other models, achieving an accuracy of 96% and an AUC of 99.3%. In addition to the biostatistical evaluation, in the XAI-based SHAP results, increased expression levels of TSPYL5, ATP5E, CA9, NUP210, SLC37A1, ARIH1, PSMD7, UBQLN1, PRAME, and UBE2T [...]. The findings of this study may help prevent disease progression and metastasis and may improve clinical outcomes by supporting customized treatment approaches for BC patients.
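
The evaluation metrics and the Shapley attribution idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `classification_metrics` and `shapley_values`, the toy labels, and the linear toy model are all assumptions made for illustration. The attribution function computes exact Shapley values against a single baseline vector, which is a simplification of what the SHAP method estimates for tree ensembles such as LightGBM.

```python
from itertools import combinations
from math import factorial


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall, F1, AUC, and Brier score for binary labels."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # Brier score: mean squared gap between predicted probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)
    # AUC: share of (positive, negative) pairs ranked correctly; ties count half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": auc,
        "brier": brier,
    }


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample; absent features take baseline values."""
    n = len(x)

    def value(coalition):
        return predict([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Toy data (hypothetical): 1 = metastasis, 0 = disease-free.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.3, 0.6, 0.4, 0.1]
print(classification_metrics(toy_labels, toy_probs))


# Toy linear "model" over three hypothetical expression features.
def toy_model(v):
    return 2 * v[0] + 3 * v[1] - v[2]


print(shapley_values(toy_model, [1, 2, 3], [0, 0, 0]))
```

The attributions returned by `shapley_values` sum to the difference between the model output for the sample and for the baseline. Exact enumeration scales exponentially with the number of features, so it is only practical for a handful of inputs.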

Identifiers

pubmed: 37958210
pii: diagnostics13213314
doi: 10.3390/diagnostics13213314
pmc: PMC10650093

Publication types

Journal Article

Languages

eng

Authors

Burak Yagin (B)

Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey.

Fatma Hilal Yagin (FH)

Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey.

Cemil Colak (C)

Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey.

Feyza Inceoglu (F)

Department of Biostatistics, Faculty of Medicine, Malatya Turgut Ozal University, Malatya 44090, Turkey.

Seifedine Kadry (S)

Department of Applied Data Science, Noroff University College, 4612 Kristiansand, Norway.
Artificial Intelligence Research Center (AIRC), Ajman University, Ajman 346, United Arab Emirates.
Department of Electrical and Computer Engineering, Lebanese American University, Byblos 36, Lebanon.

Jungeun Kim (J)

Department of Software, Kongju National University, Cheonan 31080, Republic of Korea.

MeSH classifications