Statistical and Machine Learning Methods for Discovering Prognostic Biomarkers for Survival Outcomes.
Keywords
Cox regression
Elastic net
Gradient boosting
Lasso
Machine learning
Survival analysis
Journal
Methods in molecular biology (Clifton, N.J.)
ISSN: 1940-6029
Abbreviated title: Methods Mol Biol
Country: United States
NLM ID: 9214969
Publication information
Publication date: 2023
History:
entrez: 17 March 2023
pubmed: 18 March 2023
medline: 22 March 2023
Status:
ppublish
Abstract
Discovering molecular biomarkers that predict patient survival outcomes is an essential step toward improving prognosis and therapeutic decision-making in the treatment of severe diseases such as cancer. Because of the high-dimensional nature of omics datasets, statistical methods such as the least absolute shrinkage and selection operator (Lasso) have been widely applied for cancer biomarker discovery. Owing to their scalability and demonstrated prediction performance, machine learning methods such as XGBoost and neural network models have also been gaining popularity in the community. However, compared with more traditional survival methods such as Kaplan-Meier and Cox regression, high-dimensional methods for survival outcomes remain less well known to biomedical researchers. In this chapter, we discuss the key analytical procedures for employing these methods to identify biomarkers associated with survival data. We also highlight important considerations that emerged from the analysis of actual omics data, and we discuss typical instances of misapplication and misinterpretation of machine learning methods. Using lung cancer and head and neck cancer datasets as demonstrations, we provide step-by-step instructions and sample R code for prioritizing prognostic biomarkers.
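The chapter's sample code is not reproduced in this record. As a minimal sketch of the Lasso-penalized Cox workflow the abstract describes, the following R example runs cv.glmnet (the coordinate-descent implementation of Simon et al. (2011), cited below) on simulated high-dimensional data; the simulated matrix, sample sizes, and gene names are illustrative assumptions, not the chapter's actual data or code.

# A minimal sketch, not the chapter's code: Lasso-penalized Cox regression
# with glmnet on simulated omics data. Setting alpha between 0 and 1 gives
# the elastic net instead of the Lasso.
library(glmnet)

set.seed(1)
n <- 100; p <- 500                        # 100 patients, 500 candidate genes
x <- matrix(rnorm(n * p), n, p)           # simulated expression matrix
colnames(x) <- paste0("gene", seq_len(p))
time   <- rexp(n, rate = exp(0.5 * x[, 1] - 0.5 * x[, 2]))  # survival times
status <- rbinom(n, 1, 0.7)               # 1 = event observed, 0 = censored
y <- cbind(time = time, status = status)  # two-column response for family = "cox"

# Cross-validate the penalty; features with nonzero coefficients at
# lambda.min are the prioritized prognostic biomarkers.
cvfit <- cv.glmnet(x, y, family = "cox", alpha = 1)  # alpha = 1 is the Lasso
coefs <- coef(cvfit, s = "lambda.min")
selected <- rownames(coefs)[as.vector(coefs) != 0]
print(selected)

In practice one would also tune alpha, stabilize the selection (for example by rerunning over resamples, in the spirit of the random lasso of Wang et al. (2011) cited below), and validate the selected panel on held-out data.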
Identifiers
pubmed: 36929071
doi: 10.1007/978-1-0716-2986-4_2
Chemical substances
Biomarkers (0)
Biomarkers, Tumor (0)
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
11-21
Copyright information
© 2023. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.
References
Van Belle V, Pelckmans K, Suykens JA, Van Huffel S (2008) Survival SVM: a practical scalable algorithm. In: ESANN 2008. Citeseer, p 94
Wilson CM, Li K, Sun Q, Kuan PF, Wang X (2021) Fenchel duality of Cox partial likelihood with an application in survival kernel learning. Artif Intell Med 116:102077
doi: 10.1016/j.artmed.2021.102077
pubmed: 34020756
pmcid: 8159024
Li K, Yao S, Zhang Z, Cao B, Wilson CM, Kalos D, Kuan PF, Zhu R, Wang X (2022) Efficient gradient boosting for prognostic biomarker discovery. Bioinformatics 38(6):1631–1638
doi: 10.1093/bioinformatics/btab869
pubmed: 34978570
Steingrimsson JA, Morrison S (2020) Deep learning for survival outcomes. Stat Med 39(17):2339–2349
doi: 10.1002/sim.8542
pubmed: 32281672
pmcid: 7334068
Katzman JL, Shaham U, Cloninger A, Bates J, Jiang T, Kluger Y (2018) DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med Res Methodol 18(1):1–12
doi: 10.1186/s12874-018-0482-1
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol 67(2):301–320
doi: 10.1111/j.1467-9868.2005.00503.x
Simon N, Friedman J, Hastie T, Tibshirani R (2011) Regularization paths for Cox’s proportional hazards model via coordinate descent. J Stat Softw 39(5):1
doi: 10.18637/jss.v039.i05
pubmed: 27065756
pmcid: 4824408
Sill M, Hielscher T, Becker N, Zucknick M (2015) c060: extended inference with lasso and elastic-net regularized Cox and generalized linear models. J Stat Softw 62:1–22
Wang S, Nan B, Rosset S, Zhu J (2011) Random lasso. Ann Appl Stat 5(1):468
doi: 10.1214/10-AOAS377
pubmed: 22997542
pmcid: 3445423
Kim S, Baladandayuthapani V, Lee JJ (2017) Prediction-oriented marker selection (PROMISE): with application to high-dimensional regression. Stat Biosci 9(1):217–245
doi: 10.1007/s12561-016-9169-5
pubmed: 28785367
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232
doi: 10.1214/aos/1013203451
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (KDD'16), San Francisco, CA, pp 785–794
Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu T-Y (2017) LightGBM: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 30:3146–3154
Masters T (1993) Practical neural network recipes in C++. Academic Press Professional, Inc., Boston