Statistical and Machine Learning Methods for Discovering Prognostic Biomarkers for Survival Outcomes.


Journal

Methods in molecular biology (Clifton, N.J.)
ISSN: 1940-6029
Abbreviated title: Methods Mol Biol
Country: United States
NLM ID: 9214969

Publication information

Publication date:
2023
History:
entrez: 17 Mar 2023
pubmed: 18 Mar 2023
medline: 22 Mar 2023
Status: ppublish

Abstract

Discovering molecular biomarkers for predicting patient survival outcomes is an essential step toward improving prognosis and therapeutic decision-making in the treatment of severe diseases such as cancer. Because of the high-dimensional nature of omics datasets, statistical methods such as the least absolute shrinkage and selection operator (Lasso) have been widely applied for cancer biomarker discovery. Owing to their scalability and demonstrated prediction performance, machine learning methods such as XGBoost and neural network models have also been gaining popularity in the community. However, compared with more traditional survival methods such as Kaplan-Meier estimation and Cox regression, high-dimensional methods for survival outcomes remain less familiar to biomedical researchers. In this chapter, we discuss the key analytical procedures for employing these methods to identify biomarkers associated with survival data. We also highlight important considerations that emerged from the analysis of real omics data, along with typical instances of misapplication and misinterpretation of machine learning methods. Using lung cancer and head and neck cancer datasets as demonstrations, we provide step-by-step instructions and sample R code for prioritizing prognostic biomarkers.
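The abstract above outlines a Lasso-based workflow for prioritizing prognostic biomarkers from high-dimensional omics data. As a concrete illustration, the R sketch below (a minimal, hypothetical example, not the chapter's sample code) fits a cross-validated Lasso-penalized Cox model with glmnet, extracts the genes with nonzero coefficients, and checks one selected gene with a Kaplan-Meier plot and a univariable Cox fit. The simulated objects expr, time, and status are placeholders standing in for the chapter's lung cancer and head and neck cancer datasets.

    ## Minimal sketch of Lasso-Cox biomarker prioritization (assumed workflow,
    ## not the chapter's exact code).
    library(glmnet)
    library(survival)

    ## Simulated stand-in data: an n x p expression matrix `expr`, follow-up
    ## times `time`, and an event indicator `status` (1 = event, 0 = censored).
    set.seed(123)
    n <- 200; p <- 50
    expr <- matrix(rnorm(n * p), n, p,
                   dimnames = list(NULL, paste0("gene", 1:p)))
    time   <- rexp(n, rate = exp(0.5 * expr[, 1] - 0.5 * expr[, 2]) / 50)
    status <- rbinom(n, 1, 0.7)

    ## glmnet >= 4.1 accepts a Surv object for family = "cox"; older versions
    ## expect a two-column matrix: cbind(time = time, status = status).
    y <- Surv(time, status)

    ## Cross-validated Lasso-penalized Cox regression (alpha = 1 gives the Lasso).
    cv_fit <- cv.glmnet(x = expr, y = y, family = "cox", alpha = 1, nfolds = 10)

    ## Genes with nonzero coefficients at lambda.min form the prioritized list.
    coefs    <- as.matrix(coef(cv_fit, s = "lambda.min"))
    selected <- rownames(coefs)[coefs[, 1] != 0]
    print(selected)

    ## Sanity check for one selected gene: dichotomize at the median expression
    ## and compare groups with a Kaplan-Meier plot and a univariable Cox model.
    if (length(selected) > 0) {
      g   <- selected[1]
      grp <- ifelse(expr[, g] > median(expr[, g]), "high", "low")
      km  <- survfit(y ~ grp)
      plot(km, col = c(1, 2), xlab = "Time", ylab = "Survival probability")
      print(summary(coxph(y ~ expr[, g])))
    }

Because the selected gene list can change with the cross-validation split, a common follow-up is to repeat the procedure across resamples or to relax the penalty toward the elastic net (0 < alpha < 1) and retain only markers that are selected consistently.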

Identifiers

pubmed: 36929071
doi: 10.1007/978-1-0716-2986-4_2

Chemical substances

Biomarkers 0
Biomarkers, Tumor 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

11-21

Copyright information

© 2023. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.

References

Van Belle V, Pelckmans K, Suykens JA, Van Huffel S (2008) Survival SVM: a practical scalable algorithm. In: ESANN 2008. Citeseer, p 94
Wilson CM, Li K, Sun Q, Kuan PF, Wang X (2021) Fenchel duality of Cox partial likelihood with an application in survival kernel learning. Artif Intell Med 116:102077
doi: 10.1016/j.artmed.2021.102077 pubmed: 34020756 pmcid: 8159024
Li K, Yao S, Zhang Z, Cao B, Wilson CM, Kalos D, Kuan PF, Zhu R, Wang X (2022) Efficient gradient boosting for prognostic biomarker discovery. Bioinformatics 38(6):1631–1638
doi: 10.1093/bioinformatics/btab869 pubmed: 34978570
Steingrimsson JA, Morrison S (2020) Deep learning for survival outcomes. Stat Med 39(17):2339–2349
doi: 10.1002/sim.8542 pubmed: 32281672 pmcid: 7334068
Katzman JL, Shaham U, Cloninger A, Bates J, Jiang T, Kluger Y (2018) DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med Res Methodol 18(1):1–12
doi: 10.1186/s12874-018-0482-1
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol 67(2):301–320
doi: 10.1111/j.1467-9868.2005.00503.x
Simon N, Friedman J, Hastie T, Tibshirani R (2011) Regularization paths for Cox’s proportional hazards model via coordinate descent. J Stat Softw 39(5):1
doi: 10.18637/jss.v039.i05 pubmed: 27065756 pmcid: 4824408
Sill M, Hielscher T, Becker N, Zucknick M (2015) c060: extended inference with lasso and elastic-net regularized Cox and generalized linear models. J Stat Softw 62:1–22
Wang S, Nan B, Rosset S, Zhu J (2011) Random lasso. Ann Appl Stat 5(1):468
doi: 10.1214/10-AOAS377 pubmed: 22997542 pmcid: 3445423
Kim S, Baladandayuthapani V, Lee JJ (2017) Prediction-oriented marker selection (PROMISE): with application to high-dimensional regression. Stat Biosci 9(1):217–245
doi: 10.1007/s12561-016-9169-5 pubmed: 28785367
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232
doi: 10.1214/aos/1013203451
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (KDD'16), San Francisco, CA, pp 785–794
Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu T-Y (2017) LightGBM: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 30:3146–3154
Masters T (1993) Practical neural network recipes in C++. Academic Press Professional, Inc., Boston

Authors

Sijie Yao (S)

Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL, USA.

Xuefeng Wang (X)

Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL, USA. xuefeng.wang@moffitt.org.
