A clinician's guide for developing a prediction model: a case study using real-world data of patients with castration-resistant prostate cancer.
Castration-resistant prostate cancer
Cox proportional hazard model
Decision-making
Prediction modeling
Journal
Journal of cancer research and clinical oncology
ISSN: 1432-1335
Abbreviated title: J Cancer Res Clin Oncol
Country: Germany
NLM ID: 7902060
Publication information
Publication date:
Aug 2020
History:
received: 23 03 2020
accepted: 12 05 2020
pubmed: 20 6 2020
medline: 14 7 2020
entrez: 20 6 2020
Status: ppublish
Abstract
With growing interest in treatment decision-making based on risk prediction models, clinicians need to understand the steps involved in developing and interpreting such models. A retrospective registry from 20 Dutch hospitals, with data on patients treated for castration-resistant prostate cancer, was used to guide clinicians through the steps of developing a prediction model. The model of choice was the Cox proportional hazard model. Using this example dataset, several essential steps in prediction modelling are discussed, including coding of predictors, handling of missing values, interactions, model specification, and model performance. An advanced method for selecting main effects, such as Least Absolute Shrinkage and Selection Operator (LASSO) regression, is described. Furthermore, the assumptions of the Cox proportional hazard model are discussed, along with how to handle violations of the proportional hazard assumption using time-varying coefficients. This study provides a comprehensive, detailed guide to bridge the gap between the statistician and the clinician, based on a large dataset of real-world patients treated for castration-resistant prostate cancer.
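For orientation, the three modelling ideas named in the abstract can be written in standard textbook notation. This is a sketch only: the symbols (h_0, \beta, \lambda, \gamma_j, g) are assumptions introduced here and are not taken from this record, and the article itself may use different notation or a different time function.

```latex
% Cox proportional hazard model: a baseline hazard h_0(t) scaled by the predictors x
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% LASSO selection: maximize the partial log-likelihood \ell(\beta) minus an L1 penalty;
% larger \lambda shrinks more coefficients exactly to zero
\hat{\beta}_{\lambda} = \arg\max_{\beta} \left\{ \ell(\beta) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}

% Time-varying coefficient for predictor j when proportional hazards is violated
% (g(t) = \log t is one common, assumed choice)
\beta_j(t) = \beta_j + \gamma_j\, g(t)
```

When \gamma_j = 0 the last expression reduces to the ordinary constant-coefficient Cox model, which is why a time-varying term is a natural extension when the proportional hazard assumption fails for a predictor.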
Identifiers
pubmed: 32556680
doi: 10.1007/s00432-020-03286-8
pii: 10.1007/s00432-020-03286-8
pmc: PMC7324416
Publication types
Journal Article
Review
Languages
eng
Citation subsets
IM
Pagination
2067-2075