Comparison of standard and penalized logistic regression in risk model development.
EPV, events per variable
MLE, maximum likelihood estimation
MSE, mean square error
NCDB, National Cancer Database
PRM, penalized regression model
cvMSE, cross-validated mean square error
esophagectomy
outcome research
predictive modeling
regression
Journal
JTCVS open
ISSN: 2666-2736
Abbreviated title: JTCVS Open
Country: Netherlands
NLM ID: 101768541
Publication information
Publication date:
Mar 2022
History:
received: 2021-05-05
accepted: 2022-01-13
entrez: 2022-08-25
pubmed: 2022-08-26
medline: 2022-08-26
Status:
epublish
Abstract
Regression models are ubiquitous in thoracic surgical research. We aimed to compare the value of standard logistic regression with that of the more complex but increasingly used penalized regression models, using a recently published risk model as an example. Using a standardized data set of patients with clinical T1-3N0 esophageal cancer, we created models to predict the likelihood of unexpected pathologic nodal disease after surgical resection. Models were fitted using standard logistic regression or penalized regression (ridge, lasso, elastic net, and adaptive lasso). We compared the performance (Brier score, calibration slope, C statistic, and overfitting) of the standard regression model with that of the penalized regression models. Among 3206 patients with clinical T1-3N0 esophageal cancer, 668 (22%) had unexpected pathologic nodal disease. Of the 15 candidate variables considered in the models, the key predictors of nodal disease included clinical tumor stage, tumor size, grade, and presence of lymphovascular invasion. The standard regression model and all 4 penalized logistic regression models had virtually identical performance, with Brier scores ranging from 0.138 to 0.141, concordance index (C statistic) from 0.775 to 0.788, and calibration slope from 0.965 to 1.05. For predictive modeling in surgical outcomes research, when the data set is large and the outcome of interest is relatively frequent, standard regression models and the more complicated penalized models are very likely to have similar predictive performance. The choice of statistical methods for risk model development should be based on the nature of the data at hand and good statistical practice, rather than on the novelty or complexity of statistical models.
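The three performance measures named in the abstract (Brier score, C statistic, and calibration slope) can be illustrated with a short sketch. This is not the authors' code: the function names and the toy outcomes and predicted probabilities below are hypothetical and chosen only so that the arithmetic is easy to follow. The calibration slope is taken here in its usual sense, as the slope from a logistic regression of the observed outcome on the logit of the predicted probability, fitted by Newton-Raphson.

```python
import math

def brier_score(y, p):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def c_statistic(y, p):
    """Share of event/non-event pairs in which the event has the higher
    predicted probability (ties count as one half)."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

def calibration_slope(y, p, iters=25):
    """Slope b in logit(P(y=1)) = a + b * logit(p), fitted by Newton-Raphson.
    A slope below 1 suggests predictions that are too extreme (overfitting)."""
    x = [math.log(pi / (1.0 - pi)) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * xi     # score for the slope
            h00 += w                 # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b

# Hypothetical toy data: observed outcomes and a model's predicted probabilities.
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print("Brier score:      ", round(brier_score(y_toy, p_toy), 3))
print("C statistic:      ", round(c_statistic(y_toy, p_toy), 3))
print("Calibration slope:", round(calibration_slope(y_toy, p_toy), 3))
```

In a comparison like the one described, the same three functions would be applied to the predictions of each fitted model (standard and penalized), ideally on data not used for fitting, so that the measures reflect out-of-sample performance.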
Identifiers
pubmed: 36003440
doi: 10.1016/j.xjon.2022.01.016
pii: S2666-2736(22)00028-6
pmc: PMC9390725
Publication types
Journal Article
Languages
eng
Pagination
303-316
Grants
Agency: NCI NIH HHS
ID: P30 CA091842
Country: United States
Copyright information
© 2022 The Author(s).