Penalization and shrinkage methods produced unreliable clinical prediction models especially when sample size was small.
Overfitting
Penalization
Risk prediction models
Sample size
Shrinkage
Journal
Journal of Clinical Epidemiology
ISSN: 1878-5921
Abbreviated title: J Clin Epidemiol
Country: United States
NLM ID: 8801383
Publication information
Publication date: 2021-04
History:
received: 2020-06-19
revised: 2020-11-15
accepted: 2020-12-02
pubmed: 2020-12-12
medline: 2021-09-29
entrez: 2020-12-11
Status: ppublish
Abstract
When developing a clinical prediction model, penalization techniques are recommended to address overfitting, as they shrink predictor effect estimates toward the null and reduce mean-square prediction error in new individuals. However, shrinkage and penalty terms ('tuning parameters') are estimated with uncertainty from the development data set. We examined the magnitude of this uncertainty and its subsequent impact on prediction model performance. This study comprises applied examples and a simulation study of the following methods: uniform shrinkage (estimated via a closed-form solution or bootstrapping), ridge regression, the lasso, and elastic net. In a particular model development data set, penalization methods can be unreliable because tuning parameters are estimated with large uncertainty. This is of most concern when development data sets have a small effective sample size and the model's Cox-Snell R² is low. Penalization methods are not a 'carte blanche'; they do not guarantee that a reliable prediction model is developed. They are most unreliable when they are needed most (i.e., when overfitting may be large). We recommend that they be applied with large effective sample sizes, as identified from recent sample size calculations that aim to minimize the potential for model overfitting and to estimate key parameters precisely.
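The abstract's central point, that a shrinkage factor estimated from a small development data set is itself very uncertain, can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a logistic model, a commonly used heuristic closed-form estimate S = (LR - p) / LR (LR is the model likelihood-ratio statistic, p the number of predictors), and hypothetical predictor effects and sample sizes chosen only for illustration. The abstract does not state which closed form the study used.

```python
import numpy as np

def fit_logistic_loglik(X, y, n_iter=25):
    """Maximized log-likelihood of an unpenalized logistic model (Newton-Raphson)."""
    Xd = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        hess = Xd.T @ (Xd * w[:, None])
        beta = beta + np.linalg.solve(hess, Xd.T @ (y - p))
    p = np.clip(1.0 / (1.0 + np.exp(-Xd @ beta)), 1e-12, 1 - 1e-12)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def heuristic_shrinkage(X, y):
    """Assumed closed form: S = (LR - p) / LR."""
    n, p = X.shape
    ybar = y.mean()
    ll_null = n * (ybar * np.log(ybar) + (1 - ybar) * np.log(1 - ybar))
    lr = 2.0 * (fit_logistic_loglik(X, y) - ll_null)
    return (lr - p) / lr

def simulate_shrinkage(n, n_rep=200, seed=1):
    """Estimate S in n_rep independent development data sets of size n."""
    rng = np.random.default_rng(seed)
    beta = np.array([0.5, 0.3, 0.3, 0.2, 0.0])  # hypothetical true effects
    out = np.empty(n_rep)
    for r in range(n_rep):
        X = rng.standard_normal((n, beta.size))
        prob = 1.0 / (1.0 + np.exp(-X @ beta))
        y = (rng.random(n) < prob).astype(float)
        out[r] = heuristic_shrinkage(X, y)
    return out

small = simulate_shrinkage(100)    # small development data sets
large = simulate_shrinkage(2000)   # large development data sets
for label, s in (("n=100", small), ("n=2000", large)):
    lo, med, hi = np.percentile(s, [2.5, 50, 97.5])
    print(f"{label}: median S = {med:.2f}, 95% range = ({lo:.2f}, {hi:.2f})")
```

Under these assumptions the estimated shrinkage factor varies far more across small data sets than across large ones, which is the pattern the abstract describes. The same kind of experiment could be repeated for the tuning parameters of ridge regression, the lasso, or elastic net using a cross-validation routine.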
Identifiers
pubmed: 33307188
pii: S0895-4356(20)31209-9
doi: 10.1016/j.jclinepi.2020.12.005
pmc: PMC8026952
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
88-96
Grants
Agency: Cancer Research UK
ID: C49297/A27294
Country: United Kingdom
Agency: Medical Research Council
ID: MR/T025085/1
Country: United Kingdom
Agency: Department of Health
Country: United Kingdom
Copyright information
Copyright © 2021 The Authors. Published by Elsevier Inc. All rights reserved.