Impact of sample size on the stability of risk scores from clinical prediction models: a case study in cardiovascular disease.

Keywords: Precision; Risk prediction; Sample size; Stability; Statistical methods

Journal

Diagnostic and prognostic research

ISSN: 2397-7523

Abbreviated title: Diagn Progn Res

Country: England

NLM ID: 101718985

Publication information

Publication date:
2020

History:

received: 2020-02-25

accepted: 2020-08-12

entrez: 2020-09-18

pubmed: 2020-09-19

medline: 2020-09-19

Status: epublish

Abstract

BACKGROUND

Stability of risk estimates from prediction models may be highly dependent on the sample size of the dataset available for model derivation. In this paper, we evaluate the stability of cardiovascular disease risk scores for individual patients when using different sample sizes for model derivation; such sample sizes include those similar to models recommended in the national guidelines, and those based on a recently published sample size formula for prediction models.

METHODS

We mimicked the process of sampling [...]

RESULTS

For a sample size of 100,000, the median 5-95th percentile range of risks for patients across the 1000 models was 0.77%, 1.60%, 2.42% and 3.22% for patients with population-derived risks of 4-5%, 9-10%, 14-15% and 19-20% respectively; for [...]

CONCLUSIONS

Widely used cardiovascular disease risk prediction models suffer from high levels of instability induced by sampling variation. Many models will also suffer from overfitting (a closely linked concept), but at acceptable levels of overfitting, there may still be high levels of instability in individual risk. Stability of risk estimates should be a criterion when determining the minimum sample size to develop models.
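The stability measure reported in the abstract (the median 5-95th percentile range of individual risks across many models, grouped by population-derived risk band) can be illustrated with a short sketch. The abstract does not describe the authors' code, so everything below is an assumption made for illustration: the function name `percentile_range_by_band` is hypothetical, and the risks are synthetic values produced by perturbing a reference risk on the logit scale rather than by refitting models on real samples.

```python
import numpy as np


def percentile_range_by_band(risks, reference_risk, bands, lower=5, upper=95):
    """Median per-patient percentile range of predicted risk within each risk band.

    risks          : array (n_models, n_patients), one row of predictions per model
    reference_risk : array (n_patients,), e.g. a population-derived risk
    bands          : list of (low, high) tuples; a patient is in a band if
                     low <= reference_risk < high
    """
    risks = np.asarray(risks, dtype=float)
    reference_risk = np.asarray(reference_risk, dtype=float)

    # For each patient, the spread of predictions across all models.
    lo, hi = np.percentile(risks, [lower, upper], axis=0)
    spread = hi - lo

    summary = {}
    for low, high in bands:
        in_band = (reference_risk >= low) & (reference_risk < high)
        # NaN signals a band that contains no patients.
        summary[(low, high)] = (
            float(np.median(spread[in_band])) if in_band.any() else float("nan")
        )
    return summary


# Synthetic illustration only (not the study data): 1000 hypothetical models,
# each perturbing every patient's reference risk on the logit scale.
rng = np.random.default_rng(0)
reference_risk = rng.uniform(0.01, 0.25, size=2000)
logit = np.log(reference_risk / (1 - reference_risk))
noise = rng.normal(0.0, 0.1, size=(1000, reference_risk.size))
risks = 1 / (1 + np.exp(-(logit + noise)))

bands = [(0.04, 0.05), (0.09, 0.10), (0.14, 0.15), (0.19, 0.20)]
summary = percentile_range_by_band(risks, reference_risk, bands)
for (low, high), value in summary.items():
    print(f"risk {low:.0%}-{high:.0%}: median 5-95th percentile range = {value:.2%}")
```

In a real analysis the rows of `risks` would come from models derived on repeated samples of a given size, so the printed ranges here say nothing about the magnitudes reported in the study.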

Identifiers

DOI: 10.1186/s41512-020-00082-3
PMID: 32944655
PMC: PMC7487849
PII: 82

Publication types

Journal Article

Languages

eng

Conflict of interest statement

Competing interests: All authors state they have nothing to disclose.

Authors

Alexander Pate (A)

Richard Emsley (R)

Matthew Sperrin (M)

Glen P Martin (GP)

Tjeerd van Staa (T)

MeSH classifications