Comparison of regression imputation methods of baseline covariates that predict survival outcomes.
Missing data
proportional hazards model
regression imputation
Journal
Journal of clinical and translational science
ISSN: 2059-8661
Abbreviated title: J Clin Transl Sci
Country: England
NLM ID: 101689953
Publication information
Publication date:
04 Sep 2020
History:
entrez: 5 5 2021
pubmed: 6 5 2021
medline: 6 5 2021
Status:
epublish
Abstract
Missing data are inevitable in medical research, and handling them appropriately is critical for statistical estimation and inference. Imputation is often employed to maximize the amount of data available for statistical analysis and is preferred over complete case analysis, which typically yields biased output. This article examines several types of regression imputation of missing covariates in the prediction of time-to-event outcomes subject to right censoring. We evaluated the performance of five regression methods in the imputation of missing covariates for the proportional hazards model via summary statistics, including proportional bias and proportional mean squared error. The primary objective was to determine which of five methods, the parametric generalized linear model (GLM) and least absolute shrinkage and selection operator (LASSO), and the nonparametric multivariate adaptive regression splines (MARS), support vector machine (SVM), and random forest (RF), provides the "best" imputation model for baseline missing covariates in predicting a survival outcome. On average, LASSO yielded the smallest bias, mean squared error, mean squared prediction error, and median absolute deviation (MAD) of the final analysis model's parameters among the five methods considered. SVM performed second best, while GLM and MARS performed worst. LASSO and SVM outperformed GLM, MARS, and RF in the context of regression imputation for prediction of a time-to-event outcome.
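The abstract does not spell out the imputation pipeline, so the following is only a minimal sketch of the general idea under stated assumptions: a single deterministic regression imputation using ordinary least squares (a GLM with identity link) fitted on complete cases, plus plain bias, mean squared error, and MAD of a set of parameter estimates. The function names `regression_impute` and `summarize` are hypothetical, the "proportional" versions of bias and mean squared error are not defined in the abstract and are not reproduced here, MAD is assumed to be taken around the median, and the proportional hazards fit itself is omitted.

```python
import numpy as np


def regression_impute(predictors, covariate):
    """Fill NaNs in `covariate` with OLS predictions from `predictors`.

    Assumption: the model is fitted on complete cases only and each
    missing value is replaced by its fitted mean (no random draw).
    """
    predictors = np.asarray(predictors, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    if predictors.ndim == 1:
        predictors = predictors[:, None]

    observed = ~np.isnan(covariate)
    # Design matrix with an intercept column.
    design = np.column_stack([np.ones(len(covariate)), predictors])
    coef, *_ = np.linalg.lstsq(design[observed], covariate[observed], rcond=None)

    filled = covariate.copy()
    filled[~observed] = design[~observed] @ coef
    return filled


def summarize(estimates, truth):
    """Bias, mean squared error, and MAD of repeated parameter estimates."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - truth
    mse = np.mean((estimates - truth) ** 2)
    # Assumption: MAD is the median absolute deviation around the median.
    mad = np.median(np.abs(estimates - np.median(estimates)))
    return {"bias": bias, "mse": mse, "mad": mad}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=200)                 # fully observed predictor
    x = 1.0 + 2.0 * z + rng.normal(size=200) # covariate to be imputed
    x[rng.random(200) < 0.3] = np.nan        # roughly 30% missing at random
    x_filled = regression_impute(z, x)
    print(np.isnan(x_filled).sum())          # no missing values remain
```

In a simulation study of the kind described, the imputed covariate would then enter the proportional hazards model, and `summarize` would be applied to the fitted coefficients across simulation replicates. Swapping the OLS fit for LASSO, MARS, SVM, or RF changes only the model fitted inside `regression_impute`.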
Identifiers
pubmed: 33948262
doi: 10.1017/cts.2020.533
pii: S2059866120005336
pmc: PMC8057424
Publication types
Journal Article
Languages
eng
Pagination
e40
Grants
Agency: NCI NIH HHS
ID: R01 CA155296
Country: United States
Copyright information
© The Association for Clinical and Translational Science 2020.
Conflict of interest statement
The authors have no conflicts of interest to declare.