Estimating treatment effects under untestable assumptions with nonignorable missing data.
Heckman model
average treatment effects
full-information maximum likelihood
missing not at random
multiple imputation
selection models
Journal
Statistics in medicine
ISSN: 1097-0258
Abbreviated title: Stat Med
Country: England
NLM ID: 8215016
Publication information
Publication date:
20 May 2020
History:
received: 09 04 2018
revised: 17 01 2020
accepted: 20 01 2020
pubmed: 15 2 2020
medline: 22 6 2021
entrez: 15 2 2020
Status:
ppublish
Abstract
Nonignorable missing data poses key challenges for estimating treatment effects because the substantive model may not be identifiable without imposing further assumptions. For example, the Heckman selection model has been widely used for handling nonignorable missing data but requires the study to make correct assumptions, both about the joint distribution of the missingness and outcome and that there is a valid exclusion restriction. Recent studies have revisited how alternative selection model approaches, for example estimated by multiple imputation (MI) and maximum likelihood, relate to Heckman-type approaches in addressing the first hurdle. However, the extent to which these different selection models rely on the exclusion restriction assumption with nonignorable missing data is unclear. Motivated by an interventional study (REFLUX) with nonignorable missing outcome data in half of the sample, this article critically examines the role of the exclusion restriction in Heckman, MI, and full-likelihood selection models when addressing nonignorability. We explore the implications of the different methodological choices concerning the exclusion restriction for relative bias and root-mean-squared error in estimating treatment effects. We find that the relative performance of the methods differs in practically important ways according to the relevance and strength of the exclusion restriction. The full-likelihood approach is less sensitive to alternative assumptions about the exclusion restriction than Heckman-type models and appears an appropriate method for handling nonignorable missing data. We illustrate the implications of method choice for inference in the REFLUX study, which evaluates the effect of laparoscopic surgery on long-term quality of life for patients with gastro-oesophageal reflux disease.
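For orientation, the two assumptions named in the abstract can be written out for a textbook Heckman-type selection model. This is a minimal sketch in generic notation, not the article's own specification: the symbols (treatment indicator T_i, response indicator R_i, exclusion-restriction variable Z_i, and all coefficients) are assumptions introduced here for illustration.

```latex
% Outcome model (Y_i is observed only when R_i = 1); \tau is the treatment effect
Y_i = \beta_0 + \tau T_i + \varepsilon_i

% Selection (missingness) model; Z_i enters here but not in the outcome model
R_i^{*} = \gamma_0 + \gamma_1 T_i + \delta Z_i + u_i,
\qquad R_i = \mathbf{1}\{R_i^{*} > 0\}

% Assumption 1: joint distribution of the outcome and missingness errors
\begin{pmatrix} \varepsilon_i \\ u_i \end{pmatrix}
\sim N_2\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma^{2} & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}
\right)

% Implied mean of the observed outcomes, with inverse Mills ratio \lambda
E\left[ Y_i \mid T_i, Z_i, R_i = 1 \right]
  = \beta_0 + \tau T_i
  + \rho\sigma\, \lambda\!\left(\gamma_0 + \gamma_1 T_i + \delta Z_i\right),
\qquad \lambda(a) = \frac{\phi(a)}{\Phi(a)}
```

In this sketch, \rho \neq 0 is what makes the missingness nonignorable, and the exclusion restriction is the requirement that Z_i predicts response (\delta \neq 0) while being absent from the outcome model. Without such a variable, \tau is separated from the correction term only through the nonlinearity of \lambda.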
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1658-1674
Grants
Agency: Medical Research Council
ID: MC_UU_12023/21
Country: United Kingdom
Copyright information
© 2020 John Wiley & Sons, Ltd.