Parametric and nonparametric propensity score estimation in multilevel observational studies.
Super Learner
clustering
machine learning
observational studies
propensity score weighting
Journal
Statistics in medicine
ISSN: 1097-0258
Abbreviated title: Stat Med
Country: England
NLM ID: 8215016
Publication information
Publication date:
15 October 2023
History:
revised: 16 May 2023
received: 10 January 2023
accepted: 10 July 2023
medline: 19 September 2023
pubmed: 3 August 2023
entrez: 2 August 2023
Status:
ppublish
Abstract
There has been growing interest in using nonparametric machine learning approaches for propensity score estimation in order to foster robustness against misspecification of the propensity score model. However, the vast majority of studies have focused on single-level data settings, and research on nonparametric propensity score estimation in clustered data settings is scarce. In this article, we extend existing research by describing a general algorithm for incorporating random effects into a machine learning model, which we implemented for generalized boosted modeling (GBM). In a simulation study, we investigated the performance of logistic regression, GBM, and Bayesian additive regression trees for inverse probability of treatment weighting (IPW) when the data are clustered, the treatment exposure mechanism is nonlinear, and unmeasured cluster-level confounding is present. For each approach, we compared fixed and random effects propensity score models to single-level models and evaluated their use in both marginal and clustered IPW. We additionally investigated the performance of the standard Super Learner and the balance Super Learner. The results showed that when there was no unmeasured confounding, logistic regression resulted in moderate bias in both marginal and clustered IPW, whereas the nonparametric approaches were unbiased. In the presence of cluster-level confounding, fixed and random effects models greatly reduced bias compared to single-level models in marginal IPW, with fixed effects GBM and fixed effects logistic regression performing best. Finally, clustered IPW was overall preferable to marginal IPW, and the balance Super Learner outperformed the standard Super Learner, though neither performed as well as its best candidate model.
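The distinction between marginal and clustered IPW mentioned in the abstract can be illustrated with a minimal sketch. The code below is not taken from the article: it assumes one common formulation in which the marginal estimator pools all units into a single weighted difference in means, while the clustered estimator computes that difference within each cluster and averages the cluster-specific effects, weighted by each cluster's total IPW weight. The function names (`ipw_weights`, `marginal_ipw`, `clustered_ipw`) and the toy data are hypothetical, and the propensity scores are supplied directly rather than estimated.

```python
import numpy as np

def ipw_weights(z, e):
    """ATE weights: 1/e for treated units, 1/(1 - e) for control units."""
    return z / e + (1 - z) / (1 - e)

def marginal_ipw(y, z, e):
    """Pooled (marginal) weighted difference in means, ignoring clusters."""
    w = ipw_weights(z, e)
    treated_mean = np.sum(w * z * y) / np.sum(w * z)
    control_mean = np.sum(w * (1 - z) * y) / np.sum(w * (1 - z))
    return treated_mean - control_mean

def clustered_ipw(y, z, e, cluster):
    """Average of within-cluster weighted differences in means.

    Assumption: clusters are weighted by their total IPW weight, and
    clusters without both treated and control units are skipped.
    """
    w = ipw_weights(z, e)
    effects, totals = [], []
    for h in np.unique(cluster):
        m = cluster == h
        n_treated = z[m].sum()
        if n_treated == 0 or n_treated == m.sum():
            continue  # no within-cluster contrast available
        effects.append(marginal_ipw(y[m], z[m], e[m]))
        totals.append(w[m].sum())
    return np.average(effects, weights=totals)

# Toy data (hypothetical): two clusters with different baseline levels
# and a constant treatment effect of 2.
cluster = np.array([0, 0, 0, 0, 1, 1, 1, 1])
z = np.array([1, 0, 1, 0, 1, 1, 0, 0])
baseline = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 5.0])
y = baseline + 2.0 * z
e = np.array([0.5, 0.5, 0.6, 0.4, 0.7, 0.7, 0.3, 0.3])  # assumed known

est_marginal = marginal_ipw(y, z, e)
est_clustered = clustered_ipw(y, z, e, cluster)
print(est_marginal, est_clustered)
```

Because the clustered estimator only contrasts units within the same cluster, cluster-level intercepts such as `baseline` cancel out of each cluster-specific effect, which gives some intuition for why it can be less sensitive to unmeasured cluster-level confounding than the pooled version.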
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
4147-4176
Copyright information
© 2023 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.