On variance estimation of the inverse probability-of-treatment weighting estimator: A tutorial for different types of propensity score weights.
ATE
ATT
IPTW
matching weights
overlap weights
variance estimator
Journal
Statistics in Medicine
ISSN: 1097-0258
Abbreviated title: Stat Med
Country: England
NLM ID: 8215016
Publication information
Publication date: 15 Apr 2024
History:
revised: 12 Feb 2024
received: 17 Mar 2023
accepted: 1 Apr 2024
medline: 16 Apr 2024
pubmed: 16 Apr 2024
entrez: 15 Apr 2024
Status:
aheadofprint
Abstract
Propensity score methods, such as inverse probability-of-treatment weighting (IPTW), have been increasingly used for covariate balancing in both observational studies and randomized trials, allowing the control of both systematic and chance imbalances. Approaches using IPTW are based on two steps: (i) estimation of the individual propensity scores (PS), and (ii) estimation of the treatment effect by applying PS weights. Thus, a variance estimator that accounts for both steps is crucial for correct inference. Using a variance estimator that ignores the first step leads to overestimated variance when the estimand is the average treatment effect (ATE), and to either under- or overestimated variance when targeting the average treatment effect on the treated (ATT). In this article, we emphasize the importance of using an IPTW variance estimator that correctly accounts for the uncertainty in PS estimation. We present a comprehensive tutorial for obtaining unbiased variance estimates, proposing and applying a unifying formula for different types of PS weights (ATE, ATT, matching and overlap weights). This formula can be derived either via the linearization approach or via M-estimation. Extensive R code is provided along with the corresponding large-sample theory. We perform simulation studies to illustrate the behavior of the estimators under different treatment and outcome prevalences and demonstrate appropriate behavior of the analytical variance estimator. We also use a reproducible analysis of observational lung cancer data as an illustrative example, estimating the effect of receiving a PET-CT scan on the receipt of surgery.
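The two-step structure described in the abstract can be illustrated with a minimal sketch. It is not the article's analytical variance estimator and not its R code: as a generic stand-in, it uses a nonparametric bootstrap that refits the propensity score model in every resample, so that both steps contribute to the standard error. The four weight types are written in the standard balancing-weights form, w = h(e)/e for treated and w = h(e)/(1 - e) for untreated units, with h(e) = 1 (ATE), e (ATT), min(e, 1 - e) (matching) and e(1 - e) (overlap). All function names, the single-covariate logistic model, and the simulated data (true effect 1.0) are assumptions made for illustration only.

```python
import math
import random


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(x, z, iters=10):
    """Step (i): fit P(Z=1 | x) = expit(b0 + b1*x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, zi in zip(x, z):
            p = expit(b0 + b1 * xi)
            r, v = zi - p, p * (1.0 - p)
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton update: beta += (X'WX)^{-1} X'(z - p), 2x2 inverse by hand.
        b0, b1 = (b0 + (h11 * g0 - h01 * g1) / det,
                  b1 + (h00 * g1 - h01 * g0) / det)
    return b0, b1


# Tilting functions h(e) defining each type of PS weight.
TILT = {
    "ATE": lambda e: 1.0,
    "ATT": lambda e: e,
    "matching": lambda e: min(e, 1.0 - e),
    "overlap": lambda e: e * (1.0 - e),
}


def ps_weight(z, e, kind):
    """PS weight for one unit with treatment z and propensity score e."""
    return TILT[kind](e) / (e if z == 1 else 1.0 - e)


def iptw_effect(x, z, y, kind):
    """Steps (i) and (ii): estimate the PS, then take the difference of
    weighted outcome means (normalised weights)."""
    b0, b1 = fit_logistic(x, z)
    num = [0.0, 0.0]
    den = [0.0, 0.0]
    for xi, zi, yi in zip(x, z, y):
        w = ps_weight(zi, expit(b0 + b1 * xi), kind)
        num[zi] += w * yi
        den[zi] += w
    return num[1] / den[1] - num[0] / den[0]


def bootstrap_se(x, z, y, kind, reps=50, seed=1):
    """Bootstrap SE; the PS model is refitted inside every resample."""
    rng = random.Random(seed)
    n = len(x)
    ests = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        ests.append(iptw_effect([x[i] for i in idx], [z[i] for i in idx],
                                [y[i] for i in idx], kind))
    mean = sum(ests) / reps
    return math.sqrt(sum((t - mean) ** 2 for t in ests) / (reps - 1))


# Simulated data (assumption): one confounder x, constant treatment effect 1.0.
rng = random.Random(2024)
n = 400
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
z = [1 if rng.random() < expit(0.5 * xi) else 0 for xi in x]
y = [1.0 * zi + 0.5 * xi + rng.gauss(0.0, 0.5) for xi, zi in zip(x, z)]

results = {}
for kind in TILT:
    results[kind] = (iptw_effect(x, z, y, kind), bootstrap_se(x, z, y, kind))
    print(kind, round(results[kind][0], 3), round(results[kind][1], 3))
```

Resampling before the propensity score is fitted is the design choice that matters here: a bootstrap that reused the original fitted scores would treat the weights as known and ignore step (i), which is the situation the abstract warns against.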
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: Cancer Research UK
ID: C7923/A29018
Country: United Kingdom
Agency: Cancer Research UK
ID: C7923/A30945
Country: United Kingdom
Agency: Medical Research Council
ID: MR/T032448/1
Country: United Kingdom
Copyright information
© 2024 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.