On variance estimation of the inverse probability-of-treatment weighting estimator: A tutorial for different types of propensity score weights.

Keywords: ATE; ATT; IPTW; matching weights; overlap weights; variance estimator

Journal

Statistics in Medicine

ISSN: 1097-0258

Abbreviated title: Stat Med

Country: England

NLM ID: 8215016

Publication information

Publication date:
15 Apr 2024

History:

received: 17 Mar 2023

revised: 12 Feb 2024

accepted: 1 Apr 2024

entrez: 15 Apr 2024

pubmed: 16 Apr 2024

medline: 16 Apr 2024

Status: ahead of print

Abstract

Propensity score methods, such as inverse probability-of-treatment weighting (IPTW), are increasingly used for covariate balancing in both observational studies and randomized trials, allowing control of both systematic and chance imbalances. IPTW approaches involve two steps: (i) estimation of the individual propensity scores (PS), and (ii) estimation of the treatment effect by applying PS weights. A variance estimator that accounts for both steps is therefore crucial for correct inference. A variance estimator that ignores the first step overestimates the variance when the estimand is the average treatment effect (ATE), and may under- or overestimate it when the estimand is the average treatment effect on the treated (ATT). In this article, we emphasize the importance of using an IPTW variance estimator that correctly accounts for the uncertainty in PS estimation. We present a comprehensive tutorial for obtaining unbiased variance estimates by proposing and applying a unifying formula for different types of PS weights (ATE, ATT, matching, and overlap weights). The formula can be derived via either the linearization approach or M-estimation. Extensive R code is provided along with the corresponding large-sample theory. We perform simulation studies to illustrate the behavior of the estimators under different treatment and outcome prevalences and to demonstrate appropriate behavior of the analytical variance estimator. We also use a reproducible analysis of observational lung cancer data as an illustrative example, estimating the effect of receiving a PET-CT scan on the receipt of surgery.
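The four weight types named in the abstract can be illustrated with a minimal sketch of step (ii). This is not the authors' code (the article provides R code) and it does not reproduce their unifying variance formula. It assumes the standard tilting-function definitions of the weights from the propensity score weighting literature, with h(e) = 1 for ATE, e for ATT, e(1 - e) for overlap weights, and min(e, 1 - e) for matching weights, and it assumes the propensity scores have already been estimated in step (i). The function names `ps_weight` and `hajek_effect` and the toy data are hypothetical.

```python
def ps_weight(z, e, kind):
    """Balancing weight for one unit with treatment z (0 or 1) and propensity score e."""
    if not 0.0 < e < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    # Tilting function h(e): it determines the target population (assumed standard forms).
    if kind == "ATE":
        h = 1.0
    elif kind == "ATT":
        h = e
    elif kind == "overlap":
        h = e * (1.0 - e)
    elif kind == "matching":
        h = min(e, 1.0 - e)
    else:
        raise ValueError("unknown weight type: " + str(kind))
    # Treated units receive h(e)/e; control units receive h(e)/(1 - e).
    return h / e if z == 1 else h / (1.0 - e)


def hajek_effect(y, z, e, kind):
    """Difference between normalized (Hajek-type) weighted outcome means, treated minus control."""
    w = [ps_weight(zi, ei, kind) for zi, ei in zip(z, e)]
    num1 = sum(wi * yi for wi, yi, zi in zip(w, y, z) if zi == 1)
    den1 = sum(wi for wi, zi in zip(w, z) if zi == 1)
    num0 = sum(wi * yi for wi, yi, zi in zip(w, y, z) if zi == 0)
    den0 = sum(wi for wi, zi in zip(w, z) if zi == 0)
    return num1 / den1 - num0 / den0


# Toy data (illustrative only): binary outcome, treatment, and pre-estimated PS.
y = [1, 0, 1, 0]
z = [1, 1, 0, 0]
e = [0.8, 0.5, 0.5, 0.2]

for kind in ("ATE", "ATT", "overlap", "matching"):
    print(kind, hajek_effect(y, z, e, kind))
```

The sketch returns point estimates only. A standard error computed from these weights as if the propensity scores were known would ignore step (i), which is the practice the abstract warns against.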

Identifiers

DOI: 10.1002/sim.10078

PMID: 38622063

Publication types

Journal Article

Languages

eng

Grants

Agency: Cancer Research UK

ID: C7923/A29018

Country: United Kingdom

Agency: Cancer Research UK

ID: C7923/A30945

Country: United Kingdom

Agency: Medical Research Council

ID: MR/T032448/1

Country: United Kingdom

Authors

Andriana Kostouraki

David Hajage

Bernard Rachet

Elizabeth J Williamson

Guillaume Chauvet

Aurélien Belot

Clémence Leyrat
