Ascertaining properties of weighting in the estimation of optimal treatment regimes under monotone missingness.

Keywords: Q-learning; augmented inverse probability weighting; dynamic treatment regimes; monotonic coarseness; outcome weighted learning

Journal

Statistics in Medicine
ISSN: 1097-0258
Abbreviated title: Stat Med
Country: England
NLM ID: 8215016

Publication information

Publication date:
10 November 2020
History:
received: 21 June 2019
revised: 28 April 2020
accepted: 30 April 2020
pubmed: 31 July 2020
medline: 22 June 2021
entrez: 31 July 2020
Status: ppublish

Abstract

Dynamic treatment regimes operationalize precision medicine as a sequence of decision rules, one per stage of clinical intervention, that map up-to-date patient information to a recommended intervention. An optimal treatment regime maximizes the mean utility when applied to the population of interest. Methods for estimating an optimal treatment regime assume the data to be fully observed, which rarely occurs in practice. A common approach is to first use multiple imputation and then pool the estimators across imputed datasets. However, this approach requires estimating the joint distribution of patient trajectories, which can be high-dimensional, especially when there are multiple stages of intervention. We examine the application of inverse probability weighted estimating equations as an alternative to multiple imputation in the context of monotonic missingness. This approach applies to a broad class of estimators of an optimal treatment regime including both Q-learning and a generalization of outcome weighted learning. We establish consistency under mild regularity conditions and demonstrate its advantages in finite samples using a series of simulation experiments and an application to a schizophrenia study.
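The weighting idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' estimator: it assumes a single decision stage, a linear Q-function, a missing outcome rather than a full monotone dropout pattern, and a known probability of being observed (`pi_hat`). The function names `ipw_q_fit` and `recommend`, and the simulated data, are hypothetical and chosen only for illustration. Complete cases are weighted by the inverse of their probability of being observed, and incomplete cases receive weight zero.

```python
import numpy as np

def ipw_q_fit(x, a, y, observed, pi_hat):
    """Fit a linear Q-function by inverse-probability-weighted least squares.

    Illustrative model (an assumption): Q(x, a) = b0 + b1*x + a*(b2 + b3*x).
    Complete cases get weight 1/pi_hat; incomplete cases get weight 0.
    """
    design = np.column_stack([np.ones_like(x), x, a, a * x])
    w = observed / pi_hat
    # Missing outcomes are replaced by 0 but carry zero weight in the fit.
    y_filled = np.where(observed == 1, y, 0.0)
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y_filled * sw, rcond=None)
    return beta

def recommend(beta, x):
    """Recommend a = 1 when the estimated contrast b2 + b3*x is positive."""
    return (beta[2] + beta[3] * x > 0).astype(int)

# Simulated data (hypothetical): randomized binary treatment, outcome
# missing at random with probability depending on the covariate x.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.5 * x + a * (1.0 - 2.0 * x) + rng.normal(size=n)
pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))          # true observation probability
observed = (rng.uniform(size=n) < pi).astype(float)
y[observed == 0] = np.nan                       # outcome unavailable

beta = ipw_q_fit(x, a, y, observed, pi)
rule = recommend(beta, np.array([-1.0, 0.0, 2.0]))
```

In a multistage setting with monotone missingness, the weight for a complete case would instead be the inverse of the product of stage-wise probabilities of remaining in the study, and those probabilities would typically be estimated rather than known.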

Identifiers

pubmed: 32729973
doi: 10.1002/sim.8678

Publication types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

3503-3520

Grants

Agency: NCI NIH HHS
ID: P01 CA142538
Country: United States
Agency: NIDDK NIH HHS
ID: R01-DK-108073
Country: United States

Copyright information

© 2020 John Wiley & Sons, Ltd.

Authors

Lin Dong (L)

Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA.

Eric Laber (E)

Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA.

Yair Goldberg (Y)

Department of Statistics, Technion Israel Institute of Technology, Haifa, Israel.

Rui Song (R)

Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA.

Shu Yang (S)

Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA.
