Predicting subgroup treatment effects for a new study: Motivations, results and learnings from running a data challenge in a pharmaceutical corporation.

Keywords: common task framework; data science; machine learning; subgroup analysis; subgroup identification

Journal

Pharmaceutical statistics
ISSN: 1539-1612
Abbreviated title: Pharm Stat
Country: England
NLM ID: 101201192

Publication information

Publication date:
07 Feb 2024
History:
revised: 01 12 2023
received: 30 03 2023
accepted: 21 01 2024
medline: 8 2 2024
pubmed: 8 2 2024
entrez: 8 2 2024
Status: aheadofprint

Abstract

We present the motivation, experience, and learnings from a data challenge conducted at a large pharmaceutical corporation on the topic of subgroup identification. The data challenge aimed to explore approaches to subgroup identification for future clinical trials. To mimic a realistic setting, participants had access to four Phase III clinical trials to derive a subgroup and predict its treatment effect in a future study not accessible to challenge participants. A total of 30 teams with around 100 participants registered for the challenge, primarily from the Biostatistics organization. We outline the motivation for running the challenge, the challenge rules, and the logistics. Finally, we present the results of the challenge, the participant feedback, and the learnings. We also present our view on the implications of the results for exploratory analyses related to treatment effect heterogeneity.

Identifiers

pubmed: 38326967
doi: 10.1002/pst.2368

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

© 2024 John Wiley & Sons Ltd.

Authors

Björn Bornkamp (B)

Global Drug Development, Novartis Pharma AG, Basel, Switzerland.

Silvia Zaoli (S)

Global Drug Development, Novartis Pharma AG, Basel, Switzerland.

Michela Azzarito (M)

Global Drug Development, Novartis Pharma AG, Basel, Switzerland.

Ruvie Martin (R)

Global Drug Development, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA.

Carsten Philipp Müller (CP)

Data Digital, Beyond Conception GmbH, Altendorf, Switzerland.

Conor Moloney (C)

Global Drug Development, Novartis Pharma AG, Dublin, Ireland.

Giulia Capestro (G)

Global Drug Development, Novartis Pharma AG, Basel, Switzerland.

David Ohlssen (D)

Global Drug Development, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA.

Mark Baillie (M)

Global Drug Development, Novartis Pharma AG, Basel, Switzerland.
