Leveraging the variational Bayes autoencoder for survival analysis.

Censored data; Deep learning; Survival analysis; Time to event; Variational autoencoders

Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
19 Oct 2024
History:
received: 11 Jul 2024
accepted: 10 Oct 2024
medline: 20 Oct 2024
pubmed: 20 Oct 2024
entrez: 19 Oct 2024
Status: epublish

Abstract

Survival analysis in medical research has witnessed a growing interest in applying deep learning techniques to model complex, high-dimensional, heterogeneous, incomplete, and censored data. Current methods make assumptions about the relations between data that may not be valid in practice. Therefore, we introduce SAVAE (Survival Analysis Variational Autoencoder). SAVAE, based on Variational Autoencoders, contributes significantly to the field by introducing a tailored Evidence Lower BOund formulation, supporting various parametric distributions for covariates and survival time (if the log-likelihood is differentiable). It offers a general method that demonstrates robustness and stability through different experiments. Our proposal effectively estimates time-to-event, accounting for censoring, covariate interactions, and time-varying risk associations. We validate our model in diverse datasets, including genomic, clinical, and demographic tabular data, with varying levels of censoring. This approach demonstrates competitive performance compared to state-of-the-art techniques, as assessed by the Concordance Index and the Integrated Brier Score. SAVAE also offers an interpretable model that parametrically models covariates and time. Moreover, its generative architecture facilitates further applications such as clustering, data imputation, and synthetic patient data generation through latent space inference from survival data. This approach fosters data sharing and collaboration, improving medical research and personalized patient care.
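Two ideas in the abstract can be illustrated with a short sketch: a differentiable parametric log-likelihood that accounts for right censoring, and the Concordance Index used for evaluation. The code below is not the authors' implementation. It assumes a Weibull distribution for survival time (one possible parametric choice, not stated in the abstract), and the names `weibull_censored_loglik` and `concordance_index` are hypothetical helpers introduced only for illustration.

```python
import math


def weibull_censored_loglik(times, events, shape, scale):
    """Sum of per-sample log-likelihoods under right censoring.

    Assumption: survival time follows a Weibull(shape, scale) distribution.
    Observed events (event == 1) contribute log f(t); censored samples
    (event == 0) contribute log S(t), the log-probability of surviving past t.
    """
    total = 0.0
    for t, e in zip(times, events):
        cum_hazard = (t / scale) ** shape   # H(t) = (t / scale)^shape
        log_surv = -cum_hazard              # log S(t) = -H(t)
        if e:
            # log h(t) = log(shape) - log(scale) + (shape - 1) * log(t / scale)
            log_hazard = (math.log(shape) - math.log(scale)
                          + (shape - 1.0) * (math.log(t) - math.log(scale)))
            total += log_hazard + log_surv  # log f(t) = log h(t) + log S(t)
        else:
            total += log_surv
    return total


def concordance_index(times, events, risks):
    """Harrell-style C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when sample i had an observed event and
    times[i] < times[j]. It is concordant when the sample that failed
    earlier has the higher predicted risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored sample cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")
```

In a variational autoencoder, a term of this kind would play the role of the reconstruction log-likelihood for the time variable inside the Evidence Lower BOund, with `shape` and `scale` produced by a decoder network rather than fixed by hand. That wiring is omitted here.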

Identifiers

pubmed: 39427084
doi: 10.1038/s41598-024-76047-z
pii: 10.1038/s41598-024-76047-z

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

24567

Grants

Agency: European Union's Horizon 2020
ID: 101017549

Copyright information

© 2024. The Author(s).

Authors

Patricia A Apellániz (PA)

Information Processing and Telecommunications Center, ETSI Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense, 30, 28040, Madrid, Spain. patricia.alonsod@upm.es.

Juan Parras (J)

Information Processing and Telecommunications Center, ETSI Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense, 30, 28040, Madrid, Spain.

Santiago Zazo (S)

Information Processing and Telecommunications Center, ETSI Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense, 30, 28040, Madrid, Spain.
