MixEHR-SurG: A joint proportional hazard and guided topic model for inferring mortality-associated topics from electronic health records.

Electronic health records Survival analysis Topic modeling

Journal

Journal of biomedical informatics
ISSN: 1532-0480
Titre abrégé: J Biomed Inform
Pays: United States
ID NLM: 100970413

Informations de publication

Date de publication:
15 Apr 2024
Historique:
received: 20 12 2023
revised: 07 03 2024
accepted: 03 04 2024
medline: 18 4 2024
pubmed: 18 4 2024
entrez: 17 4 2024
Statut: aheadofprint

Résumé

Survival models can help medical practitioners to evaluate the prognostic importance of clinical variables to patient outcomes such as mortality or hospital readmission and subsequently design personalized treatment regimes. Electronic Health Records (EHRs) hold the promise for large-scale survival analysis based on systematically recorded clinical features for each patient. However, existing survival models either do not scale to high dimensional and multi-modal EHR data or are difficult to interpret. In this study, we present a supervised topic model called MixEHR-SurG to simultaneously integrate heterogeneous EHR data and model survival hazard. Our contributions are three-folds: (1) integrating EHR topic inference with Cox proportional hazards likelihood; (2) integrating patient-specific topic hyperparameters using the PheCode concepts such that each topic can be identified with exactly one PheCode-associated phenotype; (3) multi-modal survival topic inference. This leads to a highly interpretable survival topic model that can infer PheCode-specific phenotype topics associated with patient mortality. We evaluated MixEHR-SurG using a simulated dataset and two real-world EHR datasets: the Quebec Congenital Heart Disease (CHD) data consisting of 8211 subjects with 75,187 outpatient claim records of 1767 unique ICD codes; the MIMIC-III consisting of 1458 subjects with multi-modal EHR records. Compared to the baselines, MixEHR-SurG achieved a superior dynamic AUROC for mortality prediction, with a mean AUROC score of 0.89 in the simulation dataset and a mean AUROC of 0.645 on the CHD dataset. Qualitatively, MixEHR-SurG associates severe cardiac conditions with high mortality risk among the CHD patients after the first heart failure hospitalization and critical brain injuries with increased mortality among the MIMIC-III patients after their ICU discharge. Together, the integration of the Cox proportional hazards model and EHR topic inference in MixEHR-SurG not only leads to competitive mortality prediction but also meaningful phenotype topics for in-depth survival analysis. The software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-SurG.

Identifiants

pubmed: 38631461
pii: S1532-0464(24)00056-X
doi: 10.1016/j.jbi.2024.104638
pii:
doi:

Types de publication

Journal Article

Langues

eng

Sous-ensembles de citation

IM

Pagination

104638

Informations de copyright

Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.

Déclaration de conflit d'intérêts

Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Auteurs

Yixuan Li (Y)

Department of Mathematics and Statistics, McGill University, Montreal, Canada; Mila - Quebec AI institute, Montreal, Canada.

Archer Y Yang (AY)

Department of Mathematics and Statistics, McGill University, Montreal, Canada; Mila - Quebec AI institute, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada. Electronic address: archer.yang@mcgill.ca.

Ariane Marelli (A)

McGill Adult Unit for Congenital Heart Disease (MAUDE Unit), McGill University of Health Centre, Montreal, Canada. Electronic address: ariane.marelli@mcgill.ca.

Yue Li (Y)

Mila - Quebec AI institute, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada. Electronic address: yueli@cs.mcgill.ca.

Classifications MeSH