Integrating knowledge graphs into machine learning models for survival prediction and biomarker discovery in patients with non-small-cell lung cancer.


Journal

Journal of translational medicine
ISSN: 1479-5876
Abbreviated title: J Transl Med
Country: England
NLM ID: 101190741

Publication information

Publication date:
05 Aug 2024
History:
received: 03 06 2024
accepted: 13 07 2024
medline: 6 8 2024
pubmed: 6 8 2024
entrez: 5 8 2024
Status: epublish

Abstract

Accurate survival prediction for patients with non-small cell lung cancer (NSCLC) remains a significant challenge for the scientific and clinical community despite decades of advanced analytics. Addressing this challenge not only informs critical aspects of clinical study design and biomarker discovery but also helps ensure that the 'right patient' receives the 'right treatment'. However, survival prediction is a highly complex task, given the large number of 'omics and clinical features and the many degrees of freedom that drive patient survival. Prior knowledge could play a critical role in unraveling the complexity of a disease and in understanding the factors that drive a patient's survival. We introduce a methodology for incorporating prior knowledge, in the form of knowledge graphs, into machine learning-based models for predicting patient survival, and we demonstrate the advantage of this approach for NSCLC patients. Using data from patients treated with immuno-oncologic therapies in the POPLAR (NCT01903993) and OAK (NCT02008227) clinical trials, we found that models based on knowledge graphs yielded significantly improved hazard ratios, including in the POPLAR cohort, compared with models based on the tumor mutation burden biomarker. Use of a model-defined mutational 10-gene signature led to significant overall survival differentiation for both trials. We provide parameterized code for incorporating knowledge graphs into survival analyses for use by the wider scientific community.
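
The abstract does not say how the knowledge graph is turned into model inputs, so the following is only a minimal sketch of one common option, network propagation (random walk with restart), and not the authors' method. The gene names, the edge list, and the functions `build_adjacency` and `propagate` are hypothetical and chosen purely for illustration. The idea is that a patient's binary mutation profile is smoothed over the graph, so genes close to the mutated ones also receive a score, and the resulting scores could then serve as features for a survival model.

```python
from collections import defaultdict

# Toy, hypothetical knowledge graph: undirected gene-gene edges.
# These names and links are placeholders, not real biology.
EDGES = [
    ("geneA", "geneB"),
    ("geneB", "geneC"),
    ("geneC", "geneD"),
    ("geneE", "geneF"),
]


def build_adjacency(edges):
    """Return {node: set of neighbours} for an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def propagate(adj, mutated, restart=0.5, n_iter=50):
    """Random walk with restart seeded at a patient's mutated genes.

    Returns one score per gene in the graph. When at least one mutated
    gene is in the graph, the scores sum to 1 and genes closer to the
    mutated genes receive higher scores.
    """
    nodes = sorted(adj)
    seeds = set(mutated) & set(adj)  # ignore genes absent from the graph
    if not seeds:
        return {n: 0.0 for n in nodes}
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        # Restart mass returns to the seeds; the rest is spread to neighbours.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


adj = build_adjacency(EDGES)
scores = propagate(adj, {"geneA"})  # hypothetical patient with one mutated gene
for gene in sorted(scores):
    print(gene, round(scores[gene], 4))
```

In a fuller pipeline, one such score vector per patient would be passed, alongside clinical covariates, to a survival model such as a Cox proportional hazards model or a random survival forest; that step is omitted here because the abstract does not specify it.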

Identifiers

pubmed: 39103897
doi: 10.1186/s12967-024-05509-9
pii: 10.1186/s12967-024-05509-9

Chemical substances

Biomarkers, Tumor 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

726

Copyright information

© 2024. The Author(s).

Authors

Chao Fang (C)

Oncology Data Science, Oncology R&D, AstraZeneca, Waltham, MA, USA.

Gustavo Alonso Arango Argoty (GA)

Oncology Data Science, Oncology R&D, AstraZeneca, Waltham, MA, USA.

Ioannis Kagiampakis (I)

Oncology Data Science, Oncology R&D, AstraZeneca, South San Francisco, CA, USA.

Mohammad Hassan Khalid (MH)

AI Accelerators, R&D IT, AstraZeneca, Cambridge, UK.

Etai Jacob (E)

Oncology Data Science, Oncology R&D, AstraZeneca, Waltham, MA, USA.

Krishna C Bulusu (KC)

Oncology Data Science, Oncology R&D, AstraZeneca, Cambridge, UK. krishna.bulusu@astrazeneca.com.

Natasha Markuzon (N)

Oncology Data Science, Oncology R&D, AstraZeneca, Waltham, MA, USA. natasha.markuzon@astrazeneca.com.
