An artificial intelligence framework integrating longitudinal electronic health records with real-world data enables continuous pan-cancer prognostication.
Journal
Nature cancer
ISSN: 2662-1347
Abbreviated title: Nat Cancer
Country: England
NLM ID: 101761119
Publication information
Publication date:
07 2021
History:
received: 14 04 2020
accepted: 14 06 2021
entrez: 5 2 2022
pubmed: 6 2 2022
medline: 20 4 2022
Status:
ppublish
Abstract
Despite widespread adoption of electronic health records (EHRs), most hospitals are not ready to implement data science research in the clinical pipelines. Here, we develop MEDomics, a continuously learning infrastructure through which multimodal health data are systematically organized and data quality is assessed with the goal of applying artificial intelligence for individual prognosis. Using this framework, currently composed of thousands of individuals with cancer and millions of data points over a decade of data recording, we demonstrate prognostic utility of this framework in oncology. As proof of concept, we report an analysis using this infrastructure, which identified the Framingham risk score to be robustly associated with mortality among individuals with early-stage and advanced-stage cancer, a potentially actionable finding from a real-world cohort of individuals with cancer. Finally, we show how natural language processing (NLP) of medical notes could be used to continuously update estimates of prognosis as a given individual's disease course unfolds.
Identifiers
pubmed: 35121948
doi: 10.1038/s43018-021-00236-2
pii: 10.1038/s43018-021-00236-2
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
709-722
Grants
Agency: CIHR
ID: FDN-143257
Country: Canada
Comments and corrections
Type: CommentIn
Copyright information
© 2021. The Author(s), under exclusive licence to Springer Nature America, Inc.