An artificial intelligence framework integrating longitudinal electronic health records with real-world data enables continuous pan-cancer prognostication.

Artificial Intelligence; Data Accuracy; Electronic Health Records; Humans; Natural Language Processing; Neoplasms / diagnosis

Journal

Nature Cancer

ISSN: 2662-1347

Abbreviated title: Nat Cancer

Country: England

NLM ID: 101761119

Publication information

Publication date:
07 2021

History:

received: 14 04 2020

accepted: 14 06 2021

entrez: 5 2 2022

pubmed: 6 2 2022

medline: 20 4 2022

Status: ppublish

Abstract

Despite widespread adoption of electronic health records (EHRs), most hospitals are not ready to implement data science research in the clinical pipelines. Here, we develop MEDomics, a continuously learning infrastructure through which multimodal health data are systematically organized and data quality is assessed with the goal of applying artificial intelligence for individual prognosis. Using this framework, currently composed of thousands of individuals with cancer and millions of data points over a decade of data recording, we demonstrate prognostic utility of this framework in oncology. As proof of concept, we report an analysis using this infrastructure, which identified the Framingham risk score to be robustly associated with mortality among individuals with early-stage and advanced-stage cancer, a potentially actionable finding from a real-world cohort of individuals with cancer. Finally, we show how natural language processing (NLP) of medical notes could be used to continuously update estimates of prognosis as a given individual's disease course unfolds.

Identifiers

DOI: 10.1038/s43018-021-00236-2

PMID: 35121948

PII: 10.1038/s43018-021-00236-2

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

709-722

Grants

Agency: CIHR

ID: FDN-143257

Country: Canada

Comments and corrections

Type: CommentIn

Authors

Olivier Morin (O)

Martin Vallières (M)

Steve Braunstein (S)

Jorge Barrios Ginart (JB)

Taman Upadhaya (T)

Henry C Woodruff (HC)

Alex Zwanenburg (A)

Avishek Chatterjee (A)

Javier E Villanueva-Meyer (JE)

Gilmer Valdes (G)

William Chen (W)

Julian C Hong (JC)

Sue S Yom (SS)

Timothy D Solberg (TD)

Steffen Löck (S)

Jan Seuntjens (J)

Catherine Park (C)

Philippe Lambin (P)
