Distributed learning on 20 000+ lung cancer patients - The Personal Health Train.

Big data; Distributed learning; FAIR data; Federated learning; Lung cancer; Machine learning; Prediction modeling; Survival analysis

Journal

Radiotherapy and oncology : journal of the European Society for Therapeutic Radiology and Oncology
ISSN: 1879-0887
Abbreviated title: Radiother Oncol
Country: Ireland
NLM ID: 8407192

Publication information

Publication date:
03 2020
History:
received: 22 06 2019
revised: 18 11 2019
accepted: 19 11 2019
pubmed: 9 1 2020
medline: 15 4 2021
entrez: 9 1 2020
Status: ppublish

Abstract

BACKGROUND AND PURPOSE
Access to healthcare data is indispensable for scientific progress and innovation. Sharing healthcare data is time-consuming and notoriously difficult due to privacy and regulatory concerns. The Personal Health Train (PHT) provides a privacy-by-design infrastructure connecting FAIR (Findable, Accessible, Interoperable, Reusable) data sources and allows distributed data analysis and machine learning. Patient data never leaves a healthcare institute.
MATERIALS AND METHODS
Lung cancer patient-specific databases (tumor staging and post-treatment survival information) of oncology departments were translated according to a FAIR data model and stored locally in a graph database. Software was installed locally to enable deployment of distributed machine learning algorithms via a central server. The algorithms (MATLAB, code and documentation publicly available) preserve patient privacy because only summary statistics and regression coefficients are exchanged with the central server. A logistic regression model to predict post-treatment two-year survival was trained and evaluated by receiver operating characteristic (ROC) curves, root mean square prediction error (RMSE) and calibration plots.
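The exchange of aggregate values only, as described in the methods, can be illustrated with a short sketch. The published algorithms are written in MATLAB and are not reproduced here. The Python code below is a hypothetical illustration that uses plain gradient ascent on the log-likelihood, which may differ from the optimisation method the authors used. The function names (`site_gradient`, `train_distributed`, `make_site`), the learning rate, and the synthetic data are all assumptions made for this example.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def site_gradient(rows, labels, beta):
    """Runs inside one institute and returns aggregate values only.

    rows: list of feature lists (a leading 1.0 acts as the intercept term).
    Returns the summed log-likelihood gradient and the number of patients.
    """
    grad = [0.0] * len(beta)
    for x, y in zip(rows, labels):
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        for j, xi in enumerate(x):
            grad[j] += (y - p) * xi
    return grad, len(rows)

def train_distributed(sites, n_features, rounds=500, lr=0.5):
    """Central server loop: sends coefficients out, receives aggregates back."""
    beta = [0.0] * n_features
    for _ in range(rounds):
        total_grad = [0.0] * n_features
        total_n = 0
        for rows, labels in sites:  # in practice, one remote call per site
            grad, n = site_gradient(rows, labels, beta)
            total_grad = [t + g for t, g in zip(total_grad, grad)]
            total_n += n
        # Gradient ascent step on the average log-likelihood.
        beta = [b + lr * g / total_n for b, g in zip(beta, total_grad)]
    return beta

def make_site(seed, n):
    """Synthetic stand-in for one institute's local data (hypothetical)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        p = sigmoid(-0.5 + 1.5 * x)
        rows.append([1.0, x])
        labels.append(1 if rng.random() < p else 0)
    return rows, labels

sites = [make_site(seed, 200) for seed in (1, 2, 3)]
beta = train_distributed(sites, n_features=2)
print("intercept, slope:", beta)
```

Because the per-site gradients simply add up, the coefficients obtained this way match those from fitting the same model on the pooled rows, up to floating-point rounding, even though no individual row leaves its site in the sketch.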
RESULTS
In 4 months, we connected databases with 23 203 patient cases across 8 healthcare institutes in 5 countries (sites in Amsterdam, Cardiff, Maastricht, Manchester, Nijmegen, Rome, Rotterdam and Shanghai) using the PHT. Summary statistics were computed across databases. A distributed logistic regression model predicting post-treatment two-year survival was trained on 14 810 patients treated between 1978 and 2011 and validated on 8 393 patients treated between 2012 and 2015.
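As a rough illustration of how summary statistics and a validation RMSE could be computed across databases without moving patient records, the Python sketch below has each site return only counts and sums, which the central server then combines. This is an assumed scheme, not the authors' published code, and every name (`site_summary`, `pooled_mean_sd`, `site_squared_error`, `pooled_rmse`) and every value is hypothetical.

```python
import math

def site_summary(values):
    """Runs locally: returns count, sum and sum of squares only."""
    return {
        "n": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }

def pooled_mean_sd(summaries):
    """Central server: combines per-site aggregates into mean and sample SD."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    total_sq = sum(s["sum_sq"] for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, math.sqrt(max(variance, 0.0))

def site_squared_error(predictions, outcomes):
    """Runs locally: returns the summed squared error and the case count."""
    sse = sum((p - y) ** 2 for p, y in zip(predictions, outcomes))
    return sse, len(outcomes)

def pooled_rmse(site_errors):
    """Central server: root mean square error over all sites."""
    total_sse = sum(sse for sse, _ in site_errors)
    total_n = sum(n for _, n in site_errors)
    return math.sqrt(total_sse / total_n)

# Hypothetical ages at two sites; only the summaries would be transmitted.
site_a = [61, 70, 58]
site_b = [66, 73]
mean, sd = pooled_mean_sd([site_summary(site_a), site_summary(site_b)])

# Hypothetical predicted survival probabilities and observed outcomes.
rmse = pooled_rmse([
    site_squared_error([0.8, 0.3], [1, 0]),
    site_squared_error([0.6], [0]),
])
print(mean, sd, rmse)
```

The sum-of-squares formula is compact but can lose precision on large values with small variance; a production implementation would likely prefer a numerically stabler merge of per-site means and variances.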
CONCLUSION
The PHT infrastructure demonstrably overcomes patient privacy barriers to healthcare data sharing and enables fast data analyses across multiple institutes from different countries with different regulatory regimes. This infrastructure promotes global evidence-based medicine while prioritizing patient privacy.

Identifiers

pubmed: 31911366
pii: S0167-8140(19)33489-9
doi: 10.1016/j.radonc.2019.11.019

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

189-200

Copyright information

Copyright © 2019 The Authors. Published by Elsevier B.V. All rights reserved.

Authors

Timo M Deist (TM)

Department of Radiation Oncology (MAASTRO), GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, The Netherlands; The D-Lab: Dpt of Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, The Netherlands.

Frank J W M Dankers (FJWM)

Department of Radiation Oncology (MAASTRO), GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, The Netherlands; Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, The Netherlands.

Priyanka Ojha (P)

Department of Radiation Oncology, The Netherlands Cancer Institute - Antoni van Leeuwenhoek, Amsterdam, The Netherlands.

M Scott Marshall (M)

Department of Radiation Oncology, The Netherlands Cancer Institute - Antoni van Leeuwenhoek, Amsterdam, The Netherlands.

Tomas Janssen (T)

Department of Radiation Oncology, The Netherlands Cancer Institute - Antoni van Leeuwenhoek, Amsterdam, The Netherlands.

Corinne Faivre-Finn (C)

The University of Manchester, Manchester Academic Health Science Centre, The Christie NHS Foundation Trust, United Kingdom.

Carlotta Masciocchi (C)

Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy.

Vincenzo Valentini (V)

Università Cattolica del Sacro Cuore, Rome, Italy; Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy.

Jiazhou Wang (J)

Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China.

Jiayan Chen (J)

Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China.

Zhen Zhang (Z)

Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China.

Emiliano Spezi (E)

School of Engineering, Cardiff University, United Kingdom; Velindre Cancer Centre, Cardiff, United Kingdom.

Mick Button (M)

Velindre Cancer Centre, Cardiff, United Kingdom.

Joost Jan Nuyttens (J)

Department of Radiation Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands.

René Vernhout (R)

Department of Radiation Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands.

Johan van Soest (J)

Department of Radiation Oncology (MAASTRO), GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, The Netherlands.

Arthur Jochems (A)

The D-Lab: Dpt of Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, The Netherlands.

René Monshouwer (R)

Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, The Netherlands.

Johan Bussink (J)

Department of Radiation Oncology, Radboud University Medical Center, Nijmegen, The Netherlands.

Gareth Price (G)

The University of Manchester, Manchester Academic Health Science Centre, The Christie NHS Foundation Trust, United Kingdom.

Philippe Lambin (P)

The D-Lab: Dpt of Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, The Netherlands.

Andre Dekker (A)

Department of Radiation Oncology (MAASTRO), GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, The Netherlands. Electronic address: andre.dekker@maastro.nl.

MeSH classifications