Privacy-Preserving Federated Survival Support Vector Machines for Cross-Institutional Time-To-Event Analysis: Algorithm Development and Validation.

FeatureCloud; Implementation; Implementation science; algorithm; centralized model; federated; federated learning; machine learning; predict; prediction; predictions; predictive; privacy regulation; support vector machine; survival; survival analysis

Journal

JMIR AI
ISSN: 2817-1705
Abbreviated title: JMIR AI
Country: Canada
NLM ID: 9918645789006676

Publication information

Publication date:
29 Mar 2024
History:
received: 30 Mar 2023
revised: 06 Aug 2023
accepted: 10 Feb 2024
medline: 14 Jun 2024
pubmed: 14 Jun 2024
entrez: 14 Jun 2024
Status: epublish

Abstract sections

BACKGROUND
Central collection of distributed medical patient data is problematic due to strict privacy regulations. Especially in clinical environments, such as clinical time-to-event studies, large sample sizes are critical but usually not available at a single institution. It has been shown recently that federated learning, combined with privacy-enhancing technologies, is an excellent and privacy-preserving alternative to data sharing.
OBJECTIVE
This study aims to develop and validate a privacy-preserving, federated survival support vector machine (SVM) and make it accessible for researchers to perform cross-institutional time-to-event analyses.
METHODS
We extended the survival SVM algorithm to be applicable in federated environments. We further implemented it as a FeatureCloud app, enabling it to run in the federated infrastructure provided by the FeatureCloud platform. Finally, we evaluated our algorithm on 3 benchmark data sets, a large sample size synthetic data set, and a real-world microbiome data set and compared the results to the corresponding central method.
RESULTS
Our federated survival SVM produces highly similar results to the centralized model on all data sets. The maximal difference between the model weights of the central model and the federated model was only 0.001, and the mean difference over all data sets was 0.0002. We further show that by including more data in the analysis through federated learning, predictions are more accurate even in the presence of site-dependent batch effects.
CONCLUSIONS
The federated survival SVM adds a robust machine learning approach to the palette of federated time-to-event analysis methods. To our knowledge, the implemented FeatureCloud app is the first publicly available implementation of a federated survival SVM; it is freely accessible to researchers and can be used directly within the FeatureCloud platform.
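
The abstract compares a federated model against a centralized one by the difference in model weights. The following is a minimal sketch of that kind of comparison, not the authors' implementation. It assumes a regression-style survival SVM objective with a squared hinge loss, in which censored samples are penalized only when the prediction falls below the observed time, and it trains with plain gradient descent. The function names, the synthetic data, the three-site split, and all hyperparameters are hypothetical. No privacy-enhancing technology is included: sites simply send their local gradient vectors to be summed.

```python
import numpy as np

def local_gradient(w, X, y, event, gamma):
    """Gradient of one site's data term: (gamma / 2) * sum(zeta_i ** 2)."""
    resid = y - X @ w
    # Events are penalized in both directions; censored samples only when the
    # prediction falls below the observed (censoring) time.
    zeta = np.where(event, resid, np.maximum(resid, 0.0))
    return -gamma * (X.T @ zeta)

def fit(sites, n_features, gamma=1.0, lr=1e-3, n_iter=2000):
    """Gradient descent on 0.5 * ||w||^2 plus the summed site data terms."""
    w = np.zeros(n_features)
    for _ in range(n_iter):
        # Each site contributes only a gradient vector, not patient-level rows.
        grad = w + sum(local_gradient(w, X, y, e, gamma) for X, y, e in sites)
        w = w - lr * grad
    return w

# Synthetic data (hypothetical): y plays the role of a log survival time.
rng = np.random.default_rng(0)
n, d = 300, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
event = rng.random(n) < 0.7  # True = event observed, False = censored

# Three sites, each holding a disjoint share of the rows.
parts = np.array_split(np.arange(n), 3)
sites = [(X[i], y[i], event[i]) for i in parts]

w_fed = fit(sites, d)                # gradients summed across sites
w_central = fit([(X, y, event)], d)  # all rows pooled in one place

abs_diff = np.abs(w_fed - w_central)
print("max |dw|:", abs_diff.max(), "mean |dw|:", abs_diff.mean())
```

In this sketch the summed site gradients equal the pooled gradient up to floating-point rounding, so the two weight vectors nearly coincide. The differences reported in the abstract come from the authors' own experiments and are averaged over data sets, whereas the mean here is taken over the weights of a single synthetic data set.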

Identifiers

pubmed: 38875678
pii: v3i1e47652
doi: 10.2196/47652

Publication types

Journal Article

Languages

eng

Pagination

e47652

Copyright information

©Julian Späth, Zeno Sewald, Niklas Probul, Magali Berland, Mathieu Almeida, Nicolas Pons, Emmanuelle Le Chatelier, Pere Ginès, Cristina Solé, Adrià Juanola, Josch Pauling, Jan Baumbach. Originally published in JMIR AI (https://ai.jmir.org), 29.03.2024.

Authors

Julian Späth (J)

Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany.

Zeno Sewald (Z)

LipiTUM, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Niklas Probul (N)

Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany.

Magali Berland (M)

MetaGenoPolis, INRAE, Université Paris-Saclay, Jouy-en-Josas, France.

Mathieu Almeida (M)

MetaGenoPolis, INRAE, Université Paris-Saclay, Jouy-en-Josas, France.

Nicolas Pons (N)

MetaGenoPolis, INRAE, Université Paris-Saclay, Jouy-en-Josas, France.

Emmanuelle Le Chatelier (E)

MetaGenoPolis, INRAE, Université Paris-Saclay, Jouy-en-Josas, France.

Pere Ginès (P)

Liver Unit, Hospital Clínic de Barcelona, Barcelona, Spain.
Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain.
Centro de Investigacion en Red de Enfermedades hepaticas y Digestivas (CIBEReHD), Madrid, Spain.
Faculty of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain.

Cristina Solé (C)

Liver Unit, Hospital Clínic de Barcelona, Barcelona, Spain.
Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain.
Centro de Investigacion en Red de Enfermedades hepaticas y Digestivas (CIBEReHD), Madrid, Spain.

Adrià Juanola (A)

Liver Unit, Hospital Clínic de Barcelona, Barcelona, Spain.
Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain.
Centro de Investigacion en Red de Enfermedades hepaticas y Digestivas (CIBEReHD), Madrid, Spain.

Josch Pauling (J)

LipiTUM, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany.

Jan Baumbach (J)

Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany.

MeSH classifications