Data heterogeneity in federated learning with Electronic Health Records: Case studies of risk prediction for acute kidney injury and sepsis diseases in critical care.
Journal
PLOS digital health
ISSN: 2767-3170
Abbreviated title: PLOS Digit Health
Country: United States
NLM ID: 9918335064206676
Publication information
Publication date:
Mar 2023
History:
received: 2022-09-01
accepted: 2023-02-10
entrez: 2023-03-15
pubmed: 2023-03-16
medline: 2023-03-16
Status:
epublish
Abstract
With the wider availability of healthcare data such as Electronic Health Records (EHR), a growing number of data-driven approaches have been proposed to improve the quality of care delivery. Predictive modeling, which aims to build computational models for predicting clinical risk, is a popular research topic in healthcare analytics. However, privacy concerns around healthcare data may hinder the development of effective, generalizable predictive models, because such models often require rich, diverse data from multiple clinical institutions. Recently, federated learning (FL) has shown promise in addressing this concern. However, data heterogeneity across participating sites may affect the prediction performance of federated models. Because acute kidney injury (AKI) and sepsis are highly prevalent among patients admitted to intensive care units (ICU), AI-based early prediction of these conditions is an important topic in critical care medicine. In this study, we take AKI and sepsis onset risk prediction in the ICU as two examples to explore the impact of data heterogeneity in the FL framework and to compare performance across frameworks. We built predictive models based on local, pooled, and FL frameworks using EHR data from multiple hospitals. The local framework used only each site's own data. The pooled framework combined data from all sites. In the FL framework, no site had access to other sites' data: each site updated a model locally and shared its parameters with a central aggregator, which used them to update the federated model's parameters and then shared the updated parameters with each site. We found that models built within the FL framework outperformed their local counterparts. We then analyzed discrepancies in variable importance across sites and frameworks. Finally, we explored potential sources of heterogeneity within the EHR data. Differences in the distributions of demographic profiles, medication use, and site information contributed to data heterogeneity.
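The federated update loop described in the abstract (local update, parameters sent to a central aggregator, aggregated parameters sent back to every site) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the model or the aggregation rule, so the logistic regression, the sample-size-weighted averaging (in the style of FedAvg), the synthetic data, and all function names (`local_update`, `aggregate`, `federated_train`, `make_site`) are hypothetical. A per-site shift in the feature mean stands in for data heterogeneity.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_update(weights, X, y, lr=0.1, epochs=5):
    """One site's local step: a few epochs of gradient descent on logistic loss.

    Only the resulting parameters leave the site; X and y never do.
    """
    w = list(weights)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def aggregate(site_weights, site_sizes):
    """Central aggregator: sample-size-weighted average (assumed rule)."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]


def federated_train(sites, dim, rounds=20):
    """Repeat: broadcast global parameters, update locally, aggregate."""
    global_w = [0.0] * dim
    sizes = [len(X) for X, _ in sites]
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in sites]
        global_w = aggregate(updates, sizes)
    return global_w


def make_site(n, shift, rng):
    """Synthetic site; `shift` moves the feature mean to mimic heterogeneity."""
    X, y = [], []
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        label = 1 if rng.random() < sigmoid(2.0 * x - 1.0) else 0
        X.append([1.0, x])  # bias term plus one feature
        y.append(label)
    return X, y


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical sites with different sizes and feature distributions.
    sites = [make_site(200, -0.5, rng), make_site(300, 0.0, rng), make_site(100, 1.0, rng)]
    print("federated parameters:", federated_train(sites, dim=2))
```

Training the same model on a single site's data (calling `local_update` alone) or on the concatenation of all sites' data would give rough analogues of the local and pooled frameworks that the abstract compares against.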
Identifiers
pubmed: 36920974
doi: 10.1371/journal.pdig.0000117
pii: PDIG-D-22-00256
pmc: PMC10016691
Publication types
Journal Article
Languages
eng
Pagination
e0000117
Grants
Agency: NIA NIH HHS
ID: R01 AG076234
Country: United States
Agency: NIA NIH HHS
ID: RF1 AG072449
Country: United States
Agency: NIGMS NIH HHS
ID: T32 GM083937
Country: United States
Copyright information
Copyright: © 2023 Rajendran et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of interest statement
The authors have declared that no competing interests exist.