Preserving fairness and diagnostic accuracy in private large-scale AI models for medical imaging.

Journal

Communications Medicine

ISSN: 2730-664X

Abbreviated title: Commun Med (Lond)

Country: England

NLM ID: 9918250414506676

Publication information

Publication date:
14 Mar 2024

History:

received: 6 Apr 2023

accepted: 16 Feb 2024

medline: 15 Mar 2024

pubmed: 15 Mar 2024

entrez: 15 Mar 2024

Status: epublish

Abstract

Artificial intelligence (AI) models are increasingly used in the medical domain. However, as medical data is highly sensitive, special precautions to ensure its protection are required. The gold standard for privacy preservation is the introduction of differential privacy (DP) to model training. Prior work indicates that DP has negative implications on model accuracy and fairness, which are unacceptable in medicine and represent a main barrier to the widespread use of privacy-preserving techniques. In this work, we evaluated the effect of privacy-preserving training of AI models regarding accuracy and fairness compared to non-private training. We used two datasets: (1) a large dataset (N = 193,311) of high-quality clinical chest radiographs, and (2) a dataset (N = 1625) of 3D abdominal computed tomography (CT) images, with the task of classifying the presence of pancreatic ductal adenocarcinoma (PDAC). Both were retrospectively collected and manually labeled by experienced radiologists. We then compared non-private deep convolutional neural networks (CNNs) and privacy-preserving (DP) models with respect to privacy-utility trade-offs, measured as area under the receiver operating characteristic curve (AUROC), and privacy-fairness trade-offs, measured as Pearson's r or Statistical Parity Difference. We find that, while the privacy-preserving training yields lower accuracy, it largely does not amplify discrimination against age, sex or co-morbidity. However, we find an indication that difficult diagnoses and subgroups suffer stronger performance hits in private training. Our study shows that, under the challenging realistic circumstances of a real-life clinical dataset, the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.

Plain-language summary: Artificial intelligence (AI), in which computers can learn to do tasks that normally require human intelligence, is particularly useful in medical imaging. However, AI should be used in a way that preserves patient privacy. We explored the balance between maintaining patient data privacy and AI performance in medical imaging. We use an approach called differential privacy to protect the privacy of patients’ images. We show that, although training AI with differential privacy leads to a slight decrease in accuracy, it does not substantially increase bias against different age groups, genders, or patients with multiple health conditions. However, we notice that AI faces more challenges in accurately diagnosing complex cases and specific subgroups when trained under these privacy constraints. These findings highlight the importance of designing AI systems that are both privacy-conscious and capable of reliable diagnoses across patient groups.
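The abstract names Statistical Parity Difference as one of its fairness measures without defining it. As a rough illustration only, the sketch below uses the common textbook definition: the rate of positive predictions inside a subgroup minus the rate outside it, so that a value near 0 suggests parity. The function name, the sign convention, and the toy predictions are assumptions made for this example; they are not taken from the authors' code or data.

```python
from typing import Sequence


def statistical_parity_difference(
    predictions: Sequence[int], in_group: Sequence[bool]
) -> float:
    """Positive-prediction rate inside the subgroup minus the rate outside it.

    predictions: binary model outputs (1 = finding predicted, 0 = not predicted).
    in_group:    True where the patient belongs to the subgroup of interest.
    """
    if len(predictions) != len(in_group):
        raise ValueError("predictions and in_group must have the same length")

    inside = [p for p, g in zip(predictions, in_group) if g]
    outside = [p for p, g in zip(predictions, in_group) if not g]
    if not inside or not outside:
        raise ValueError("both the subgroup and its complement must be non-empty")

    return sum(inside) / len(inside) - sum(outside) / len(outside)


# Hypothetical toy data: eight patients, the first four in the subgroup.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
is_member = [True, True, True, True, False, False, False, False]

spd = statistical_parity_difference(preds, is_member)
print(f"Statistical Parity Difference: {spd:+.2f}")
```

Under this definition, a privacy-fairness comparison would compute the value once for the non-private model and once for the DP model on the same test set and check whether the magnitude grows under private training.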

Other abstracts

Type: plain-language-summary (eng)

Identifiers

DOI: 10.1038/s43856-024-00462-6

PMID: 38486100

pii: 10.1038/s43856-024-00462-6

Publication types

Journal Article

Languages

eng

Grants

Agency: Bundesministerium für Bildung, Wissenschaft, Forschung und Technologie (Federal Ministry for Education, Science, Research and Technology)

ID: 01ZZ2316C

Agency: Bundesministerium für Bildung und Forschung (Federal Ministry of Education and Research)

ID: 01KX2021

Agency: Bundesministerium für Bildung und Forschung (Federal Ministry of Education and Research)

ID: 01KD2215B

Agency: EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)

ID: 101057091

Authors

Soroosh Tayebi Arasteh

Alexander Ziller

Christiane Kuhl

Marcus Makowski

Sven Nebelung

Rickmer Braren

Daniel Rueckert

Daniel Truhn

Georgios Kaissis
