Evaluating and Reducing Subgroup Disparity in AI Models: An Analysis of Pediatric COVID-19 Test Outcomes.

Journal

medRxiv : the preprint server for health sciences

Abbreviated title: medRxiv

Country: United States

NLM ID: 101767986

Publication information

Publication date:
19 Sep 2024

History:

medline: 7 10 2024

pubmed: 7 10 2024

entrez: 7 10 2024

Status: epublish

Abstract

Artificial intelligence (AI) fairness in healthcare settings has attracted significant attention because of concerns that AI models may propagate existing health disparities. Despite ongoing research, the frequency and extent of subgroup disparities have not been sufficiently studied. In this study, we extracted a nationally representative pediatric dataset (ages 0-17, n=9,935) from the US National Health Interview Survey (NHIS) concerning COVID-19 test outcomes. For subgroup disparity assessment, we trained 50 models using five machine learning algorithms. We assessed each model's area under the curve (AUC) on 12 small subgroups (each <15% of the total n), defined using socioeconomic factors, versus its AUC on the overall population. Our results show that subgroup disparities were prevalent (50.7%) in the models. Subgroup AUCs were generally lower, with a mean difference of 0.01, ranging from -0.29 to +0.41. Notably, the disparities were not always statistically significant: only four of the 12 subgroups had statistically significant disparities across models. Additionally, we explored the efficacy of synthetic data in mitigating the identified disparities. The introduction of synthetic data reduced subgroup disparity in 57.7% of the models. The mean AUC disparities for models with synthetic data decreased on average by 0.03 via resampling and 0.04 via generative adversarial network (GAN) methods.
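The subgroup-versus-overall AUC comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes that AUC means the ROC AUC, that a disparity is the subgroup AUC minus the overall AUC (the abstract does not state the sign convention), and the function names and toy data are hypothetical.

```python
def auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_disparity(labels, scores, in_subgroup):
    """Subgroup AUC minus overall AUC (assumed convention: negative = subgroup fares worse)."""
    sub_labels = [y for y, g in zip(labels, in_subgroup) if g]
    sub_scores = [s for s, g in zip(scores, in_subgroup) if g]
    return auc(sub_labels, sub_scores) - auc(labels, scores)


# Toy data, made up for illustration only (not taken from the NHIS dataset).
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.3, 0.4, 0.6, 0.7, 0.1]
in_subgroup = [False, False, False, False, True, True, True, True]

overall = auc(labels, scores)
gap = subgroup_disparity(labels, scores, in_subgroup)
print(f"overall AUC = {overall:.4f}, subgroup gap = {gap:+.4f}")
```

The pairwise definition is quadratic in the sample size, so a rank-based implementation would likely be preferable for a dataset of the size reported here. Judging whether a gap is statistically significant would need an additional step, such as a bootstrap over the test set, which this sketch omits.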

Identifiers

DOI: 10.1101/2024.09.18.24313889 PMID: 39371141 PMC: PMC11451670

Publication types

Journal Article Preprint

Languages

eng

Authors

Alexander Libin (A)

Jonah T Treitler (JT)

Tadas Vasaitis (T)

Yijun Shao (Y)

MeSH classifications