Decision region analysis for generalizability of artificial intelligence models: estimating model generalizability in the case of cross-reactivity and population shift.

cross-reactivity; decision region; generalizability; population shift; represented and unrepresented subgroups; vicinal distribution

Journal

Journal of medical imaging (Bellingham, Wash.)
ISSN: 2329-4302
Abbreviated title: J Med Imaging (Bellingham)
Country: United States
NLM ID: 101643461

Publication information

Publication date:
Jan 2024
History:
received: 2023-06-30
revised: 2023-12-14
accepted: 2023-12-28
pmc-release: 2025-01-25
medline: 2024-01-29
pubmed: 2024-01-29
entrez: 2024-01-29
Status: ppublish

Abstract

Understanding an artificial intelligence (AI) model's ability to generalize to its target population is critical to ensuring the safe and effective usage of AI in medical devices. A traditional generalizability assessment relies on the availability of large, diverse datasets, which are difficult to obtain in many medical imaging applications. We present an approach for enhanced generalizability assessment by examining the decision space beyond the available testing data distribution. Vicinal distributions of virtual samples are generated by interpolating between triplets of test images. The generated virtual samples leverage the characteristics already in the test set, increasing the sample diversity while remaining close to the AI model's data manifold. We demonstrate the generalizability assessment approach on the non-clinical tasks of classifying patient sex, race, COVID status, and age group from chest x-rays. Decision region composition analysis for generalizability indicated that a disproportionately large portion of the decision space belonged to a single "preferred" class for each task, despite comparable performance on the evaluation dataset. Evaluation using cross-reactivity and population shift strategies indicated a tendency to overpredict samples as belonging to the preferred class (e.g., COVID negative) for patients whose subgroup was not represented in the model development data. An analysis of an AI model's decision space has the potential to provide insight into model generalizability. Our approach uses the analysis of composition of the decision space to obtain an improved assessment of model generalizability in the case of limited test data.
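The two ideas in the abstract, interpolating between triplets of test images and measuring how the resulting decision region is split among classes, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the interpolation weights are chosen, so the barycentric grid below is an assumption, and the names `triplet_virtual_samples`, `region_composition`, and `toy_predict` are hypothetical. The toy classifier only stands in for a trained model.

```python
import numpy as np


def triplet_virtual_samples(x_a, x_b, x_c, steps=10):
    """Build virtual samples as convex combinations of three images.

    Assumption: weights lie on a regular barycentric grid, so each
    weight triple is non-negative and sums to 1.
    """
    samples = []
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j
            w_a, w_b, w_c = i / steps, j / steps, k / steps
            samples.append(w_a * x_a + w_b * x_b + w_c * x_c)
    return np.stack(samples)


def region_composition(predict_fn, samples, n_classes):
    """Return the fraction of virtual samples assigned to each class."""
    labels = predict_fn(samples)
    counts = np.bincount(labels, minlength=n_classes)
    return counts / counts.sum()


def toy_predict(batch):
    """Hypothetical stand-in classifier: class 1 if mean intensity > 0.5."""
    means = batch.reshape(len(batch), -1).mean(axis=1)
    return (means > 0.5).astype(int)


# Toy triplet: one dark image and two bright images (4x4 pixels each).
dark = np.zeros((4, 4))
bright_1 = np.ones((4, 4))
bright_2 = np.ones((4, 4))

virtual = triplet_virtual_samples(dark, bright_1, bright_2, steps=3)
composition = region_composition(toy_predict, virtual, n_classes=2)
print(virtual.shape, composition)
```

In this toy setup, a composition far from uniform would correspond to what the abstract calls a "preferred" class occupying a disproportionate share of the decision region. With a real model, `predict_fn` would wrap the model's batched inference and return integer class labels.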

Identifiers

pubmed: 38283653
doi: 10.1117/1.JMI.11.1.014501
pii: 23181GRR
pmc: PMC10810180

Publication types

Journal Article

Languages

eng

Pagination

014501

Copyright information

© 2024 The Authors.

Authors

Alexis Burgon (A)

U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States.

Berkman Sahiner (B)

U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States.

Nicholas Petrick (N)

U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States.

Gene Pennello (G)

U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States.

Kenny H Cha (KH)

U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States.

Ravi K Samala (RK)

U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States.

MeSH classifications