Identifying erroneous height and weight values from adult electronic health records in the All of Us research program.

All of Us Research Program; Data quality; Electronic Health Records; Measurement error

Journal

Journal of biomedical informatics
ISSN: 1532-0480
Abbreviated title: J Biomed Inform
Country: United States
NLM ID: 100970413

Publication information

Publication date:
22 May 2024
History:
received: 08 11 2023
revised: 29 04 2024
accepted: 21 05 2024
medline: 25 5 2024
pubmed: 25 5 2024
entrez: 24 5 2024
Status: ahead of print

Abstract

Electronic Health Records (EHR) are a useful data source for research, but their usability is hindered by measurement errors. This study investigated an automatic error detection algorithm for adult height and weight measurements in EHR for the All of Us Research Program (All of Us). We developed reference charts for adult heights and weights that were stratified by participant sex. Our analysis included 4,076,534 height and 5,207,328 weight measurements from approximately 150,000 participants. Errors were identified using modified standard deviation scores, differences from their expected values, and significant changes between consecutive measurements. We evaluated our method with chart-reviewed heights (8,092) and weights (9,039) from 250 randomly selected participants and compared it with the current cleaning algorithm in All of Us. The proposed algorithm classified 1.4 % of height and 1.5 % of weight measurements in the full cohort as errors. Sensitivity was 90.4 % (95 % CI: 79.0-96.8 %) for heights and 65.9 % (95 % CI: 56.9-74.1 %) for weights. Precision was 73.4 % (95 % CI: 60.9-83.7 %) for heights and 62.9 % (95 % CI: 54.0-71.1 %) for weights. In comparison, the current cleaning algorithm had lower sensitivity (55.8 %) and precision (16.5 %) for height errors, and higher precision (94.0 %) but lower sensitivity (61.9 %) for weight errors. Our proposed algorithm performed better at detecting height errors than weight errors. It can serve as a valuable addition to the current All of Us cleaning algorithm for identifying erroneous height values.
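The three error signals named in the abstract (a standard deviation score against a reference, a difference from an expected value, and a large change between consecutive measurements) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function name `flag_measurements`, the use of the participant's own median as the "expected value", the rule that any one signal is enough to flag a value, and every threshold and reference number below are hypothetical placeholders chosen only for illustration.

```python
from statistics import median


def flag_measurements(values, ref_center, ref_spread,
                      sds_cutoff=4.0, max_dev=15.0, max_jump=0.10):
    """Return one boolean per measurement; True means "possibly erroneous".

    values      -- one participant's positive measurements, in time order
    ref_center  -- reference-chart center for the participant's stratum
                   (hypothetical placeholder, not a published value)
    ref_spread  -- reference-chart spread in the same unit (placeholder)
    sds_cutoff  -- assumed cutoff on the absolute standard deviation score
    max_dev     -- assumed maximum absolute distance from the expected value
    max_jump    -- assumed maximum relative change versus neighbouring values
    """
    # Assumption: the participant's own median stands in for the expected value.
    expected = median(values)
    flags = []
    for i, v in enumerate(values):
        # Signal 1: standard deviation score against the reference chart.
        far_from_reference = abs((v - ref_center) / ref_spread) > sds_cutoff
        # Signal 2: distance from the participant's expected value.
        far_from_expected = abs(v - expected) > max_dev
        # Signal 3: a spike that differs sharply from every adjacent value.
        neighbours = [values[j] for j in (i - 1, i + 1) if 0 <= j < len(values)]
        isolated_jump = bool(neighbours) and all(
            abs(v - n) / n > max_jump for n in neighbours
        )
        flags.append(far_from_reference or far_from_expected or isolated_jump)
    return flags


# Toy heights in centimetres; the fourth value looks like inches entered as cm.
heights_cm = [170.0, 171.0, 170.5, 67.0, 170.0, 170.5]
print(flag_measurements(heights_cm, ref_center=170.0, ref_spread=10.0))
# [False, False, False, True, False, False]
```

Requiring a spike to differ from every adjacent value keeps the plausible readings on either side of an error from being flagged along with it. Weight changes more over time than adult height does, so a single relative-change threshold like the one assumed here would likely need separate tuning for weights.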

Identifiers

pubmed: 38788889
pii: S1532-0464(24)00078-9
doi: 10.1016/j.jbi.2024.104660

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

104660

Copyright information

Copyright © 2024. Published by Elsevier Inc.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Andrew Guide (A)

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States.

Lina Sulieman (L)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.

Shawn Garbett (S)

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States.

Robert M Cronin (RM)

Department of Internal Medicine, The Ohio State University, Columbus, OH, United States.

Matthew Spotnitz (M)

Department of Biomedical Informatics, Columbia University, New York, NY, United States.

Karthik Natarajan (K)

Department of Biomedical Informatics, Columbia University, New York, NY, United States.

Robert J Carroll (RJ)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.

Paul Harris (P)

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States.

Qingxia Chen (Q)

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States. Electronic address: cindy.chen@vumc.org.

MeSH classifications