Importance of missingness in baseline variables: A case study of the All of Us Research Program.
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date:
2023
History:
received: 2022-08-16
accepted: 2023-05-02
medline: 2023-05-22
pubmed: 2023-05-18
entrez: 2023-05-18
Status:
epublish
Abstract
The All of Us Research Program collects data from multiple information sources, including health surveys, to build a national longitudinal research repository that researchers can use to advance precision medicine. Missing survey responses pose challenges to study conclusions. We describe missingness in All of Us baseline surveys. We extracted survey responses from May 31, 2017, to September 30, 2020. Missing percentages for groups historically underrepresented in biomedical research were compared to those for represented groups. Associations of missing percentages with age, health literacy score, and survey completion date were evaluated. We used negative binomial regression to evaluate the association of participant characteristics with the number of missed questions out of the total eligible questions for each participant. The dataset analyzed contained data for 334,183 participants who submitted at least one baseline survey. Almost all (97.0%) of the participants completed all baseline surveys, and only 541 (0.2%) participants skipped all questions in at least one of the baseline surveys. The median skip rate was 5.0% of the questions, with an interquartile range (IQR) of 2.5% to 7.9%. Membership in historically underrepresented groups was associated with higher missingness (incidence rate ratio (IRR) [95% CI]: 1.26 [1.25, 1.27] for Black/African American compared to White). Missing percentages were similar by survey completion date, participant age, and health literacy score. Skipping specific questions was associated with higher missingness (IRRs [95% CI]: 1.39 [1.38, 1.40] for skipping income, 1.92 [1.89, 1.95] for skipping education, 2.19 [2.09, 2.30] for skipping sexual and gender questions). Surveys in the All of Us Research Program will form an essential component of the data researchers can use to perform their analyses. Missingness was low in All of Us baseline surveys, but group differences exist. Additional statistical methods and careful analysis of surveys could help mitigate challenges to the validity of conclusions.
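The incidence rate ratios quoted in the abstract can be read as exponentiated coefficients of a log-link count model. The following is a minimal sketch of that conversion, not the authors' code: it assumes the intervals are Wald intervals computed on the log scale (the abstract does not say how they were obtained), and the function names `irr_with_ci` and `log_scale_from_irr` are hypothetical. Because the reported values are rounded, the recovered coefficient and standard error are approximate.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def irr_with_ci(beta, se, z=Z_95):
    """Turn a log-scale coefficient and its standard error into
    (IRR, lower, upper), assuming a Wald interval on the log scale."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_scale_from_irr(irr, lower, upper, z=Z_95):
    """Back out an approximate log-scale coefficient and standard error
    from a reported IRR and its 95% CI (same Wald assumption)."""
    beta = math.log(irr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Values reported in the abstract: Black/African American compared to White.
beta, se = log_scale_from_irr(1.26, 1.25, 1.27)

# Round trip: the rebuilt interval should match the reported one
# up to the rounding of the published figures.
irr, lower, upper = irr_with_ci(beta, se)
print(f"beta={beta:.4f} se={se:.4f} IRR={irr:.2f} [{lower:.2f}, {upper:.2f}]")
```

Read this way, an IRR of 1.26 means the expected number of missed questions is about 26% higher than in the reference group, with the other model covariates held fixed.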
Identifiers
pubmed: 37200348
doi: 10.1371/journal.pone.0285848
pii: PONE-D-22-22898
pmc: PMC10194909
Publication types
Journal Article
Research Support, N.I.H., Extramural
Languages
eng
Citation subsets
IM
Pagination
e0285848
Grants
Agency: NIH HHS
ID: U2C OD023196
Country: United States
Agency: NIH HHS
ID: OT2 OD026550
Country: United States
Agency: NHLBI NIH HHS
ID: K23 HL141447
Country: United States
Copyright information
Copyright: © 2023 Cronin et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of interest statement
The authors have declared that no competing interests exist.