Multiple imputation with missing indicators as proxies for unmeasured variables: simulation study.

Keywords: Missing data; Missing indicator; Multiple imputation; Simulation study

Journal

BMC medical research methodology
ISSN: 1471-2288
Abbreviated title: BMC Med Res Methodol
Country: England
NLM ID: 100968545

Publication information

Publication date:
8 July 2020
History:
received: 18 November 2019
accepted: 28 June 2020
entrez: 10 July 2020
pubmed: 10 July 2020
medline: 25 June 2021
Status: epublish

Abstract

BACKGROUND
Within routinely collected health data, missing data for an individual might provide useful information in itself. This occurs, for example, in electronic health records, where the presence or absence of data is informative. While the naive use of missing indicators to try to exploit such information can introduce bias, their use in conjunction with multiple imputation may unlock the potential value of missingness to reduce bias in causal effect estimation, particularly in missing not at random scenarios and where missingness might be associated with unmeasured confounders.
METHODS
We conducted a simulation study to determine when the use of a missing indicator, combined with multiple imputation, would reduce bias for causal effect estimation, under a range of scenarios including unmeasured variables, missing not at random, and missing at random mechanisms. We used directed acyclic graphs and structural models to elucidate a variety of causal structures of interest. We handled missing data using complete case analysis, and multiple imputation with and without missing indicator terms.
RESULTS
We found that multiple imputation combined with a missing indicator gives minimal bias for causal effect estimation in most scenarios. In particular, the approach: 1) does not introduce bias in missing (completely) at random scenarios; 2) reduces bias in missing not at random scenarios where the missing mechanism depends on the missing variable itself; and 3) may reduce or increase bias when unmeasured confounding is present.
CONCLUSION
In the presence of missing data, careful use of missing indicators, combined with multiple imputation, can improve causal effect estimation when missingness is informative, and is not detrimental when missingness is at random.
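
The three analysis strategies named in the abstract (complete case analysis, multiple imputation, and multiple imputation with a missing indicator term) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the data-generating model, the missingness mechanism, the bootstrap-based regression imputation, the parameter values, and all function names. It shows only how the indicator enters the analysis model after imputation, not the authors' simulation design.

```python
import numpy as np


def ols(design, y):
    """Least-squares coefficients for y ~ design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def simulate(n, rng, true_effect=1.0):
    """Hypothetical scenario: confounder x is missing with probability depending on x itself."""
    x = rng.normal(size=n)                            # confounder
    a = rng.binomial(1, 1 / (1 + np.exp(-x)))         # binary exposure, depends on x
    y = true_effect * a + x + rng.normal(size=n)      # continuous outcome
    r = rng.binomial(1, 1 / (1 + np.exp(-(x - 0.5)))) # r = 1 means x is missing
    x_obs = np.where(r == 1, np.nan, x)               # hide the missing values
    return x_obs, a, y, r


def impute_once(x_obs, a, y, r, rng):
    """One stochastic regression imputation of x from (a, y), fitted on a bootstrap of complete cases."""
    obs = np.flatnonzero(r == 0)
    boot = rng.choice(obs, size=obs.size, replace=True)
    d_boot = np.column_stack([np.ones(boot.size), a[boot], y[boot]])
    coef = ols(d_boot, x_obs[boot])
    sigma = np.std(x_obs[boot] - d_boot @ coef, ddof=3)
    mis = r == 1
    d_mis = np.column_stack([np.ones(mis.sum()), a[mis], y[mis]])
    x_imp = x_obs.copy()
    x_imp[mis] = d_mis @ coef + rng.normal(scale=sigma, size=mis.sum())
    return x_imp


def exposure_effect(y, a, x, r=None):
    """Coefficient of a in y ~ 1 + a + x, optionally adding the missing indicator r."""
    cols = [np.ones(y.size), a, x] + ([] if r is None else [r])
    return ols(np.column_stack(cols), y)[1]


def compare_methods(n=2000, m=20, seed=0):
    rng = np.random.default_rng(seed)
    x_obs, a, y, r = simulate(n, rng)
    keep = r == 0
    cca = exposure_effect(y[keep], a[keep], x_obs[keep])
    mi, mi_ind = [], []
    for _ in range(m):
        x_imp = impute_once(x_obs, a, y, r, rng)
        mi.append(exposure_effect(y, a, x_imp))
        mi_ind.append(exposure_effect(y, a, x_imp, r))
    # Point estimates are pooled by averaging over the m imputed data sets.
    return {
        "complete_case": float(cca),
        "mi": float(np.mean(mi)),
        "mi_indicator": float(np.mean(mi_ind)),
    }


if __name__ == "__main__":
    print(compare_methods())
```

Each returned value is an estimate of the exposure coefficient, which is set to 1.0 in this hypothetical data-generating model. A single run says little about bias; a simulation study in the sense of the abstract would repeat this over many seeds and scenarios and compare the averages.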

Identifiers

pubmed: 32640992
doi: 10.1186/s12874-020-01068-x
pii: 10.1186/s12874-020-01068-x
pmc: PMC7346454

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

185

Grants

Agency: Medical Research Council
ID: MR/T025085/1
Country: United Kingdom

Authors

Matthew Sperrin (M)

Faculty of Biology, Medicine and Health, Vaughan House, University of Manchester, Manchester, M13 9PL, UK. matthew.sperrin@manchester.ac.uk.

Glen P Martin (GP)

Faculty of Biology, Medicine and Health, Vaughan House, University of Manchester, Manchester, M13 9PL, UK.
