Leveraging electronic health records for data science: common pitfalls and how to avoid them.


Journal

The Lancet. Digital health
ISSN: 2589-7500
Abbreviated title: Lancet Digit Health
Country: England
NLM ID: 101751302

Publication information

Publication date:
12 2022
History:
received: 16 02 2022
revised: 29 06 2022
accepted: 28 07 2022
pubmed: 27 9 2022
medline: 30 11 2022
entrez: 26 9 2022
Status: ppublish

Abstract

Analysis of electronic health records (EHRs) is an increasingly common approach for studying real-world patient data. Use of routinely collected data offers several advantages compared with other study designs, including reduced administrative costs, the ability to update analysis as practice patterns evolve, and larger sample sizes. Methodologically, EHR analysis is subject to distinct challenges because data are not collected for research purposes. In this Viewpoint, we elaborate on the importance of in-depth knowledge of clinical workflows and describe six potential pitfalls to be avoided when working with EHR data, drawing on examples from the literature and our experience. We propose solutions for prevention or mitigation of factors associated with each of these six pitfalls: sample selection bias, imprecise variable definitions, limitations to deployment, variable measurement frequency, subjective treatment allocation, and model overfitting. Ultimately, we hope that this Viewpoint will guide researchers to further improve the methodological robustness of EHR analysis.

Identifiers

pubmed: 36154811
pii: S2589-7500(22)00154-6
doi: 10.1016/S2589-7500(22)00154-6

Publication types

Journal Article; Review; Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

e893-e898

Grants

Agency: NIBIB NIH HHS
ID: R01 EB017205
Country: United States

Copyright information

Copyright © 2022 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.

Conflict of interest statement

Declaration of interests: SLH is an employee of Microsoft Research (UK) and a board member of the non-profit organisation Association for Health Learning and Inference. All other authors declare no competing interests.

Authors

Christopher M Sauer (CM)

Laboratory for Critical Care Computational Intelligence, Department of Intensive Care Medicine, Amsterdam Medical Data Science, Amsterdam Cardiovascular Science, Amsterdam Institute for Infection and Immunity, Amsterdam UMC, Location VUmc, Amsterdam, Netherlands; Laboratory for Computational Physiology, Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA. Electronic address: sauerc@mit.edu.

Li-Ching Chen (LC)

Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan.

Stephanie L Hyland (SL)

Microsoft Research, Cambridge, UK.

Armand Girbes (A)

Laboratory for Critical Care Computational Intelligence, Department of Intensive Care Medicine, Amsterdam Medical Data Science, Amsterdam Cardiovascular Science, Amsterdam Institute for Infection and Immunity, Amsterdam UMC, Location VUmc, Amsterdam, Netherlands.

Paul Elbers (P)

Laboratory for Critical Care Computational Intelligence, Department of Intensive Care Medicine, Amsterdam Medical Data Science, Amsterdam Cardiovascular Science, Amsterdam Institute for Infection and Immunity, Amsterdam UMC, Location VUmc, Amsterdam, Netherlands.

Leo A Celi (LA)

Laboratory for Computational Physiology, Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.
