Leveraging electronic health records for data science: common pitfalls and how to avoid them.

Humans; Electronic Health Records; Data Science; Data Collection; Research Design; Routinely Collected Health Data

Journal

The Lancet. Digital health

ISSN: 2589-7500

Abbreviated title: Lancet Digit Health

Country: England

NLM ID: 101751302

Publication information

Publication date:
12 2022

History:

received: 16 02 2022

revised: 29 06 2022

accepted: 28 07 2022

pubmed: 27 9 2022

medline: 30 11 2022

entrez: 26 9 2022

Status: ppublish

Abstract

Analysis of electronic health records (EHRs) is an increasingly common approach for studying real-world patient data. Use of routinely collected data offers several advantages compared with other study designs, including reduced administrative costs, the ability to update analysis as practice patterns evolve, and larger sample sizes. Methodologically, EHR analysis is subject to distinct challenges because data are not collected for research purposes. In this Viewpoint, we elaborate on the importance of in-depth knowledge of clinical workflows and describe six potential pitfalls to be avoided when working with EHR data, drawing on examples from the literature and our experience. We propose solutions for prevention or mitigation of factors associated with each of these six pitfalls: sample selection bias, imprecise variable definitions, limitations to deployment, variable measurement frequency, subjective treatment allocation, and model overfitting. Ultimately, we hope that this Viewpoint will guide researchers to further improve the methodological robustness of EHR analysis.

Identifiers

DOI: 10.1016/S2589-7500(22)00154-6 PMID: 36154811

pii: S2589-7500(22)00154-6

Publication types

Journal Article; Review; Research Support, N.I.H., Extramural

Languages

eng

Pagination

e893-e898

Grants

Agency: NIBIB NIH HHS

ID: R01 EB017205

Country: United States

Conflict of interest statement

Declaration of interests: SLH is an employee of Microsoft Research (UK) and a board member of the non-profit organisation Association for Health Learning and Inference. All other authors declare no competing interests.

Authors

Christopher M Sauer (CM)

Li-Ching Chen (LC)

Stephanie L Hyland (SL)

Armand Girbes (A)

Paul Elbers (P)

Leo A Celi (LA)
