Automated Electronic Health Record-Based Tool for Identification of Patients With Metastatic Disease to Facilitate Clinical Trial Patient Ascertainment.


Journal

JCO clinical cancer informatics
ISSN: 2473-4276
Abbreviated title: JCO Clin Cancer Inform
Country: United States
NLM ID: 101708809

Publication information

Publication date:
06 2021
History:
entrez: 1 7 2021
pubmed: 2 7 2021
medline: 3 11 2021
Status: ppublish

Abstract

To facilitate identification of candidates for clinical trial participation, we developed a machine learning tool that automates the determination of a patient's metastatic status on the basis of unstructured electronic health record (EHR) data.

This tool scans EHR documents, extracting text snippet features surrounding key words (such as metastatic, progression, and local). A regularized logistic regression model was trained and used to classify patients across five metastatic categories: highly likely positive, likely positive, highly likely negative, likely negative, and unknown. Using a real-world oncology database of patients with solid tumors with manually abstracted information as reference, we calculated sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). We validated the performance in a real-world data set, evaluating accuracy gains upon additional user review of the tool's outputs after integration into clinic workflows.

In the training data set (N = 66,532), the model sensitivity and specificity (% [95% CI]) were 82.4 [81.9 to 83.0] and 95.5 [95.3 to 96.7], respectively; the PPV was 89.3 [88.8 to 90.0], and the NPV was 94.0 [93.8 to 94.2]. In the validation sample (n = 200 from five distinct care sites), after user review of model outputs, values increased to 97.1 [85.1 to 99.9] for sensitivity, 98.2 [94.8 to 99.6] for specificity, 91.9 [78.1 to 98.3] for PPV, and 99.4 [96.6 to 100.0] for NPV. The model assigned 163 of 200 patients to the highly likely categories. The error prevalence was 4% before and 2% after user review.

This tool infers metastatic status from unstructured EHR data with high accuracy and high confidence in more than 75% of cases, without requiring additional manual review. By enabling efficient characterization of metastatic status, this tool could mitigate a key barrier for patient ascertainment and clinical trial participation in community clinics.
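
The four reported measures all follow from a 2x2 confusion matrix, and a minimal Python sketch may make the arithmetic concrete. The counts used below (34 true positives, 3 false positives, 162 true negatives, 1 false negative) are not reported in the abstract; they are an assumption, chosen because they are one set of counts summing to 200 that reproduces the post-review validation percentages and the 2% error prevalence. The function names are hypothetical, and the Wilson score interval is used only as an illustration; the abstract does not say which interval method the authors applied.

```python
from math import sqrt


def confusion_metrics(tp, fp, tn, fn):
    """Return the standard diagnostic measures for a 2x2 confusion matrix."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),  # share of true metastatic cases flagged
        "specificity": tn / (tn + fp),  # share of non-metastatic cases cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are correct
        "npv": tn / (tn + fn),          # share of negative calls that are correct
        "error_rate": (fp + fn) / total,
    }


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (illustrative choice of method)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Assumed counts, inferred from the reported percentages (not given in the abstract).
metrics = confusion_metrics(tp=34, fp=3, tn=162, fn=1)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")

# Interval for sensitivity: 34 of the 35 assumed metastatic patients were identified.
low, high = wilson_interval(34, 35)
print(f"sensitivity 95% CI (Wilson): {100 * low:.1f}% to {100 * high:.1f}%")
```

With these assumed counts the printed point estimates match the post-review figures in the abstract (97.1, 98.2, 91.9, 99.4, and 2.0). The Wilson interval will not match the published bounds exactly, since a different interval method gives slightly different limits for small samples.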

Identifiers

pubmed: 34197178
doi: 10.1200/CCI.20.00180

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

719-727

Authors

Jeffrey Kirshner (J)

Hematology Oncology Associates of Central New York, East Syracuse, NY.

Kelly Cohn (K)

Hematology Oncology Associates of Central New York, East Syracuse, NY.

Steven Dunder (S)

Southeast Nebraska Cancer Center, Lincoln, NE.

Karri Donahue (K)

Southeast Nebraska Cancer Center, Lincoln, NE.

Madeline Richey (M)

Flatiron Health Inc, New York, NY.

Peter Larson (P)

Flatiron Health Inc, New York, NY.

Lauren Sutton (L)

Flatiron Health Inc, New York, NY.

Evelyn Siu (E)

Flatiron Health Inc, New York, NY.

Janet Donegan (J)

Flatiron Health Inc, New York, NY.

Zexi Chen (Z)

Flatiron Health Inc, New York, NY.

Caroline Nightingale (C)

Flatiron Health Inc, New York, NY.

Melissa Estévez (M)

Flatiron Health Inc, New York, NY.

H James Hamrick (HJ)

Flatiron Health Inc, New York, NY.
