Analyzing the symptoms in colorectal and breast cancer patients with or without type 2 diabetes using EHR data.
clinical decision-making
data mining
electronic health records
machine learning
text mining
Journal
Health informatics journal
ISSN: 1741-2811
Abbreviated title: Health Informatics J
Country: England
NLM ID: 100883604
Publication information
Publication date:
History:
entrez: 17 March 2021
pubmed: 18 March 2021
medline: 11 August 2021
Status:
ppublish
Abstract
This research extracted patient-reported symptoms from free-text EHR notes of colorectal and breast cancer patients and examined their association with comorbid type 2 diabetes, race, and smoking status. A natural language processing (NLP) framework built on UMLS MetaMap first extracted all symptom terms from the 366,398 EHR clinical notes of 1694 colorectal cancer (CRC) patients and 3458 breast cancer (BC) patients. Semantic analysis and clustering algorithms were then developed to categorize the relevant symptoms into eight symptom clusters defined by seed terms. The frequency of the symptoms reported by CRC and BC patients was then calculated over three time-periods post-chemotherapy. Logistic regression (LR) was performed with each symptom cluster as the response variable while controlling for diabetes, race, and smoking status. The results show that CRC and BC patients with type 2 diabetes (T2D) were more likely to report symptoms than CRC and BC patients without T2D over the three time-periods in the cancer trajectory. Current smokers were also more likely than non-smokers to report anxiety (CRC, BC), neuropathic symptoms (CRC, BC), and depression (BC).
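The seed-term categorization and per-period frequency count described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: a plain lookup table stands in for the MetaMap extraction and semantic clustering, and the seed terms, the period labels (`period_1`, `period_2`), and the function names `assign_cluster` and `cluster_frequencies` are all hypothetical.

```python
from collections import Counter

# Hypothetical seed terms; the abstract does not list the actual seeds
# or name all eight clusters.
SEED_TERMS = {
    "anxiety": {"anxiety", "anxious", "nervousness"},
    "depression": {"depression", "depressed mood"},
    "neuropathic": {"numbness", "tingling", "neuropathy"},
}


def assign_cluster(term):
    """Return the cluster whose seed set contains the term, or None."""
    normalized = term.strip().lower()
    for cluster, seeds in SEED_TERMS.items():
        if normalized in seeds:
            return cluster
    return None


def cluster_frequencies(mentions):
    """Count distinct patients reporting each cluster in each time period.

    mentions: iterable of (patient_id, period, symptom_term) tuples.
    """
    seen = set()
    for patient_id, period, term in mentions:
        cluster = assign_cluster(term)
        if cluster is not None:
            # A patient counts once per (period, cluster), however many
            # notes mention the symptom.
            seen.add((patient_id, period, cluster))
    return Counter((period, cluster) for _, period, cluster in seen)


# Made-up mentions for illustration only.
mentions = [
    ("p1", "period_1", "Anxiety"),
    ("p1", "period_1", "anxious"),
    ("p2", "period_1", "tingling"),
    ("p2", "period_2", "Numbness"),
    ("p3", "period_2", "headache"),
]
freq = cluster_frequencies(mentions)
```

Counting each patient once per period and cluster is one possible design choice; the abstract does not say whether frequencies were computed per patient or per note.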
Identifiers
pubmed: 33726552
doi: 10.1177/14604582211000785
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM