Detecting Hypoglycemia Incidents Reported in Patients' Secure Messages: Using Cost-Sensitive Learning and Oversampling to Reduce Data Imbalance.
adverse event detection
drug-related side effects and adverse reactions
hypoglycemia
imbalanced data
natural language processing
secure messaging
supervised machine learning
Journal
Journal of Medical Internet Research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882
Publication information
Publication date: 11 03 2019
History:
received: 21 08 2018
accepted: 10 02 2019
revised: 19 01 2019
entrez: 12 3 2019
pubmed: 12 3 2019
medline: 8 2 2020
Status:
epublish
Abstract
BACKGROUND
Improper dosing of medications such as insulin can cause hypoglycemic episodes, which may lead to severe morbidity or even death. Although secure messaging was designed for exchanging nonurgent messages, patients sometimes report hypoglycemia events through secure messaging. Detecting these patient-reported adverse events may help alert clinical teams and enable early corrective actions to improve patient safety.
OBJECTIVE
We aimed to develop a natural language processing system, called HypoDetect (Hypoglycemia Detector), to automatically identify hypoglycemia incidents reported in patients' secure messages.
METHODS
An expert in public health annotated 3000 secure message threads between patients with diabetes and US Department of Veterans Affairs clinical teams as containing patient-reported hypoglycemia incidents or not. A physician independently annotated 100 threads randomly selected from this dataset to determine interannotator agreement. We used this dataset to develop and evaluate HypoDetect. HypoDetect incorporates 3 machine learning algorithms widely used for text classification: linear support vector machines, random forest, and logistic regression. We explored different learning features, including new knowledge-driven features. Because only 114 (3.80%) messages were annotated as positive, we investigated cost-sensitive learning and oversampling methods to mitigate the challenge of imbalanced data.
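The two imbalance remedies named above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the HypoDetect implementation: the abstract does not say how class costs were set, so the "balanced" heuristic (n_samples / (n_classes * class_count)) is an assumption, and the oversampler shows only the basic SMOTE interpolation step rather than the ensembled variant the authors used. The function names and the toy feature vectors are hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def smote_like_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Label distribution from the abstract: 114 positive threads out of 3000.
labels = [1] * 114 + [0] * 2886
weights = balanced_class_weights(labels)

# Toy 2-D minority vectors, made up for illustration only.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_oversample(minority, n_new=6, k=2)
```

Under this heuristic a misclassified positive thread costs roughly 25 times as much as a misclassified negative one, which is the ratio of the two class counts. Synthetic points should be generated inside each training fold only, so that no interpolated copy of a test example leaks into training.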
RESULTS
The interannotator agreement was Cohen kappa=0.976. Using cross-validation, logistic regression with cost-sensitive learning achieved the best performance (area under the receiver operating characteristic curve=0.954, sensitivity=0.693, specificity=0.974, F1 score=0.590). Cost-sensitive learning and the ensembled synthetic minority oversampling technique substantially improved the sensitivity of the baseline systems (absolute gains of 0.123 to 0.728). Our results show that a variety of features contributed to the best performance of HypoDetect.
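For readers who want to see how the reported measures relate to one another, the sketch below computes sensitivity, specificity, and F1 from a confusion matrix, and Cohen kappa from two annotators' labels. The confusion-matrix counts are hypothetical: they were chosen only so that, with 114 positives among 3000 threads, the ratios land near the reported values, and they are not taken from the paper. The annotator labels are likewise toy data.

```python
from collections import Counter


def sensitivity(tp, fn):
    """Recall on the positive class: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Recall on the negative class: TN / (TN + FP)."""
    return tn / (tn + fp)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def cohen_kappa(a, b):
    """Agreement between two label lists, corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts (not from the paper): 114 positives, 2886 negatives.
tp, fn, tn, fp = 79, 35, 2811, 75

sens = sensitivity(tp, fn)   # about 0.693
spec = specificity(tn, fp)   # about 0.974
f1 = f1_score(tp, fp, fn)    # about 0.590

# Toy annotations from two raters, for illustration only.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])  # 0.5
```

The example makes the effect of imbalance visible: even at a specificity near 0.974, the false positives drawn from 2886 negative threads are about as numerous as the true positives, which is why the F1 score sits well below the sensitivity.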
CONCLUSIONS
Despite the challenge of data imbalance, HypoDetect achieved promising results in detecting hypoglycemia incidents from secure messages. The system has great potential to facilitate early detection and treatment of hypoglycemia.
Identifiers
pubmed: 30855231
pii: v21i3e11990
doi: 10.2196/11990
pmc: PMC6431826
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e11990
Grants
Agency: NCI NIH HHS
ID: R25 CA172009
Country: United States
Copyright information
©Jinying Chen, John Lalor, Weisong Liu, Emily Druhl, Edgard Granillo, Varsha G Vimalananda, Hong Yu. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 11.03.2019.