Extracting Smoking Status from Electronic Health Records Using NLP and Deep Learning.


Journal

AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science
ISSN: 2153-4063
Abbreviated title: AMIA Jt Summits Transl Sci Proc
Country: United States
NLM ID: 101539486

Publication information

Publication date:
2020
History:
entrez: 2 6 2020
pubmed: 2 6 2020
medline: 2 6 2020
Status: epublish

Abstract

Half a million people die every year from smoking-related issues across the United States. It is essential to identify individuals who are tobacco-dependent in order to implement preventive measures. In this study, we investigate the effectiveness of deep learning models in extracting the smoking status of patients from clinical progress notes. We built a natural language processing (NLP) pipeline that cleans the progress notes before they are processed by three deep neural networks: a CNN, a unidirectional LSTM, and a bidirectional LSTM. Each of these models was trained with a pre-trained or a post-trained word embedding layer. Three traditional machine learning models were also employed for comparison against the neural networks. Each model produced both binary and multi-class classifications. Our results showed that the CNN model with a pre-trained embedding layer performed best for both binary and multi-class classification.
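
The abstract does not say which cleaning steps the pipeline applies, so the following is only a minimal sketch of the kind of preprocessing that typically sits in front of an embedding layer: normalize each note, build a vocabulary, and encode notes as fixed-length integer sequences. The function names (clean_note, build_vocab, encode), the letters-only token rule, the reserved ids, and the two sample notes are all assumptions made for illustration; none of them comes from the paper.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # reserved ids (assumed): padding and out-of-vocabulary tokens


def clean_note(text):
    """Lowercase a note and keep only alphabetic tokens (assumed cleaning rule)."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocab(token_lists, min_count=1):
    """Map each token seen at least min_count times to an integer id >= 2."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    kept = sorted(t for t, c in counts.items() if c >= min_count)
    return {tok: i + 2 for i, tok in enumerate(kept)}


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(tok, UNK) for tok in tokens[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Invented sample notes, not taken from the study's data.
notes = [
    "Pt denies tobacco use.",
    "Current smoker, 1 pack/day.",
]
tokens = [clean_note(n) for n in notes]
vocab = build_vocab(tokens)
encoded = [encode(t, vocab, max_len=6) for t in tokens]
```

Each row of encoded has the same length, which is the shape an embedding layer in front of a CNN or LSTM would expect. A real pipeline for clinical text would likely need to handle negation, abbreviations, and numeric quantities, all of which this sketch discards.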

Identifiers

pubmed: 32477672
pmc: PMC7233082

Publication types

Journal Article

Languages

eng

Pagination

507-516

Grants

Agency: NCATS NIH HHS
ID: UL1 TR001420
Country: United States

Copyright information

©2020 AMIA - All rights reserved.

Authors

Suraj Rajendran (S)

Wake Forest University School of Medicine, Winston Salem, NC.
Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA.

Umit Topaloglu (U)

Wake Forest University School of Medicine, Winston Salem, NC.

MeSH classifications