Integrating shortest dependency path and sentence sequence into a deep learning framework for relation extraction in clinical text.


Journal

BMC medical informatics and decision making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682

Publication information

Publication date:
2019-01-31
History:
entrez: 2019-02-01
pubmed: 2019-02-01
medline: 2019-07-04
Status: epublish

Structured abstract

BACKGROUND
Extracting relations between important clinical entities is critical but very challenging for natural language processing (NLP) in the medical domain. Researchers have applied deep learning-based approaches to clinical relation extraction, but most of them consider the sentence sequence only, without modeling syntactic structures. The aim of this study was to use a deep neural network to capture syntactic features and further improve the performance of relation extraction in clinical notes.
METHODS
We propose a novel neural approach that models the shortest dependency path (SDP) between target entities together with the sentence sequence for clinical relation extraction. Our neural network architecture consists of three modules: (1) a sentence sequence representation module that uses a bidirectional long short-term memory network (Bi-LSTM) to capture features in the sentence sequence; (2) an SDP representation module that uses a convolutional neural network (CNN) and a Bi-LSTM network to capture the syntactic context of the target entities from SDP information; and (3) a classification module that uses a fully connected layer with a softmax function to classify the relation type between target entities.
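As an illustration of the SDP idea only, the sketch below finds the shortest path between two entity tokens by breadth-first search over an undirected view of a dependency tree. It is not the authors' implementation: the function name `shortest_dependency_path`, the example sentence, and its hand-written dependency edges are all hypothetical, and tokens are used directly as node identifiers, which assumes no word repeats in the sentence.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the list of tokens on the shortest path from source to target.

    edges: iterable of (head, dependent) pairs from a dependency parse.
    The tree is treated as undirected, so the path may go up and down.
    Returns None when no path exists.
    """
    # Build an undirected adjacency list from the (head, dependent) pairs.
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)

    # Breadth-first search, remembering each node's predecessor.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical, hand-written parse of:
# "The patient was given aspirin for chest pain"
edges = [
    ("given", "patient"),
    ("patient", "The"),
    ("given", "was"),
    ("given", "aspirin"),
    ("given", "pain"),
    ("pain", "for"),
    ("pain", "chest"),
]

sdp = shortest_dependency_path(edges, "aspirin", "pain")
print(sdp)  # ['aspirin', 'given', 'pain']
```

In a pipeline like the one described, a path of this kind would be the input to the SDP representation module, while the full token sequence would feed the sentence sequence module.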
RESULTS
Using the 2010 i2b2/VA relation extraction dataset, we compared our approach with other baseline methods. Our experimental results show that the proposed approach achieved significant improvements over comparable existing methods, demonstrating the effectiveness of utilizing syntactic structures in deep learning-based relation extraction. The F-measure of our method reaches 74.34%, which is 2.5% higher than that of the method without syntactic features.
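For readers unfamiliar with the metric, the following sketch shows how an F-measure can be computed from true positive, false positive, and false negative counts, and how counts can be pooled across relation types (micro-averaging). Whether the study used micro-averaging is an assumption here, and the relation names and counts are invented for illustration; they are not taken from the paper.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from raw counts; beta=1 gives the usual F1."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def micro_f_measure(counts, beta=1.0):
    """Pool (tp, fp, fn) over all relation types, then compute one score."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f_measure(tp, fp, fn, beta)


# Hypothetical per-relation counts as (tp, fp, fn); not results from the study.
counts = {
    "relation_a": (8, 2, 2),
    "relation_b": (2, 2, 4),
}

micro_f1 = micro_f_measure(counts)
print(f"micro F1 = {micro_f1:.4f}")
```

Pooling the counts before scoring weights each relation type by its frequency, so frequent relation types dominate the result; macro-averaging (the mean of per-type scores) would weight all types equally.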
CONCLUSIONS
We propose a new neural network architecture that models the SDP along with the sentence sequence to extract multiple relations from clinical text. Our experimental results show that the proposed approach significantly improves performance on clinical notes, demonstrating the effectiveness of syntactic structures in deep learning-based relation extraction.

Identifiers

pubmed: 30700301
doi: 10.1186/s12911-019-0736-9
pii: 10.1186/s12911-019-0736-9
pmc: PMC6354333

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

22

Grants

Agency: NCI NIH HHS
ID: U24 CA194215
Country: United States

Authors

Zhiheng Li (Z)

School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China.

Zhihao Yang (Z)

School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China.

Chen Shen (C)

School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China.

Jun Xu (J)

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.

Yaoyun Zhang (Y)

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.

Hua Xu (H)

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA. hua.xu@uth.tmc.edu.
