Integrating shortest dependency path and sentence sequence into a deep learning framework for relation extraction in clinical text.
Relation extraction - deep learning
Shortest dependency path
Journal
BMC medical informatics and decision making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682
Publication information
Publication date:
31 01 2019
History:
entrez: 1 2 2019
pubmed: 1 2 2019
medline: 4 7 2019
Status:
epublish
Abstract
Abstract sections
BACKGROUND
Extracting relations between important clinical entities is critical but very challenging for natural language processing (NLP) in the medical domain. Researchers have applied deep learning-based approaches to clinical relation extraction, but most of them consider the sentence sequence only, without modeling syntactic structures. The aim of this study was to use a deep neural network to capture syntactic features and further improve the performance of relation extraction in clinical notes.
METHODS
We propose a novel neural approach that models the shortest dependency path (SDP) between target entities together with the sentence sequence for clinical relation extraction. Our neural network architecture consists of three modules: (1) a sentence sequence representation module that uses a bidirectional long short-term memory network (Bi-LSTM) to capture features in the sentence sequence; (2) an SDP representation module that uses a convolutional neural network (CNN) and a Bi-LSTM network to capture the syntactic context of the target entities from SDP information; and (3) a classification module that uses a fully connected layer with a softmax function to classify the relation type between target entities.
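As an illustration of what a shortest dependency path is, the sketch below finds the path between two entity tokens by breadth-first search over a dependency parse treated as an undirected graph. It is a minimal sketch, not the authors' implementation: the function name `shortest_dependency_path`, the example sentence, and the hand-written parse edges are all assumptions made for illustration.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return token indices on the shortest path from source to target.

    edges: iterable of (head_index, dependent_index) pairs from a
    dependency parse. The parse is treated as an undirected graph.
    Returns an empty list if the two tokens are not connected.
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)

    # Breadth-first search, remembering how each node was reached.
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in graph.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)

    if target not in previous:
        return []

    # Walk back from the target to the source, then reverse.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]


# Hypothetical sentence with a hand-written (assumed) dependency parse.
tokens = ["The", "patient", "was", "given", "aspirin", "for", "chest", "pain"]
edges = [(3, 1), (1, 0), (3, 2), (3, 4), (3, 7), (7, 5), (7, 6)]

path = shortest_dependency_path(edges, 4, 7)  # "aspirin" to "pain"
sdp_tokens = [tokens[i] for i in path]
print(sdp_tokens)
```

In this toy parse the path between the two entity tokens passes only through the verb that links them, which is the kind of compact syntactic context an SDP representation module would receive as input.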
RESULTS
Using the 2010 i2b2/VA relation extraction dataset, we compared our approach with other baseline methods. Our experimental results show that the proposed approach achieved significant improvements over comparable existing methods, demonstrating the effectiveness of utilizing syntactic structures in deep learning-based relation extraction. The F-measure of our method reaches 74.34%, which is 2.5% higher than that of the method without syntactic features.
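For readers unfamiliar with the metric, the F-measure is conventionally the harmonic mean of precision and recall. The sketch below shows that standard definition; the abstract does not state the averaging scheme used, and the function name `f_measure` and the counts in the example are hypothetical, not results from the study.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Standard F1 score: harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the calculation.
score = f_measure(true_positives=8, false_positives=2, false_negatives=2)
print(f"F-measure: {score:.2%}")
```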
CONCLUSIONS
We propose a new neural network architecture that models the SDP along with the sentence sequence to extract multiple relations from clinical text. Our experimental results show that the proposed approach significantly improves performance on clinical notes, demonstrating the effectiveness of syntactic structures in deep learning-based relation extraction.
Identifiers
pubmed: 30700301
doi: 10.1186/s12911-019-0736-9
pii: 10.1186/s12911-019-0736-9
pmc: PMC6354333
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
22
Grants
Agency: NCI NIH HHS
ID: U24 CA194215
Country: United States