Enriching contextualized language model from knowledge graph for biomedical information extraction.
biomedical information extraction
knowledge graph
language model
neural network
Journal
Briefings in bioinformatics
ISSN: 1477-4054
Abbreviated title: Brief Bioinform
Country: England
NLM ID: 100912837
Publication information
Publication date:
20 05 2021
History:
received: 02 02 2020
revised: 05 05 2020
accepted: 07 05 2020
pubmed: 28 6 2020
medline: 23 11 2021
entrez: 28 6 2020
Status:
ppublish
Abstract
Biomedical information extraction (BioIE) is an important task. The aim is to analyze biomedical texts and extract structured information such as named entities and semantic relations between them. In recent years, pre-trained language models have largely improved the performance of BioIE. However, they neglect to incorporate external structural knowledge, which can provide rich factual information to support the underlying understanding and reasoning for biomedical information extraction. In this paper, we first evaluate current extraction methods, including vanilla neural networks, general language models and pre-trained contextualized language models, on biomedical information extraction tasks, including named entity recognition, relation extraction and event extraction. We then propose BioKGLM, which enriches a contextualized language model by integrating large-scale biomedical knowledge graphs. In order to effectively encode knowledge, we explore a three-stage training procedure and introduce different fusion strategies to facilitate knowledge injection. Experimental results on multiple tasks show that BioKGLM consistently outperforms state-of-the-art extraction models. A further analysis proves that BioKGLM can capture the underlying relations between biomedical knowledge concepts, which are crucial for BioIE.
Identifiers
pubmed: 32591802
pii: 5854405
doi: 10.1093/bib/bbaa110
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Review
Languages
eng
Citation subsets
IM
Copyright information
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.