Deep neural model with self-training for scientific keyphrase extraction.

Deep Learning; Information Storage and Retrieval / methods; Models, Statistical; Natural Language Processing; Neural Networks, Computer; Publications

Journal

PloS one

ISSN: 1932-6203

Abbreviated title: PLoS One

Country: United States

NLM ID: 101285081

Publication information

Publication date:
2020

History:

received: 2019-12-27

accepted: 2020-04-16

entrez: 2020-05-16

pubmed: 2020-05-16

medline: 2020-07-29

Status: epublish

Abstract

Scientific information extraction is a crucial step for understanding scientific publications. In this paper, we focus on scientific keyphrase extraction, which aims to identify keyphrases in scientific articles and classify them into predefined categories. We present a neural network-based approach to this task that uses a bidirectional long short-term memory (LSTM) network to represent the sentences in an article. On top of the bidirectional LSTM layer, a conditional random field (CRF) predicts the label sequence for the whole sentence. Because annotated data for supervised learning is expensive to obtain, we introduce a self-training method into our neural model to leverage unlabeled articles. Experimental results on the ScienceIE corpus and the ACL keyphrase corpus show that our neural model achieves promising performance without any hand-designed features or external knowledge resources. Furthermore, it efficiently incorporates the unlabeled data and achieves competitive performance compared with previous state-of-the-art systems.
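The self-training idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how confident predictions are selected, so the confidence threshold, the number of rounds, the names `self_train`, `toy_train`, and `toy_predict`, and the `B-KP`/`I-KP` tag set are all assumptions made for illustration. A lexicon lookup stands in for the BiLSTM-CRF tagger so that the loop runs without any dependencies.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Tokens = List[str]
Labels = List[str]
Example = Tuple[Tokens, Labels]


def self_train(train_fn: Callable, predict_fn: Callable,
               labeled: List[Example], unlabeled: List[Tokens],
               threshold: float = 0.9, rounds: int = 3):
    """Generic self-training loop (illustrative, not the paper's exact procedure).

    train_fn(examples) -> model
    predict_fn(model, tokens) -> (labels, confidence in [0, 1])
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train_fn(labeled)
    for _ in range(rounds):
        confident, remaining = [], []
        for tokens in pool:
            labels, conf = predict_fn(model, tokens)
            if conf >= threshold:
                # Treat the confident prediction as a pseudo-label.
                confident.append((tokens, labels))
            else:
                remaining.append(tokens)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
        model = train_fn(labeled)  # retrain on gold + pseudo-labeled data
    return model, labeled, pool


def toy_train(examples: List[Example]) -> Dict[str, str]:
    """Hypothetical stand-in tagger: map each token to its most frequent label."""
    counts = defaultdict(Counter)
    for tokens, labels in examples:
        for tok, lab in zip(tokens, labels):
            counts[tok][lab] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def toy_predict(model: Dict[str, str], tokens: Tokens):
    """Label unknown tokens "O"; confidence is the share of known tokens."""
    labels = [model.get(tok, "O") for tok in tokens]
    known = sum(tok in model for tok in tokens)
    conf = known / len(tokens) if tokens else 0.0
    return labels, conf
```

In a real system, `train_fn` would fit the neural tagger and `predict_fn` would return the decoded label sequence together with some sequence-level confidence score; how that score is computed is a design choice the abstract leaves open.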

Identifiers

DOI: 10.1371/journal.pone.0232547 PMID: 32413094 PMC: PMC7228065

pubmed: 32413094

doi: 10.1371/journal.pone.0232547

pii: PONE-D-19-35793

pmc: PMC7228065

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

e0232547

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Xun Zhu

Chen Lyu

Donghong Ji

Han Liao

Fei Li
