Paraphrasing to improve the performance of Electronic Health Records Question Answering.


Journal

AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science
ISSN: 2153-4063
Abbreviated title: AMIA Jt Summits Transl Sci Proc
Country: United States
NLM ID: 101539486

Publication information

Publication date:
2020
History:
entrez: 2 6 2020
pubmed: 2 6 2020
medline: 2 6 2020
Status: epublish

Abstract

This paper describes a paraphrasing approach to improve the performance of question answering (QA) for electronic health records (EHRs). QA systems for structured EHR data usually rely on semantic parsing, which aims to generate machine-understandable logical forms from free-text questions. Training semantic parsers requires large datasets of question-logical form (QL) pairs, which are labor-intensive to create. Considering the scarcity of large QL datasets in the clinical domain, we propose a framework for expanding an existing dataset using paraphrasing. We experiment with different heuristics for multiple sample sizes and iterations to assess the effect of adding paraphrasing to the task of semantic parsing. We found that adding paraphrases to an existing dataset based on TERTHRESHOLD scores results in an improved performance in the majority (74%) of the experimental runs. Hence, the proposed paraphrasing-based framework has the potential to improve the performance of QA systems using a limited set of existing QL annotations.

Identifiers

pubmed: 32477685
pmc: PMC7233085

Publication types

Journal Article

Languages

eng

Pagination

626-635

Grants

Agency: NLM NIH HHS
ID: R00 LM012104
Country: United States

Copyright information

©2020 AMIA - All rights reserved.

References

J Biomed Inform. 2017 Mar;67:69-79
pubmed: 28088527
LREC Int Conf Lang Resour Eval. 2016 May;2016:3772-3778
pubmed: 28503677
AMIA Annu Symp Proc. 2018 Apr 16;2017:1478-1487
pubmed: 29854217
AMIA Annu Symp Proc. 2020 Mar 04;2019:1207-1215
pubmed: 32308918

Authors

Sarvesh Soni (S)

School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston TX, USA.

Kirk Roberts (K)

School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston TX, USA.

MeSH classifications