Paraphrasing to improve the performance of Electronic Health Records Question Answering.
Journal
AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science
ISSN: 2153-4063
Abbreviated title: AMIA Jt Summits Transl Sci Proc
Country: United States
NLM ID: 101539486
Publication information
Publication date:
2020
History:
entrez: 2 6 2020
pubmed: 2 6 2020
medline: 2 6 2020
Status:
epublish
Abstract
This paper describes a paraphrasing approach to improve the performance of question answering (QA) for electronic health records (EHRs). QA systems for structured EHR data usually rely on semantic parsing, which generates machine-understandable logical forms from free-text questions. Training semantic parsers requires large datasets of question-logical form (QL) pairs, which are labor-intensive to create. Given the scarcity of large QL datasets in the clinical domain, we propose a framework for expanding an existing dataset using paraphrasing. We experiment with different heuristics across multiple sample sizes and iterations to assess the effect of adding paraphrases on the task of semantic parsing. We found that adding paraphrases to an existing dataset based on TERTHRESHOLD scores improves performance in the majority (74%) of the experimental runs. Hence, the proposed paraphrasing-based framework has the potential to improve the performance of QA systems using a limited set of existing QL annotations.
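The expansion idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `word_edit_rate` and `expand_dataset`, the `max_rate` parameter, and the rule of keeping paraphrases whose score falls at or below a threshold are all assumptions made for illustration. The score is a simplified word-level edit rate (edit distance divided by reference length) that omits the block shifts a full translation edit rate computation would count.

```python
from typing import Dict, List, Tuple


def word_edit_rate(hypothesis: str, reference: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Simplified stand-in for an edit-rate score: block shifts are NOT modeled.
    """
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1] / len(ref)


def expand_dataset(
    pairs: List[Tuple[str, str]],
    paraphrases: Dict[str, List[str]],
    max_rate: float,
) -> List[Tuple[str, str]]:
    """Add paraphrased questions, each reusing its source question's logical form.

    Assumed selection rule (hypothetical): keep a paraphrase only if it differs
    from the source question and its edit rate does not exceed max_rate.
    """
    expanded = list(pairs)
    for question, logical_form in pairs:
        for candidate in paraphrases.get(question, []):
            rate = word_edit_rate(candidate, question)
            if 0 < rate <= max_rate:
                expanded.append((candidate, logical_form))
    return expanded
```

Each accepted paraphrase inherits the logical form of its source question, so the dataset grows without new manual annotation; where the paraphrases come from and how the threshold is chosen are left open here.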
Publication types
Journal Article
Languages
eng
Pagination
626-635
Grants
Agency: NLM NIH HHS
ID: R00 LM012104
Country: United States
Copyright information
©2020 AMIA - All rights reserved.