Adapting Bidirectional Encoder Representations from Transformers (BERT) to Assess Clinical Semantic Textual Similarity: Algorithm Development and Validation Study.

Keywords: National NLP Clinical Challenges; Natural Language Processing; clinical text mining; semantic textual similarity

Journal

JMIR Medical Informatics

ISSN: 2291-9694

Abbreviated title: JMIR Med Inform

Country: Canada

NLM ID: 101645109

Publication information

Publication date:
03 Feb 2021

History:

received: 28 Jul 2020

accepted: 22 Dec 2020

revised: 03 Dec 2020

entrez: 03 Feb 2021

pubmed: 04 Feb 2021

medline: 04 Feb 2021

Status: epublish

Abstract

BACKGROUND

Natural Language Understanding enables automatic extraction of relevant information from clinical text data, which are acquired every day in hospitals. In 2018, the language model Bidirectional Encoder Representations from Transformers (BERT) was introduced, generating new state-of-the-art results on several downstream tasks. The National NLP Clinical Challenges (n2c2) is an initiative that strives to tackle such downstream tasks on domain-specific clinical data. In this paper, we present the results of our participation in the 2019 n2c2 and related work completed thereafter.

OBJECTIVE

The objective of this study was to optimally leverage BERT for the task of assessing the semantic textual similarity of clinical text data.

METHODS

We used BERT as an initial baseline and analyzed the results, which we used as a starting point to develop 3 different approaches where we (1) added handcrafted sentence similarity features to the classifier token of BERT and combined the results with more features in multiple regression estimators, (2) incorporated a built-in ensembling method, M-Heads, into BERT by duplicating the regression head and applying an adapted training strategy to facilitate the focus of the heads on different input patterns of the medical sentences, and (3) developed a graph-based similarity approach for medications, which allows extrapolating similarities across known entities from the training set. The approaches were evaluated with the Pearson correlation coefficient between the predicted scores and ground truth of the official training and test datasets.

RESULTS

We improved the performance of BERT on the test dataset from a Pearson correlation coefficient of 0.859 to 0.883 using a combination of the M-Heads method and the graph-based similarity approach. We also show differences between the test and training datasets and how the two datasets influenced the results.

CONCLUSIONS

We found that using a graph-based similarity approach has the potential to extrapolate domain-specific knowledge to unseen sentences. We observed that it is easy to obtain deceptive results from the test dataset, especially when the distribution of the data samples differs between the training and test datasets.
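The evaluation metric named in the abstract, the Pearson correlation coefficient between predicted scores and ground truth, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `pearson` and the example score lists are hypothetical, and the values are made up purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Hypothetical helper for illustration only; not taken from the study.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up similarity scores for five sentence pairs (illustrative values only).
gold_scores = [0.0, 1.5, 3.0, 4.5, 5.0]
predicted_scores = [0.4, 1.2, 2.9, 4.0, 4.8]

print(round(pearson(gold_scores, predicted_scores), 3))
```

Because the coefficient measures linear agreement rather than absolute error, a model whose predictions are consistently shifted or rescaled can still score highly, which is one reason results should be read alongside the score distributions of the datasets being compared.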


Identifiers

DOI: 10.2196/22795

PMID: 33533728

PMC: PMC7889424

PII: v9i2e22795

Publication types

Journal Article

Languages

eng

Pagination

e22795

Copyright information

©Klaus Kades, Jan Sellner, Gregor Koehler, Peter M Full, T Y Emmy Lai, Jens Kleesiek, Klaus H Maier-Hein. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 03.02.2021.



Authors

Klaus Kades (K)

Jan Sellner (J)

Gregor Koehler (G)

Peter M Full (PM)

T Y Emmy Lai (TYE)

Jens Kleesiek (J)

Klaus H Maier-Hein (KH)

MeSH classifications