Adapting Bidirectional Encoder Representations from Transformers (BERT) to Assess Clinical Semantic Textual Similarity: Algorithm Development and Validation Study.

National NLP Clinical Challenges; Natural Language Processing; clinical text mining; semantic textual similarity

Journal

JMIR medical informatics
ISSN: 2291-9694
Abbreviated title: JMIR Med Inform
Country: Canada
NLM ID: 101645109

Publication information

Publication date:
03 Feb 2021
History:
received: 28 Jul 2020
accepted: 22 Dec 2020
revised: 03 Dec 2020
entrez: 3 Feb 2021
pubmed: 4 Feb 2021
medline: 4 Feb 2021
Status: epublish

Abstract

Natural Language Understanding enables automatic extraction of relevant information from clinical text data, which are acquired every day in hospitals. In 2018, the language model Bidirectional Encoder Representations from Transformers (BERT) was introduced, generating new state-of-the-art results on several downstream tasks. The National NLP Clinical Challenges (n2c2) is an initiative that strives to tackle such downstream tasks on domain-specific clinical data. In this paper, we present the results of our participation in the 2019 n2c2 and related work completed thereafter.
The objective of this study was to optimally leverage BERT for the task of assessing the semantic textual similarity of clinical text data.
We used BERT as an initial baseline and analyzed the results, which we used as a starting point to develop 3 different approaches where we (1) added handcrafted sentence similarity features to the classifier token of BERT and combined the results with more features in multiple regression estimators, (2) incorporated a built-in ensembling method, M-Heads, into BERT by duplicating the regression head and applying an adapted training strategy to facilitate the focus of the heads on different input patterns of the medical sentences, and (3) developed a graph-based similarity approach for medications, which allows extrapolating similarities across known entities from the training set. The approaches were evaluated with the Pearson correlation coefficient between the predicted scores and the ground truth of the official training and test datasets.
We improved the performance of BERT on the test dataset from a Pearson correlation coefficient of 0.859 to 0.883 using a combination of the M-Heads method and the graph-based similarity approach. We also show differences between the test and training datasets and how the two datasets influenced the results.
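The evaluation metric named above, the Pearson correlation coefficient between predicted and ground-truth similarity scores, can be illustrated with a minimal sketch. The function name `pearson` and the example scores are hypothetical and do not come from the study; in practice a library routine such as `scipy.stats.pearsonr` would normally be used instead.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gold and predicted similarity scores, for illustration only.
gold = [0.0, 1.5, 3.0, 4.5, 5.0]
pred = [0.4, 1.2, 3.3, 4.1, 4.9]
r = pearson(gold, pred)
print(f"Pearson r = {r:.3f}")
```

Because the coefficient is invariant to shifting and positive rescaling of either sequence, it rewards a model for ranking and spacing sentence pairs correctly rather than for matching the absolute score values.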
We found that using a graph-based similarity approach has the potential to extrapolate domain-specific knowledge to unseen sentences. We observed that it is easy to obtain deceptive results from the test dataset, especially when the distribution of the data samples differs between the training and test datasets.
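One simple way to check for the kind of distribution difference mentioned above is to compare score histograms of the two datasets. The sketch below is an illustration under stated assumptions, not the analysis performed in the study: the helper names `score_histogram` and `total_variation`, the assumed score range of 0 to 5, the bin count, and the example scores are all hypothetical.

```python
def score_histogram(scores, n_bins=5, lo=0.0, hi=5.0):
    """Normalized histogram of scores over [lo, hi] (assumed score range)."""
    if not scores:
        raise ValueError("scores must not be empty")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in scores:
        if not lo <= s <= hi:
            raise ValueError(f"score {s} outside [{lo}, {hi}]")
        # The top edge (s == hi) is placed in the last bin.
        idx = min(int((s - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(scores) for c in counts]


def total_variation(p, q):
    """Total variation distance between two histograms with the same bins."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical similarity scores, for illustration only.
train_scores = [0.5, 1.0, 3.5, 4.0, 4.5, 4.5]
test_scores = [0.0, 0.5, 1.0, 1.5, 2.0, 4.0]

p_train = score_histogram(train_scores)
p_test = score_histogram(test_scores)
tv = total_variation(p_train, p_test)
print(f"total variation distance = {tv:.2f}")
```

A distance near 0 suggests similar score distributions, while a value near 1 indicates that the two datasets cover largely different score ranges, in which case a test-set correlation may say little about performance on training-like data.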

Identifiers

pubmed: 33533728
pii: v9i2e22795
doi: 10.2196/22795
pmc: PMC7889424

Publication types

Journal Article

Languages

eng

Pagination

e22795

Copyright information

©Klaus Kades, Jan Sellner, Gregor Koehler, Peter M Full, T Y Emmy Lai, Jens Kleesiek, Klaus H Maier-Hein. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 03.02.2021.

Authors

Klaus Kades (K)

German Cancer Research Center (DKFZ), Heidelberg, Germany.
Partner Site Heidelberg, German Cancer Consortium (DKTK), Heidelberg, Germany.

Jan Sellner (J)

German Cancer Research Center (DKFZ), Heidelberg, Germany.
Helmholtz Information and Data Science School for Health, Karlsruhe/Heidelberg, Germany.

Gregor Koehler (G)

German Cancer Research Center (DKFZ), Heidelberg, Germany.

Peter M Full (PM)

German Cancer Research Center (DKFZ), Heidelberg, Germany.
Heidelberg University, Heidelberg, Germany.

T Y Emmy Lai (TYE)

German Cancer Research Center (DKFZ), Heidelberg, Germany.
Hochschule Mannheim, University of Applied Sciences, Mannheim, Germany.

Jens Kleesiek (J)

German Cancer Research Center (DKFZ), Heidelberg, Germany.
Partner Site Heidelberg, German Cancer Consortium (DKTK), Heidelberg, Germany.
Helmholtz Information and Data Science School for Health, Karlsruhe/Heidelberg, Germany.
Institute for Artificial Intelligence in Medicine (IKIM), University Medicine Essen, Essen, Germany.

Klaus H Maier-Hein (KH)

German Cancer Research Center (DKFZ), Heidelberg, Germany.
Partner Site Heidelberg, German Cancer Consortium (DKTK), Heidelberg, Germany.
Helmholtz Information and Data Science School for Health, Karlsruhe/Heidelberg, Germany.
Heidelberg University, Heidelberg, Germany.

MeSH classifications