Development of a Pipeline for Adverse Drug Reaction Identification in Clinical Notes: Word Embedding Models and String Matching.

adverse drug reactions; clinical notes; word embeddings

Journal

JMIR medical informatics
ISSN: 2291-9694
Abbreviated title: JMIR Med Inform
Country: Canada
NLM ID: 101645109

Publication information

Publication date:
25 Jan 2022
History:
received: 10 Jun 2021
revised: 02 Nov 2021
accepted: 14 Nov 2021
entrez: 25 Jan 2022
pubmed: 26 Jan 2022
medline: 26 Jan 2022
Status: epublish

Abstract sections

BACKGROUND
Knowledge about adverse drug reactions (ADRs) in the population is limited because of underreporting, which hampers surveillance and assessment of drug safety. Therefore, gathering accurate information about the incidence of ADRs from clinical notes is of great relevance. However, manual labeling of these notes is time-consuming, and automation can improve the use of free-text clinical notes for the identification of ADRs. Furthermore, tools for language processing in languages other than English are not widely available.
OBJECTIVE
The aim of this study is to design and evaluate a method for automatic extraction of medication and Adverse Drug Reaction Identification in Clinical Notes (ADRIN).
METHODS
Dutch free-text clinical notes (N=277,398) and medication registrations (N=499,435) from the Cardiology Centers of the Netherlands database were used. All clinical notes were used to develop word embedding models. Vector representations of word embedding models and string matching with a medical dictionary (Medical Dictionary for Regulatory Activities [MedDRA]) were used for identification of ADRs and medication in a test set of clinical notes that were manually labeled. Several settings, including search area and punctuation, could be adjusted in the prototype to identify its best-performing version.
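The combination of string matching, embedding-based term expansion, and an adjustable search area can be illustrated with a minimal sketch. It is not the ADRIN implementation: the term lists, the two-dimensional toy vectors, the similarity threshold, the function names, and the choice to cut the search area at sentence-ending periods when punctuation is kept are all assumptions made for illustration only.

```python
import math
import re

# Hypothetical toy data, not taken from the study or from MedDRA.
ADR_TERMS = {"cough", "dizziness"}
MEDICATION_TERMS = {"enalapril", "metoprolol"}
# Toy 2-dimensional vectors; a real model would be trained on the notes.
EMBEDDINGS = {
    "cough": (1.0, 0.1),
    "coughing": (0.9, 0.2),
    "dizziness": (0.1, 1.0),
    "dizzy": (0.2, 0.9),
    "stopped": (-0.7, -0.7),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_terms(seeds, embeddings, threshold=0.95):
    """Add words whose vector lies close to the vector of any seed term."""
    expanded = set(seeds)
    for word, vec in embeddings.items():
        for seed in seeds:
            if seed in embeddings and cosine(vec, embeddings[seed]) >= threshold:
                expanded.add(word)
    return expanded


def tokenize(text, keep_punctuation=True):
    """Lowercase and split into word tokens, optionally keeping punctuation."""
    pattern = r"\w+|[^\w\s]" if keep_punctuation else r"\w+"
    return re.findall(pattern, text.lower())


def find_adr_candidates(note, window=5, keep_punctuation=True):
    """Return (medication, ADR term) pairs found near each medication mention."""
    tokens = tokenize(note, keep_punctuation)
    adr_vocab = expand_terms(ADR_TERMS, EMBEDDINGS)
    hits = []
    for i, tok in enumerate(tokens):
        if tok not in MEDICATION_TERMS:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        if keep_punctuation:
            # Assumption: periods bound the search area to one sentence.
            while lo < i and "." in tokens[lo:i]:
                lo = lo + tokens[lo:i].index(".") + 1
            if "." in tokens[i:hi]:
                hi = i + tokens[i:hi].index(".")
        for candidate in tokens[lo:hi]:
            if candidate in adr_vocab:
                hits.append((tok, candidate))
    return hits


note = "Patient is dizzy. Enalapril stopped because of coughing."
print(find_adr_candidates(note, keep_punctuation=True))
print(find_adr_candidates(note, keep_punctuation=False))
```

In this toy note, keeping punctuation restricts the match to the sentence that mentions the drug, whereas dropping it also pairs the drug with a symptom from the preceding sentence. This shows one way a punctuation setting could narrow the search area; the abstract does not state how the prototype uses punctuation.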
RESULTS
The ADRIN method was evaluated using a test set of 988 clinical notes written on the stop date of a drug. Multiple versions of the prototype were evaluated for a variety of tasks. Binary classification of ADR presence achieved the highest accuracy of 0.84. Reduced search area and inclusion of punctuation improved performance, whereas incorporation of the MedDRA did not improve the performance of the pipeline.
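For reference, accuracy in a binary classification task is the share of notes whose predicted label (ADR present or absent) matches the manual label. The sketch below uses made-up labels, not data from the study, and the function name is hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the manually assigned label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up example: 1 = ADR present, 0 = ADR absent.
manual_labels = [1, 0, 1, 1, 0]
predicted_labels = [1, 0, 0, 1, 0]
print(accuracy(manual_labels, predicted_labels))
```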
CONCLUSIONS
The ADRIN method and prototype are effective in recognizing ADRs in Dutch clinical notes from cardiac diagnostic screening centers. Surprisingly, incorporation of the MedDRA did not result in improved identification on top of word embedding models. The implementation of the ADRIN tool may help increase the identification of ADRs, resulting in better care and saving substantial health care costs.

Identifiers

pubmed: 35076407
pii: v10i1e31063
doi: 10.2196/31063
pmc: PMC8826143

Publication types

Journal Article

Languages

eng

Pagination

e31063

Copyright information

©Klaske R Siegersma, Maxime Evers, Sophie H Bots, Floor Groepenhoff, Yolande Appelman, Leonard Hofstra, Igor I Tulevski, G Aernout Somsen, Hester M den Ruijter, Marco Spruit, N Charlotte Onland-Moret. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 25.01.2022.

Authors

Klaske R Siegersma (KR)

Laboratory of Experimental Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.
Department of Cardiology, Amsterdam University Medical Centers, VU University Medical Center, Amsterdam, Netherlands.

Maxime Evers (M)

Laboratory of Experimental Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.

Sophie H Bots (SH)

Laboratory of Experimental Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.

Floor Groepenhoff (F)

Laboratory of Experimental Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.
Central Diagnostic Laboratory, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.

Yolande Appelman (Y)

Department of Cardiology, Amsterdam University Medical Centers, VU University Medical Center, Amsterdam, Netherlands.

Leonard Hofstra (L)

Department of Cardiology, Amsterdam University Medical Centers, VU University Medical Center, Amsterdam, Netherlands.
Cardiology Centers of the Netherlands, Utrecht, Netherlands.

Igor I Tulevski (II)

Cardiology Centers of the Netherlands, Utrecht, Netherlands.

G Aernout Somsen (GA)

Cardiology Centers of the Netherlands, Utrecht, Netherlands.

Hester M den Ruijter (HM)

Laboratory of Experimental Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.

Marco Spruit (M)

Department of Public Health and Primary Care, Leiden University Medical Center, Leiden University, Leiden, Netherlands.
Leiden Institute of Advanced Computer Science, Leiden University, Leiden, Netherlands.

N Charlotte Onland-Moret (NC)

Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.

MeSH classifications