The Adverse Drug Reactions From Patient Reports in Social Media Project: Protocol for an Evaluation Against a Gold Standard.

Keywords: MedDRA; Racine Pharma; data mining; drug-related side effects and adverse reactions; natural language processing; social media

Journal

JMIR Research Protocols

ISSN: 1929-0748

Abbreviated title: JMIR Res Protoc

Country: Canada

NLM ID: 101599504

Publication information

Publication date:
07 May 2019

History:

received: 29 Jun 2018

revised: 16 Nov 2018

accepted: 21 Dec 2018

entrez: 09 May 2019

pubmed: 09 May 2019

medline: 09 May 2019

Status: epublish

Abstract

BACKGROUND

Social media is a potential source of information for postmarketing drug safety surveillance that remains unexploited. Information technology solutions that aim to extract adverse drug reactions (ADRs) from posts on health forums require a rigorous evaluation methodology if their results are to inform decisions. First, a gold standard, consisting of manual annotations of ADRs by human experts in a corpus extracted from social media, must be built and its quality assessed. Second, as in clinical research protocols, the sample size must rest on statistical arguments. Finally, the extraction methods must target the relation between the drug and the disease (which may be either treated or caused by the drug) rather than simple co-occurrence in the posts.

OBJECTIVE

We propose a standardized protocol for evaluating software that extracts ADRs from messages on health forums. The study is conducted as part of the Adverse Drug Reactions from Patient Reports in Social Media project.

METHODS

Messages from French health forums were extracted. Entity recognition was based on the Racine Pharma lexicon for drugs and the Medical Dictionary for Regulatory Activities (MedDRA) terminology for potential adverse events (AEs). Natural language processing-based techniques automated the ADR information extraction (the relation between the drug and AE entities). The evaluation corpus was a random sample of the messages containing drug and/or AE concepts corresponding to recent pharmacovigilance alerts. Two persons experienced in medical terminology manually annotated the corpus according to an annotation guideline, thus creating the gold standard. We will evaluate our tool against the gold standard with recall, precision, and F-measure. Interannotator agreement, reflecting gold standard quality, will be evaluated with hierarchical kappa. Granularities in the terminologies will be further explored.
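As an illustration of the evaluation measures named in the methods, the following minimal Python sketch computes precision, recall, and F-measure for extracted drug-AE relations against a gold standard, plus an interannotator agreement score. Several points are assumptions, not details from the protocol: relations are represented as hypothetical (message_id, drug, adverse_event) tuples, matching is exact, the F-measure is the balanced F1, and agreement is computed with plain unweighted Cohen's kappa as a simpler stand-in for the hierarchical kappa the protocol plans to use. All data are invented toy values.

```python
from collections import Counter


def precision_recall_f1(predicted, gold):
    """Score predicted relations against gold relations by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two annotators labelling the same items.

    Note: this is NOT the hierarchical kappa mentioned in the protocol.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: (message_id, drug, adverse_event) relations.
gold = {
    (1, "drugA", "nausea"),
    (2, "drugA", "headache"),
    (3, "drugB", "rash"),
    (4, "drugB", "dizziness"),
}
predicted = {
    (1, "drugA", "nausea"),
    (3, "drugB", "rash"),
    (5, "drugC", "insomnia"),
}
precision, recall, f1 = precision_recall_f1(predicted, gold)

# Hypothetical per-item labels from the two annotators.
annotator_1 = ["ADR", "ADR", "none", "none", "ADR", "none"]
annotator_2 = ["ADR", "none", "none", "none", "ADR", "none"]
kappa = cohen_kappa(annotator_1, annotator_2)

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"kappa={kappa:.3f}")
```

Scoring sets of relation tuples, rather than counting drug and AE mentions separately, reflects the point made in the background: the target is the relation between drug and event, not their co-occurrence in a post.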
RESULTS

The necessary and sufficient sample size was calculated to ensure statistical confidence in the assessed results. As we expected a global recall of 0.5, we needed at least 384 identified ADR concepts to obtain a 95% CI with a total width of 0.10 around 0.5. The automated ADR information extraction in the evaluation corpus is complete, and the 2 annotators have completed the annotation process. The analysis of the performance of the ADR information extraction module against the gold standard is ongoing.

CONCLUSIONS

This protocol applies standardized statistical methods from clinical research to the creation of the corpus, thus ensuring the necessary statistical power of the assessed results. Such an evaluation methodology is required to make ADR information extraction software useful for postmarketing drug safety surveillance.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)

RR1-10.2196/11448.
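The sample size quoted in the results (at least 384 ADR concepts for an expected recall of 0.5 and a 95% CI of total width 0.10) is consistent with the usual normal-approximation formula for a proportion, n = z² p(1 − p) / d², where d is the half-width of the interval. The abstract does not say which formula the authors used, so the sketch below is an assumption; the function name `required_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(expected_proportion, total_width, confidence=0.95):
    """Unrounded sample size for a normal-approximation (Wald) CI.

    total_width is the full width of the interval; the formula uses
    the half-width d = total_width / 2.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = total_width / 2
    p = expected_proportion
    return z ** 2 * p * (1 - p) / half_width ** 2


n_raw = required_sample_size(0.5, 0.10)
print(f"unrounded n = {n_raw:.2f}")
print(f"rounded down = {math.floor(n_raw)}, rounded up = {math.ceil(n_raw)}")
```

Under these assumptions the unrounded value is about 384.15, which matches the figure of 384 in the abstract when rounded down; rounding up, as is sometimes done to guarantee the target width, would give 385. An expected proportion of 0.5 maximizes p(1 − p), so it is also the most conservative choice for this formula.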


Identifiers

DOI: 10.2196/11448

PMID: 31066711

PMC: PMC6528435

pii: v8i5e11448

Publication types

Journal Article

Languages

eng

Pagination

e11448

Copyright information

©Armelle Arnoux-Guenegou, Yannick Girardeau, Xiaoyi Chen, Myrtille Deldossi, Rim Aboukhamis, Carole Faviez, Badisse Dahamna, Pierre Karapetiantz, Sylvie Guillemin-Lanne, Agnès Lillo-Le Louët, Nathalie Texier, Anita Burgun, Sandrine Katsahian. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 07.05.2019.


Authors

Armelle Arnoux-Guenegou

Yannick Girardeau

Xiaoyi Chen

Myrtille Deldossi

Rim Aboukhamis

Carole Faviez

Badisse Dahamna

Pierre Karapetiantz

Sylvie Guillemin-Lanne

Agnès Lillo-Le Louët

Nathalie Texier

Anita Burgun

Sandrine Katsahian

MeSH classifications