The Adverse Drug Reactions From Patient Reports in Social Media Project: Protocol for an Evaluation Against a Gold Standard.
MedDRA
Racine Pharma
data mining
drug-related side effects and adverse reactions
natural language processing
social media
Journal
JMIR research protocols
ISSN: 1929-0748
Abbreviated title: JMIR Res Protoc
Country: Canada
NLM ID: 101599504
Publication information
Publication date:
07 May 2019
History:
received: 29 Jun 2018
accepted: 21 Dec 2018
revised: 16 Nov 2018
entrez: 9 May 2019
pubmed: 9 May 2019
medline: 9 May 2019
Status:
epublish
Abstract
BACKGROUND
Social media is a potential source of information for postmarketing drug safety surveillance that remains unexploited. Information technology solutions that aim to extract adverse drug reactions (ADRs) from posts on health forums require a rigorous evaluation methodology if their results are to be used to make decisions. First, a gold standard, consisting of manual annotations of ADRs by human experts in the corpus extracted from social media, must be built and its quality must be assessed. Second, as in clinical research protocols, the sample size must rely on statistical arguments. Finally, the extraction methods must target the relation between the drug and the disease (which might be either treated or caused by the drug) rather than simple co-occurrences in the posts.
OBJECTIVE
We propose a standardized protocol for the evaluation of software that extracts ADRs from messages on health forums. The study is conducted as part of the Adverse Drug Reactions from Patient Reports in Social Media project.
METHODS
Messages from French health forums were extracted. Entity recognition was based on the Racine Pharma lexicon for drugs and the Medical Dictionary for Regulatory Activities (MedDRA) terminology for potential adverse events (AEs). Natural language processing-based techniques automated the ADR information extraction (the relation between the drug and AE entities). The evaluation corpus was a random sample of the messages containing drugs and/or AE concepts corresponding to recent pharmacovigilance alerts. A total of 2 persons experienced in medical terminology manually annotated the corpus according to an annotation guideline, thus creating the gold standard. We will evaluate our tool against the gold standard with recall, precision, and F-measure. Interannotator agreement, reflecting gold standard quality, will be evaluated with hierarchical kappa. Granularities in the terminologies will be further explored.
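The evaluation metrics named above can be sketched as follows. This is a minimal illustration, not the project's implementation: the representation of a relation as a (message ID, drug, adverse event) triple, the function names, and the toy data are all assumptions made for the example. The agreement function computes plain Cohen's kappa for two annotators, which is a simpler stand-in for the hierarchical kappa mentioned in the abstract.

```python
from collections import Counter
from typing import List, Set, Tuple

# Hypothetical representation: one (message_id, drug, adverse_event) triple
# per drug-AE relation. The record does not specify the real data format.
Relation = Tuple[str, str, str]


def precision_recall_f1(predicted: Set[Relation],
                        gold: Set[Relation]) -> Tuple[float, float, float]:
    """Compare extracted relations with the gold standard by exact match."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def cohen_kappa(labels_a: List[str], labels_b: List[str]) -> float:
    """Plain Cohen's kappa for two annotators (not the hierarchical variant)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty items")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = sum(counts_a[label] * counts_b[label]
                   for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Toy data, invented for illustration only.
gold = {("m1", "drug_a", "nausea"), ("m2", "drug_b", "headache"),
        ("m3", "drug_a", "rash"), ("m4", "drug_c", "dizziness")}
predicted = {("m1", "drug_a", "nausea"), ("m2", "drug_b", "headache"),
             ("m5", "drug_c", "fatigue")}
precision, recall, f1 = precision_recall_f1(predicted, gold)

annotator_1 = ["ADR", "ADR", "none", "none", "ADR", "none"]
annotator_2 = ["ADR", "none", "none", "none", "ADR", "none"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Exact matching on triples is the strictest possible criterion; an evaluation that explores terminology granularity would presumably also need to credit matches at coarser levels of the terminology.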
RESULTS
The necessary sample size was calculated to ensure statistical confidence in the assessed results. As we expected a global recall of 0.5, we needed at least 384 identified ADR concepts to obtain a 95% CI with a total width of 0.10 around 0.5. Automated ADR information extraction on the evaluation corpus is complete, and the 2 annotators have finished the annotation process. The analysis of the performance of the ADR information extraction module against the gold standard is ongoing.
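The figure of 384 is consistent with the usual normal-approximation formula for estimating a proportion, n = z² p(1 − p) / d², with z = 1.96, p = 0.5, and half-width d = 0.05. The abstract does not state which formula was used, so the sketch below is an assumption, and the function names are illustrative.

```python
import math


def sample_size_for_proportion(p: float, half_width: float,
                               z: float = 1.96) -> float:
    """Normal-approximation sample size for a CI of the given half-width."""
    return z ** 2 * p * (1 - p) / half_width ** 2


def wald_interval(p: float, n: int, z: float = 1.96) -> tuple:
    """Normal-approximation (Wald) confidence interval for a proportion."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Expected recall of 0.5 and a total CI width of 0.10 (half-width 0.05).
required_n = sample_size_for_proportion(p=0.5, half_width=0.05)
lower, upper = wald_interval(p=0.5, n=384)
print(f"required n = {required_n:.2f}")
print(f"95% CI with n = 384: ({lower:.3f}, {upper:.3f})")
```

Choosing p = 0.5 maximizes p(1 − p), so the resulting sample size is conservative: any other recall value would give a narrower interval with the same number of concepts.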
CONCLUSIONS
This protocol applies standardized statistical methods from clinical research to build the evaluation corpus, thereby ensuring that the assessed results have the necessary statistical power. Such an evaluation methodology is required for ADR information extraction software to be useful for postmarketing drug safety surveillance.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)
RR1-10.2196/11448.
Identifiers
pubmed: 31066711
pii: v8i5e11448
doi: 10.2196/11448
pmc: PMC6528435
Publication types
Journal Article
Languages
eng
Pagination
e11448
Copyright information
©Armelle Arnoux-Guenegou, Yannick Girardeau, Xiaoyi Chen, Myrtille Deldossi, Rim Aboukhamis, Carole Faviez, Badisse Dahamna, Pierre Karapetiantz, Sylvie Guillemin-Lanne, Agnès Lillo-Le Louët, Nathalie Texier, Anita Burgun, Sandrine Katsahian. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 07.05.2019.