Developing Crowdsourced Training Data Sets for Pharmacovigilance Intelligent Automation.
Journal
Drug safety
ISSN: 1179-1942
Abbreviated title: Drug Saf
Country: New Zealand
NLM ID: 9002928
Publication information
Publication date:
03 2021
History:
accepted: 2020-11-23
pubmed: 2020-12-24
medline: 2022-04-21
entrez: 2020-12-23
Status:
ppublish
Abstract
Machine learning offers an alluring solution to developing automated approaches to the increasing individual case safety report burden being placed upon pharmacovigilance. Leveraging crowdsourcing to annotate unstructured data may provide accurate, efficient, and contemporaneous training data sets in support of machine learning. The objective of this study was to evaluate whether crowdsourcing can be used to accurately and efficiently develop training data sets in support of pharmacovigilance automation. Pharmacovigilance experts created a reference dataset by reviewing 15,490 de-identified social media posts of narratives pertaining to 15 drugs and 22 medically relevant topics. A random sample of posts from the reference dataset was published on Amazon Mechanical Turk, and its users (Turkers) were asked a series of questions about those same medical concepts. Accuracy, price elasticity, and time efficiency were evaluated. Accuracy of crowdsourced curation exceeded 90% when compared to the reference dataset, and curation was completed in about 5% of the time. Time efficiency increased with higher pay, but there was no significant difference in accuracy. Additionally, having a social media post reviewed by more than one Turker (using a voting system) did not offer significant improvements in accuracy. Crowdsourcing is an accurate and efficient method that can be used to develop training data sets in support of pharmacovigilance automation. More research is needed to better understand the breadth and depth of possible uses as well as the strengths, limitations, and generalizability of results.
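The evaluation described in the abstract (comparing Turker answers with an expert reference dataset, with and without a voting system) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the "yes"/"no" labels, the simple majority rule, and the four example posts are all hypothetical and chosen only to show the idea.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label; ties resolve to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted, reference):
    """Fraction of posts whose predicted label matches the reference label."""
    matches = sum(predicted[post_id] == reference[post_id] for post_id in reference)
    return matches / len(reference)


# Hypothetical expert reference labels for one question about each post.
reference = {"post1": "yes", "post2": "no", "post3": "yes", "post4": "no"}

# Hypothetical answers from three Turkers per post.
crowd = {
    "post1": ["yes", "yes", "no"],
    "post2": ["no", "no", "no"],
    "post3": ["no", "yes", "no"],
    "post4": ["no", "yes", "no"],
}

# Single-reviewer labels (first Turker only) versus majority-vote labels.
single = {post_id: votes[0] for post_id, votes in crowd.items()}
voted = {post_id: majority_vote(votes) for post_id, votes in crowd.items()}

single_accuracy = accuracy(single, reference)
voted_accuracy = accuracy(voted, reference)
print(f"single reviewer: {single_accuracy:.2f}, majority vote: {voted_accuracy:.2f}")
```

In this made-up data both approaches score the same, which mirrors the kind of comparison the abstract reports; real results depend entirely on the actual annotations.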
Identifiers
pubmed: 33354751
doi: 10.1007/s40264-020-01028-w
pii: 10.1007/s40264-020-01028-w
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
373-382