Threats of Bots and Other Bad Actors to Data Quality Following Research Participant Recruitment Through Social Media: Cross-Sectional Questionnaire.


Journal

Journal of Medical Internet Research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882

Publication information

Publication date:
2020-10-07
History:
received: 2020-07-30
accepted: 2020-09-16
revised: 2020-09-16
entrez: 2020-10-07
pubmed: 2020-10-08
medline: 2021-01-30
Status: epublish

Abstract

BACKGROUND
Recruitment of health research participants through social media is becoming more common. In the United States, 80% of adults use at least one social media platform. Social media platforms may allow researchers to reach potential participants efficiently. However, online research methods may be associated with unique threats to sample validity and data integrity. Limited research has described issues of data quality and authenticity associated with the recruitment of health research participants through social media, and sources of low-quality and fraudulent data in this context are poorly understood.
OBJECTIVE
The goal of the research was to describe and explain threats to sample validity and data integrity following recruitment of health research participants through social media and summarize recommended strategies to mitigate these threats. Our experience designing and implementing a research study using social media recruitment and online data collection serves as a case study.
METHODS
Using published strategies to preserve data integrity, we recruited participants to complete an online survey through the social media platforms Twitter and Facebook. Participants were to receive $15 upon survey completion. Prior to manually issuing remuneration, we reviewed completed surveys for indicators of fraudulent or low-quality data. Indicators attributable to respondent error were labeled suspicious, while those suggesting misrepresentation were labeled fraudulent. We planned to remove cases with 1 fraudulent indicator or at least 3 suspicious indicators.
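The removal rule stated above can be sketched in a few lines of Python. This is only an illustration, not the authors' screening procedure: the `SurveyCase` type, its field names, and the `should_remove` function are hypothetical, and the sketch assumes that "1 fraudulent indicator" means one or more.

```python
from dataclasses import dataclass


@dataclass
class SurveyCase:
    """Hypothetical record of indicator counts for one completed survey."""
    case_id: str
    fraudulent_indicators: int  # indicators suggesting misrepresentation
    suspicious_indicators: int  # indicators attributable to respondent error


def should_remove(case: SurveyCase) -> bool:
    """Apply the planned rule: remove on any fraudulent indicator
    (assumed to mean one or more) or on at least 3 suspicious indicators."""
    return case.fraudulent_indicators >= 1 or case.suspicious_indicators >= 3


if __name__ == "__main__":
    examples = [
        SurveyCase("A", fraudulent_indicators=1, suspicious_indicators=0),
        SurveyCase("B", fraudulent_indicators=0, suspicious_indicators=2),
        SurveyCase("C", fraudulent_indicators=0, suspicious_indicators=3),
    ]
    for case in examples:
        print(case.case_id, "remove" if should_remove(case) else "keep")
```

Under these assumptions, a case with two suspicious indicators and no fraudulent indicator is kept, while a single fraudulent indicator is enough for removal.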
RESULTS
Within 7 hours of survey activation, we received 271 completed surveys. We classified 94.5% (256/271) of cases as fraudulent and 5.5% (15/271) as suspicious. In total, 86.7% (235/271) provided inconsistent responses to verifiable items and 16.2% (44/271) exhibited evidence of bot automation. Of the fraudulent cases, 53.9% (138/256) provided a duplicate or unusual response to one or more open-ended items and 52.0% (133/256) exhibited evidence of inattention.
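As a quick arithmetic check, the reported percentages can be recomputed from the counts given above. The short Python sketch below uses only numbers taken from the Results; the helper name `pct` and the dictionary keys are illustrative labels, not terms from the study.

```python
def pct(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


TOTAL = 271       # completed surveys
FRAUDULENT = 256  # cases classified as fraudulent
SUSPICIOUS = 15   # cases classified as suspicious

# (count, denominator, percentage as reported in the Results)
reported = {
    "fraudulent": (FRAUDULENT, TOTAL, 94.5),
    "suspicious": (SUSPICIOUS, TOTAL, 5.5),
    "inconsistent verifiable items": (235, TOTAL, 86.7),
    "bot automation": (44, TOTAL, 16.2),
    "duplicate or unusual open-ended": (138, FRAUDULENT, 53.9),
    "inattention": (133, FRAUDULENT, 52.0),
}

# The two classes account for every completed survey.
classes_cover_total = FRAUDULENT + SUSPICIOUS == TOTAL

# Each recomputed percentage should match the reported one to one decimal.
mismatches = [
    label
    for label, (count, denom, stated) in reported.items()
    if abs(pct(count, denom) - stated) > 0.05
]

if __name__ == "__main__":
    print("classes cover total:", classes_cover_total)
    print("mismatches:", mismatches)
```

Note that the last two percentages use the 256 fraudulent cases as the denominator, not all 271 completed surveys.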
CONCLUSIONS
Research findings from several disciplines suggest studies in which research participants are recruited through social media are susceptible to data quality issues. Opportunistic individuals who use virtual private servers to fraudulently complete research surveys for profit may contribute to low-quality data. Strategies to preserve data integrity following research participant recruitment through social media are limited. Development and testing of novel strategies to prevent and detect fraud is a research priority.

Identifiers

pubmed: 33026360
pii: v22i10e23021
doi: 10.2196/23021
pmc: PMC7578815

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e23021

Grants

Agency: NCI NIH HHS
ID: U54 CA156732
Country: United States

Copyright information

©Rachel Pozzar, Marilyn J Hammer, Meghan Underhill-Blazey, Alexi A Wright, James A Tulsky, Fangxin Hong, Daniel A Gundersen, Donna L Berry. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 07.10.2020.

Authors

Rachel Pozzar (R)

Phyllis F Cantor Center for Research in Nursing and Patient Care Services, Dana-Farber Cancer Institute, Boston, MA, United States.

Marilyn J Hammer (MJ)

Phyllis F Cantor Center for Research in Nursing and Patient Care Services, Dana-Farber Cancer Institute, Boston, MA, United States.

Meghan Underhill-Blazey (M)

Phyllis F Cantor Center for Research in Nursing and Patient Care Services, Dana-Farber Cancer Institute, Boston, MA, United States.
School of Nursing, University of Rochester, Rochester, NY, United States.

Alexi A Wright (AA)

McGraw/Patterson Center for Population Sciences, Dana-Farber Cancer Institute, Boston, MA, United States.

James A Tulsky (JA)

Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, MA, United States.

Fangxin Hong (F)

Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, United States.

Daniel A Gundersen (DA)

Survey and Data Management Core, Dana-Farber Cancer Institute, Boston, MA, United States.

Donna L Berry (DL)

Phyllis F Cantor Center for Research in Nursing and Patient Care Services, Dana-Farber Cancer Institute, Boston, MA, United States.
Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle, WA, United States.
