Threats of Bots and Other Bad Actors to Data Quality Following Research Participant Recruitment Through Social Media: Cross-Sectional Questionnaire.

Cross-Sectional Studies; Data Accuracy; Female; Humans; Male; Patient Selection / ethics; Research Design; Social Media / statistics & numerical data; Surveys and Questionnaires

data accuracy; fraud; internet; methods; social media

Journal

Journal of Medical Internet Research

ISSN: 1438-8871

Abbreviated title: J Med Internet Res

Country: Canada

NLM ID: 100959882

Publication information

Publication date:
7 Oct 2020

History:

received: 30 Jul 2020

accepted: 16 Sep 2020

revised: 16 Sep 2020

entrez: 7 Oct 2020

pubmed: 8 Oct 2020

medline: 30 Jan 2021

Status: epublish

Abstract

BACKGROUND

Recruitment of health research participants through social media is becoming more common. In the United States, 80% of adults use at least one social media platform. Social media platforms may allow researchers to reach potential participants efficiently. However, online research methods may be associated with unique threats to sample validity and data integrity. Limited research has described issues of data quality and authenticity associated with the recruitment of health research participants through social media, and sources of low-quality and fraudulent data in this context are poorly understood.

OBJECTIVE

The goal of the research was to describe and explain threats to sample validity and data integrity following recruitment of health research participants through social media and summarize recommended strategies to mitigate these threats. Our experience designing and implementing a research study using social media recruitment and online data collection serves as a case study.

METHODS

Using published strategies to preserve data integrity, we recruited participants to complete an online survey through the social media platforms Twitter and Facebook. Participants were to receive $15 upon survey completion. Prior to manually issuing remuneration, we reviewed completed surveys for indicators of fraudulent or low-quality data. Indicators attributable to respondent error were labeled suspicious, while those suggesting misrepresentation were labeled fraudulent. We planned to remove cases with 1 fraudulent indicator or at least 3 suspicious indicators.

RESULTS

Within 7 hours of survey activation, we received 271 completed surveys. We classified 94.5% (256/271) of cases as fraudulent and 5.5% (15/271) as suspicious. In total, 86.7% (235/271) provided inconsistent responses to verifiable items and 16.2% (44/271) exhibited evidence of bot automation. Of the fraudulent cases, 53.9% (138/256) provided a duplicate or unusual response to one or more open-ended items and 52.0% (133/256) exhibited evidence of inattention.

CONCLUSIONS

Research findings from several disciplines suggest studies in which research participants are recruited through social media are susceptible to data quality issues. Opportunistic individuals who use virtual private servers to fraudulently complete research surveys for profit may contribute to low-quality data. Strategies to preserve data integrity following research participant recruitment through social media are limited. Development and testing of novel strategies to prevent and detect fraud is a research priority.
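The planned exclusion rule from the methods (remove a case with 1 fraudulent indicator or at least 3 suspicious indicators) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the `Case` record, the `should_exclude` function, and the indicator labels in the usage example are hypothetical, and "1 fraudulent indicator" is read here as "at least 1".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Case:
    """One completed survey and the indicators a reviewer flagged (hypothetical structure)."""
    case_id: str
    fraudulent: List[str] = field(default_factory=list)  # indicators suggesting misrepresentation
    suspicious: List[str] = field(default_factory=list)  # indicators attributable to respondent error


def should_exclude(case: Case, suspicious_threshold: int = 3) -> bool:
    """Return True if the case meets the planned removal rule.

    Assumption: a case is removed with at least 1 fraudulent indicator
    or at least `suspicious_threshold` suspicious indicators.
    """
    return len(case.fraudulent) >= 1 or len(case.suspicious) >= suspicious_threshold


# Usage with made-up indicator labels (not taken from the study)
cases = [
    Case("A", fraudulent=["inconsistent verifiable item"]),
    Case("B", suspicious=["short completion time", "skipped item"]),
    Case("C", suspicious=["short completion time", "skipped item", "odd free text"]),
]
excluded = [c.case_id for c in cases if should_exclude(c)]
print(excluded)
```

In this sketch, cases A and C meet the rule and case B does not, because two suspicious indicators fall below the threshold of three.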

Identifiers

DOI: 10.2196/23021

PMID: 33026360

PMC: PMC7578815

pii: v22i10e23021

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

e23021

Grants

Agency: NCI NIH HHS

ID: U54 CA156732

Country: United States

Copyright information

©Rachel Pozzar, Marilyn J Hammer, Meghan Underhill-Blazey, Alexi A Wright, James A Tulsky, Fangxin Hong, Daniel A Gundersen, Donna L Berry. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 07.10.2020.

Authors

Rachel Pozzar (R)

Marilyn J Hammer (MJ)

Meghan Underhill-Blazey (M)

Alexi A Wright (AA)

James A Tulsky (JA)

Fangxin Hong (F)

Daniel A Gundersen (DA)

Donna L Berry (DL)
