Validity of Online Screening for Autism: Crowdsourcing Study Comparing Paid and Unpaid Diagnostic Tasks.

autism; biomedical data science; citizen healthcare; crowdsourcing; diagnosis; diagnostics; digital health; human-computer interaction; mechanical turk; mobile health; neuropsychiatric conditions; pediatrics

Journal

Journal of medical Internet research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882

Publication information

Publication date:
2019-05-23
History:
received: 2019-02-09
accepted: 2019-04-16
revised: 2019-04-15
entrez: 2019-05-25
pubmed: 2019-05-28
medline: 2020-02-14
Status: epublish

Abstract

BACKGROUND
Obtaining a diagnosis of neuropsychiatric disorders such as autism requires long waiting times that can exceed a year and can be prohibitively expensive. Crowdsourcing approaches may provide a scalable alternative that can accelerate general access to care and permit underserved populations to obtain an accurate diagnosis.
OBJECTIVE
We aimed to perform a series of studies to explore whether paid crowd workers on Amazon Mechanical Turk (AMT) and citizen crowd workers on a public website shared on social media can provide accurate online detection of autism, conducted via crowdsourced ratings of short home video clips.
METHODS
Three online studies were performed: (1) a paid crowdsourcing task on AMT (N=54) where crowd workers were asked to classify 10 short video clips of children as "Autism" or "Not autism," (2) a more complex paid crowdsourcing task (N=27) with only those raters who correctly rated ≥8 of the 10 videos during the first study, and (3) a public unpaid study (N=115) identical to the first study.
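The scoring described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 5/5 split of clips, and the example answers are all hypothetical, since the abstract gives only the task format (10 clips, two labels) and the ≥8/10 cutoff for Study 2.

```python
def rater_metrics(truth, answers):
    """Score one rater. Both arguments are equal-length lists of booleans,
    where True means the clip was labelled "Autism"."""
    if len(truth) != len(answers):
        raise ValueError("truth and answers must have the same length")
    pairs = list(zip(truth, answers))
    tp = sum(1 for t, a in pairs if t and a)        # autism clips rated "Autism"
    fp = sum(1 for t, a in pairs if not t and a)    # non-autism clips rated "Autism"
    fn = sum(1 for t, a in pairs if t and not a)    # autism clips rated "Not autism"
    score = sum(1 for t, a in pairs if t == a)      # clips rated correctly
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return {"score": score, "sensitivity": sensitivity, "precision": precision}


def qualifies_for_study_2(score, threshold=8):
    """Study 2 was restricted to raters with at least 8 of 10 correct."""
    return score >= threshold


# Hypothetical data: the abstract does not state how many of the 10 clips
# showed children with autism; a 5/5 split is assumed here.
TRUTH = [True] * 5 + [False] * 5
ANSWERS = [True, True, True, True, False, False, False, False, False, True]

metrics = rater_metrics(TRUTH, ANSWERS)
print(metrics, qualifies_for_study_2(metrics["score"]))
```

In this made-up example the rater misses one autism clip and flags one non-autism clip, so 8 of 10 answers are correct and the rater would meet the Study 2 cutoff.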
RESULTS
For Study 1, the mean score of the participants who completed all questions was 7.50/10 (SD 1.46). When only analyzing the workers who scored ≥8/10 (n=27/54), there was a weak negative correlation between the time spent rating the videos and the sensitivity (ρ=-0.44, P=.02). For Study 2, the mean score of the participants rating new videos was 6.76/10 (SD 0.59). The average deviation between the crowdsourced answers and gold standard ratings provided by two expert clinical research coordinators was 0.56, with an SD of 0.51 (maximum possible SD is 3). All paid crowd workers who scored 8/10 in Study 1 either expressed enjoyment in performing the task in Study 2 or provided no negative comments. For Study 3, the mean score of the participants who completed all questions was 6.67/10 (SD 1.61). There were weak correlations between age and score (r=0.22, P=.014), age and sensitivity (r=-0.19, P=.04), number of family members with autism and sensitivity (r=-0.195, P=.04), and number of family members with autism and precision (r=-0.203, P=.03). A two-tailed t test between the scores of the paid workers in Study 1 and the unpaid workers in Study 3 showed a significant difference (P<.001).
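The ρ reported for rating time versus sensitivity is a rank correlation. The sketch below shows one common way to compute Spearman's ρ (Pearson correlation of average ranks) using only the standard library. The helper names and the sample values are hypothetical, and the abstract does not say which software or tie-handling rule the authors used.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: seconds spent rating and per-rater sensitivity.
seconds_spent = [120, 95, 240, 180, 300, 150]
sensitivity = [1.0, 0.8, 0.6, 0.8, 0.4, 1.0]
print(round(spearman(seconds_spent, sensitivity), 2))
```

A negative result means that raters who spent longer tended to have lower sensitivity, which is the direction reported for the ≥8/10 subgroup.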
CONCLUSIONS
Many paid crowd workers on AMT enjoyed answering screening questions from videos, suggesting higher intrinsic motivation to make quality assessments. Paid crowdsourcing provides promising screening assessments of pediatric autism with an average deviation <20% from professional gold standard raters, which is potentially a clinically informative estimate for parents. Parents of children with autism likely overfit their intuition to their own affected child. This work provides preliminary demographic data on raters who may have higher ability to recognize and measure features of autism across its wide range of phenotypic manifestations.
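The "<20%" figure is consistent with dividing the reported average deviation (0.56) by 3. A minimal sketch of that arithmetic follows, assuming the 3 quoted in the results is the largest possible deviation on the rating scale (the abstract labels it "maximum possible SD"). The names below are illustrative, not the authors'.

```python
MEAN_DEVIATION = 0.56  # average deviation from gold standard ratings (from the abstract)
MAX_DEVIATION = 3      # assumption: largest possible deviation on the rating scale


def relative_deviation(mean_dev, max_dev):
    """Express an average deviation as a fraction of the maximum possible one."""
    if max_dev <= 0:
        raise ValueError("max_dev must be positive")
    return mean_dev / max_dev


rel = relative_deviation(MEAN_DEVIATION, MAX_DEVIATION)
print(f"{rel:.1%}")  # prints 18.7%
```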

Identifiers

pubmed: 31124463
pii: v21i5e13668
doi: 10.2196/13668
pmc: PMC6552453

Publication types

Journal Article; Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

e13668

Grants

Agency: NIBIB NIH HHS
ID: R01 EB025025
Country: United States
Agency: NICHD NIH HHS
ID: R21 HD091500
Country: United States
Agency: NLM NIH HHS
ID: T15 LM007033
Country: United States
Agency: NLM NIH HHS
ID: T32 LM012409
Country: United States

Comments and corrections

Type: ErratumIn

Copyright information

©Peter Washington, Haik Kalantarian, Qandeel Tariq, Jessey Schwartz, Kaitlyn Dunlap, Brianna Chrisman, Maya Varma, Michael Ning, Aaron Kline, Nathaniel Stockham, Kelley Paskov, Catalin Voss, Nick Haber, Dennis Paul Wall. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 23.05.2019.

Authors

Peter Washington (P)

Department of Bioengineering, Stanford University, Stanford, CA, United States.

Haik Kalantarian (H)

Department of Biomedical Data Science, Stanford University, Stanford, CA, United States.

Qandeel Tariq (Q)

Department of Biomedical Data Science, Stanford University, Stanford, CA, United States.

Jessey Schwartz (J)

Department of Biomedical Data Science, Stanford University, Stanford, CA, United States.

Kaitlyn Dunlap (K)

Department of Biomedical Data Science, Stanford University, Stanford, CA, United States.

Brianna Chrisman (B)

Department of Bioengineering, Stanford University, Stanford, CA, United States.

Maya Varma (M)

Department of Computer Science, Stanford University, Stanford, CA, United States.

Michael Ning (M)

Department of Biomedical Data Science, Stanford University, Stanford, CA, United States.

Aaron Kline (A)

Department of Biomedical Data Science, Stanford University, Stanford, CA, United States.

Nathaniel Stockham (N)

Department of Neuroscience, Stanford University, Stanford, CA, United States.

Kelley Paskov (K)

Department of Biomedical Data Science, Stanford University, Stanford, CA, United States.

Catalin Voss (C)

Department of Computer Science, Stanford University, Stanford, CA, United States.

Nick Haber (N)

Department of Biomedical Data Science, Stanford University, Stanford, CA, United States.
Department of Pediatrics, Stanford University, Stanford, CA, United States.
Department of Psychology, Stanford University, Stanford, CA, United States.
Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, United States.

Dennis Paul Wall (DP)

Department of Pediatrics, Stanford University, Stanford, CA, United States.
Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, United States.
Division of Systems Medicine, Department of Biomedical Data Science, Stanford University, Palo Alto, CA, United States.
