Quality control questions on Amazon's Mechanical Turk (MTurk): A randomized trial of impact on the USAUDIT, PHQ-9, and GAD-7

dc.contributor.author: Agley, Jon
dc.contributor.author: Xiao, Yunyu
dc.contributor.author: Nolan, Rachael
dc.contributor.author: Golzarri-Arroyo, Lilian
dc.contributor.department: School of Social Work (en_US)
dc.date.accessioned: 2021-11-12T19:44:33Z
dc.date.available: 2021-11-12T19:44:33Z
dc.date.issued: 2021-08-06
dc.description: This article is made available for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. (en_US)
dc.description.abstract: Crowdsourced psychological and other biobehavioral research using platforms like Amazon's Mechanical Turk (MTurk) is increasingly common - but has proliferated more rapidly than studies to establish data quality best practices. Thus, this study investigated whether outcome scores for three common screening tools would be significantly different among MTurk workers who were subject to different sets of quality control checks. We conducted a single-stage, randomized controlled trial with equal allocation to each of four study arms: Arm 1 (Control Arm), Arm 2 (Bot/VPN Check), Arm 3 (Truthfulness/Attention Check), and Arm 4 (Stringent Arm - All Checks). Data collection was completed in Qualtrics, to which participants were referred from MTurk. Subjects (n = 1100) were recruited on November 20-21, 2020. Eligible workers were required to claim U.S. residency, have a successful task completion rate > 95%, have completed a minimum of 100 tasks, and have completed a maximum of 10,000 tasks. Participants completed the US-Alcohol Use Disorders Identification Test (USAUDIT), the Patient Health Questionnaire (PHQ-9), and a screener for Generalized Anxiety Disorder (GAD-7). We found that differing quality control approaches significantly, meaningfully, and directionally affected outcome scores on each of the screening tools. Most notably, workers in Arm 1 (Control) reported higher scores than those in Arms 3 and 4 for all tools, and a higher score than workers in Arm 2 for the PHQ-9. These data suggest that the use, or lack thereof, of quality control questions in crowdsourced research may substantively affect findings, as might the types of quality control items. (en_US)
dc.description.sponsorship: This study was funded by the Office of the Vice Provost of Research at Indiana University Bloomington through the Grant-in-Aid program. (en_US)
dc.eprint.version: Final published version (en_US)
dc.identifier.citation: Agley, J., Xiao, Y., Nolan, R., & Golzarri-Arroyo, L. (2021). Quality control questions on Amazon’s Mechanical Turk (MTurk): A randomized trial of impact on the USAUDIT, PHQ-9, and GAD-7. Behavior Research Methods. https://doi.org/10.3758/s13428-021-01665-8 (en_US)
dc.identifier.issn: 1554-3528 (en_US)
dc.identifier.uri: https://hdl.handle.net/1805/26989
dc.language.iso: en_US (en_US)
dc.publisher: Springer (en_US)
dc.relation.isversionof: 10.3758/s13428-021-01665-8 (en_US)
dc.relation.journal: Behavior Research Methods (en_US)
dc.rights: Public Health Emergency (en_US)
dc.source: Publisher (en_US)
dc.subject: crowdsourced sampling (en_US)
dc.subject: data quality (en_US)
dc.subject: MTurk (en_US)
dc.subject: reproducibility (en_US)
dc.title: Quality control questions on Amazon's Mechanical Turk (MTurk): A randomized trial of impact on the USAUDIT, PHQ-9, and GAD-7 (en_US)
dc.type: Article (en_US)
Files
Original bundle
Name: Agley2021Quality-PHE.pdf
Size: 921.91 KB
Format: Adobe Portable Document Format
Description: Article
License bundle
Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon to submission