Evaluating Amazon's Mechanical Turk as a Tool for Experimental Behavioral Research

TLDR

Amazon Mechanical Turk is an online crowdsourcing platform that attracts experimental psychologists for efficient data collection, yet its uncontrolled testing environment poses challenges for cognitive behavioral experiments requiring sustained attention, complex instructions, and millisecond response accuracy. This paper empirically evaluates the fidelity of AMT for cognitive behavioral experiments. We replicated a range of experimental psychology tasks—including Stroop, Switching, Flanker, Simon, Posner cuing, attentional blink, subliminal priming, and category learning—using AMT participants. Most task replications succeeded qualitatively, confirming that anonymous online data collection is viable, but several showed discrepancies with laboratory results, highlighting important lessons for researchers.

Abstract

Amazon Mechanical Turk (AMT) is an online crowdsourcing service where anonymous online workers complete web-based tasks for small sums of money. The service has attracted attention from experimental psychologists interested in gathering human subject data more efficiently. However, relative to traditional laboratory studies, many aspects of the testing environment are not under the experimenter's control. In this paper, we attempt to empirically evaluate the fidelity of the AMT system for use in cognitive behavioral experiments. These types of experiments differ from simple surveys in that they require multiple trials, sustained attention from participants, comprehension of complex instructions, and millisecond accuracy for response recording and stimulus presentation. We replicate a diverse body of tasks from experimental psychology, including the Stroop, Switching, Flanker, Simon, Posner Cuing, attentional blink, subliminal priming, and category learning tasks, with participants recruited through AMT. While most of the replications were qualitatively successful and validated the approach of collecting data anonymously online using a web browser, others revealed disparities between laboratory results and online results. A number of important lessons were learned in the process of conducting these replications that should be of value to other researchers.
