Abstract
Online labor markets have great potential as platforms for conducting experiments. They provide immediate access to a large and diverse subject pool, and allow researchers to control the experimental context. Online experiments, we show, can be just as valid—both internally and externally—as laboratory and field experiments, while often requiring far less money and time to design and conduct. To demonstrate their value, we use an online labor market to replicate three classic experiments. The first finds quantitative agreement between levels of cooperation in a prisoner’s dilemma played online and in the physical laboratory. The second shows—consistent with behavior in the traditional laboratory—that online subjects respond to priming by altering their choices. The third demonstrates that when an identical decision is framed differently, individuals reverse their choice, thus replicating a famed Tversky-Kahneman result. Then we conduct a field experiment showing that workers have upward-sloping labor supply curves. Finally, we analyze the challenges to online experiments, proposing methods to cope with the unique threats to validity in an online setting, and examining the conceptual issues surrounding the external validity of online results. We conclude by presenting our views on the potential role that online experiments can play within the social sciences, and then recommend software development priorities and best practices.
References
Andreoni, J. (1990). Impure altruism and donations to public goods: a theory of warm-glow giving. The Economic Journal, 100(401), 464–477.
Axelrod, R., & Hamilton, W. D. (1981). The evolution of cooperation. Science, 211(4489), 1390–1396.
Bainbridge, W. S. (2007). The scientific research potential of virtual worlds. Science, 317(5837), 472–476.
Benjamin, D. J., Choi, J. J., & Strickland, A. (2010a). Social identity and preferences. American Economic Review (forthcoming).
Benjamin, D. J., Choi, J. J., Strickland, A., & Fisher, G. (2010b). Religious identity and economic behavior. Cornell University, Mimeo.
Bohnet, I., Greig, F., Herrmann, B., & Zeckhauser, R. (2008). Betrayal aversion: evidence from Brazil, China, Oman, Switzerland, Turkey, and the United States. American Economic Review, 98(1), 294–310.
Brandts, J., & Charness, G. (2000). Hot vs. cold: sequential responses and preference stability in experimental games. Experimental Economics, 2(3), 227–238.
Camerer, C. (2003). Behavioral game theory: experiments in strategic interaction. Princeton: Princeton University Press.
Chandler, D., & Kapelner, A. (2010). Breaking monotony with meaning: motivation in crowdsourcing markets. University of Chicago, Mimeo.
Chen, D., & Horton, J. (2010). The wages of pay cuts: evidence from a field experiment. Harvard University, Mimeo.
Chilton, L. B., Sims, C. T., Goldman, M., Little, G., & Miller, R. C. (2009). Seaweed: a web application for designing economic games. In Proceedings of the ACM SIGKDD workshop on human computation (pp. 34–35). New York: ACM Press.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation. Boston: Houghton Mifflin.
Eckel, C. C., & Wilson, R. K. (2006). Internet cautions: experimental games with Internet partners. Experimental Economics, 9(1), 53–66.
Falk, A., & Heckman, J. J. (2009). Lab experiments are a major source of knowledge in the social sciences. Science, 326(5952), 535–538.
Fehr, E., & Schmidt, K. M. (1999). A theory of fairness, competition, and cooperation. Quarterly Journal of Economics, 114(3), 817–868.
Fehr, E., & Gächter, S. (2000). Cooperation and punishment in public goods experiments. American Economic Review, 90(4), 980–994.
Fischbacher, U. (2007). z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10(2), 171–178.
Frei, B. (2009). Paid crowdsourcing: current state & progress toward mainstream business use. Produced by Smartsheet.com.
Gneezy, U., Leonard, K. L., & List, J. A. (2009). Gender differences in competition: evidence from a matrilineal and a patriarchal society. Econometrica, 77(5), 1637–1664.
Harrison, G. W., & List, J. A. (2004). Field experiments. Journal of Economic Literature, 42(4), 1009–1055.
Herrmann, B., & Thöni, C. (2009). Measuring conditional cooperation: a replication study in Russia. Experimental Economics, 12(1), 87–92.
Horton, J. (2010). Online labor markets. In Workshop on Internet and network economics (pp. 515–522).
Horton, J. (2011). The condition of the Turking class: are online employers fair and honest? Economics Letters (forthcoming).
Horton, J. & Chilton, L. (2010). The labor economics of paid crowdsourcing. In Proceedings of the 11th ACM conference on electronic commerce.
Ipeirotis, P. (2010). Demographics of Mechanical Turk. New York University Working Paper.
Kagel, J. H., & Roth, A. E. (Eds.) (1995). The handbook of experimental economics. Princeton: Princeton University Press.
Kittur, A., Chi, E. H., & Suh, B. (2008). Crowdsourcing user studies with Mechanical Turk. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 453–456). New York: ACM Press.
Kocher, M. G., & Sutter, M. (2005). The decision maker matters: individual versus group behaviour in experimental beauty-contest games. The Economic Journal, 115(500), 200–223.
Levitt, S. D., & List, J. A. (2009). Field experiments in economics: the past, the present, and the future. European Economic Review, 53(1), 1–18.
Little, G., Chilton, L. B., Goldman, M., & Miller, R. C. (2009). TurKit: tools for iterative tasks on Mechanical Turk. In Proceedings of the ACM SIGKDD workshop on human computation. New York: ACM Press.
Lucking-Reiley, D. (2000). Auctions on the Internet: what’s being auctioned, and how? The Journal of Industrial Economics, 48(3), 227–252.
Mason, W., & Watts, D. J. (2009). Financial incentives and the performance of crowds. In Proceedings of the ACM SIGKDD workshop on human computation (pp. 77–85). New York: ACM Press.
Mason, W., Watts, D. J., & Suri, S. (2010). Conducting behavioral research on Amazon’s Mechanical Turk. SSRN eLibrary.
Pallais, A. (2010). Inefficient hiring in entry-level labor markets.
Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5(5), 411–419.
Resnick, P., Kuwabara, K., Zeckhauser, R., & Friedman, E. (2000). Reputation systems. Communications of the ACM, 43(12), 45–48.
Resnick, P., Zeckhauser, R., Swanson, J., & Lockwood, K. (2006). The value of reputation on eBay: a controlled experiment. Experimental Economics, 9(2), 79–101.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.
Selten, R. (1967). Die Strategiemethode zur Erforschung des eingeschränkt rationalen Verhaltens im Rahmen eines Oligopolexperiments [The strategy method for studying boundedly rational behavior in an oligopoly experiment]. Beiträge zur experimentellen Wirtschaftsforschung, 1, 136–168.
Shariff, A. F., & Norenzayan, A. (2007). God is watching you: priming God concepts increases prosocial behavior in an anonymous economic game. Psychological Science, 18(9), 803–809.
Sheng, V. S., Provost, F., & Ipeirotis, P. G. (2008). Get another label? Improving data quality and data mining using multiple, noisy labelers. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 614–622). New York: ACM Press.
Sorokin, A., & Forsyth, D. (2008). Utility data annotation with Amazon Mechanical Turk. University of Illinois at Urbana-Champaign, Mimeo.
Suri, S., & Watts, D. J. (2011). A study of cooperation and contagion in web-based, networked public goods experiments. PLoS ONE (forthcoming).
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211(4481), 453–458.
von Ahn, L., Blum, M., Hopper, N. J., & Langford, J. (2003). CAPTCHA: using hard AI problems for security. In Lecture notes in computer science (pp. 294–311). Berlin: Springer.
Acknowledgments
Thanks to Alex Breinin and Xiaoqi Zhu for excellent research assistance. Thanks to Samuel Arbesman, Dana Chandler, Anna Dreber, Rezwan Haque, Justin Keenan, Robin Yerkes Horton, Stephanie Hurder and Michael Manapat for helpful comments, as well as to participants in the Online Experimentation Workshop hosted by Harvard’s Berkman Center for Internet and Society. Thanks to Anna Dreber, Elizabeth Paci and Yochai Benkler for assistance running the physical laboratory replication study, and to Sarah Hirschfeld-Sussman and Mark Edington for their help with surveying the Harvard Decision Science Laboratory subject pool. This research has been supported by the NSF-IGERT program “Multidisciplinary Program in Inequality and Social Policy” at Harvard University (Grant No. 0333403), and DGR gratefully acknowledges financial support from the John Templeton Foundation’s Foundational Questions in Evolutionary Biology Prize Fellowship.
Cite this article
Horton, J.J., Rand, D.G. & Zeckhauser, R.J. The online laboratory: conducting experiments in a real labor market. Exp Econ 14, 399–425 (2011). https://doi.org/10.1007/s10683-011-9273-9