research-article

Designing and deploying online field experiments

Authors:

Michael S. Bernstein

WWW '14: Proceedings of the 23rd international conference on World wide web

Pages 283 - 292

https://doi.org/10.1145/2566486.2567967

Published: 07 April 2014

Abstract

Online experiments are widely used to compare specific design alternatives, but they can also be used to produce generalizable knowledge and inform strategic decision making. Doing so often requires sophisticated experimental designs, iterative refinement, and careful logging and analysis. Few tools exist that support these needs. We thus introduce a language for online field experiments called PlanOut. PlanOut separates experimental design from application code, allowing the experimenter to concisely describe experimental designs, whether common "A/B tests" and factorial designs, or more complex designs involving conditional logic or multiple experimental units. These latter designs are often useful for understanding causal mechanisms involved in user behaviors. We demonstrate how experiments from the literature can be implemented in PlanOut, and describe two large field experiments conducted on Facebook with PlanOut. For common scenarios in which experiments are run iteratively and in parallel, we introduce a namespaced management system that encourages sound experimental practice.
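To make the separation between experimental design and application code concrete, the following is a minimal sketch and not PlanOut's actual syntax or implementation. It assumes deterministic assignment by salted hashing of the unit identifier, a common technique for online experiments. The names `uniform_choice`, `button_experiment`, and `render_button`, the salts, and the button parameters are all hypothetical and chosen only for illustration.

```python
import hashlib


def uniform_choice(choices, unit, salt):
    """Deterministically pick one of `choices` for `unit` via a salted hash.

    Assumption: hashing stands in for whatever assignment mechanism a real
    experimentation system uses; this is an illustrative scheme only.
    """
    digest = hashlib.sha1(f"{salt}.{unit}".encode("utf-8")).hexdigest()
    return choices[int(digest[:15], 16) % len(choices)]


def button_experiment(userid):
    """Hypothetical 2x2 factorial design: button color crossed with button text.

    Each parameter uses its own salt, so the two factors are assigned
    independently of one another.
    """
    return {
        "button_color": uniform_choice(
            ["blue", "green"], userid, "button_exp.button_color"
        ),
        "button_text": uniform_choice(
            ["Join", "Sign up"], userid, "button_exp.button_text"
        ),
    }


def render_button(userid):
    """Application code: reads parameters without knowing how they were assigned."""
    params = button_experiment(userid)
    return (
        f"<button style='color:{params['button_color']}'>"
        f"{params['button_text']}</button>"
    )
```

In this sketch, changing the design (for example, adding a third color) touches only `button_experiment`; `render_button` is unchanged. Because assignment depends only on the salt and the user identifier, the same user receives the same condition on every request.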



Index Terms

Designing and deploying online field experiments
1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing design and evaluation methods


Published In

WWW '14: Proceedings of the 23rd international conference on World wide web

April 2014

926 pages

ISBN:9781450327442

DOI:10.1145/2566486

General Chair:
Chin-Wan Chung, Korea Advanced Institute of Science and Technology, Korea

Program Chairs:
Andrei Broder, Google Inc., USA
Kyuseok Shim, Seoul National University, Korea
Torsten Suel, New York University, USA

Copyright © 2014 Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Sponsors

IW3C2: International World Wide Web Conference Committee

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

WWW '14

WWW '14: 23rd International World Wide Web Conference

April 7 - 11, 2014

Seoul, Korea

Acceptance Rates

WWW '14 Paper Acceptance Rate 84 of 645 submissions, 13%

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
