DOI: 10.1145/3308558.3313452
Research article

Improving Treatment Effect Estimators Through Experiment Splitting

Published: 13 May 2019

Abstract

We present a method for implementing shrinkage of treatment effect estimators, and hence improving their precision, via experiment splitting. Experiment splitting reduces shrinkage to a standard prediction problem. The method makes minimal distributional assumptions, and allows for the degree of shrinkage in one metric to depend on other metrics. Using a dataset of 226 Facebook News Feed A/B tests, we show that a lasso estimator based on repeated experiment splitting has a 44% lower mean squared predictive error than the conventional, unshrunk treatment effect estimator, an 18% lower mean squared predictive error than the James-Stein shrinkage estimator, and would lead to substantially improved launch decisions over both.
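The abstract does not spell out the procedure, so the following is only a minimal sketch of how experiment splitting might turn shrinkage into a prediction problem. It assumes simulated data, a single metric, one random split per experiment, and ordinary least squares in place of the lasso with repeated splitting that the paper reports; every function name and parameter here is hypothetical.

```python
import numpy as np

def split_estimates(outcomes, treated, rng):
    """Randomly split one experiment's units into two halves and return
    the difference-in-means effect estimate from each half."""
    idx = rng.permutation(len(outcomes))
    half = len(idx) // 2
    estimates = []
    for part in (idx[:half], idx[half:]):
        y, t = outcomes[part], treated[part]
        estimates.append(y[t].mean() - y[~t].mean())
    return estimates[0], estimates[1]

def fit_shrinkage(first_half, second_half):
    """Regress second-half estimates on first-half estimates across
    experiments (plain OLS here, an assumption; not the paper's lasso)."""
    X = np.column_stack([np.ones_like(first_half), first_half])
    coef, *_ = np.linalg.lstsq(X, second_half, rcond=None)
    return coef[0], coef[1]  # intercept, slope

def simulate(n_experiments=500, n_units=400, seed=0):
    """Simulate many small A/B tests with normally distributed true effects."""
    rng = np.random.default_rng(seed)
    true_effects = rng.normal(0.0, 0.1, n_experiments)
    a = np.empty(n_experiments)
    b = np.empty(n_experiments)
    for i, tau in enumerate(true_effects):
        treated = rng.random(n_units) < 0.5          # random assignment
        outcomes = rng.normal(0.0, 1.0, n_units) + tau * treated
        a[i], b[i] = split_estimates(outcomes, treated, rng)
    intercept, slope = fit_shrinkage(a, b)
    return intercept, slope, a, b, true_effects

intercept, slope, a, b, true_effects = simulate()
shrunk = intercept + slope * a                        # shrunk first-half estimates
mse_raw = np.mean((a - true_effects) ** 2)
mse_shrunk = np.mean((shrunk - true_effects) ** 2)
print(f"slope={slope:.2f}  raw MSE={mse_raw:.4f}  shrunk MSE={mse_shrunk:.4f}")
```

Because the noise in the two halves is independent, the fitted slope acts as a data-driven shrinkage factor: the noisier the half-sample estimates are relative to the spread of true effects, the closer the slope is to zero. Extending the regressors to first-half estimates of several metrics is one way the shrinkage of one metric could depend on others, as the abstract describes.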

Published In

WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN:9781450366748
DOI:10.1145/3308558

In-Cooperation

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. A/B Tests
  2. Causal Inference
  3. Empirical Bayes Shrinkage
  4. Experiment Meta-Analysis
  5. Sample Splitting

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '19: The Web Conference
May 13 - 17, 2019
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
