DOI: 10.1145/2684822.2697040
tutorial

Offline Evaluation and Optimization for Interactive Systems

Published: 02 February 2015

Abstract

Evaluating and optimizing an interactive system (such as a search engine, a recommender system, or an advertising system) from historical data against a predefined online metric is challenging, especially when that metric is computed from user feedback such as clicks and payments. The key challenge is counterfactual in nature: we only observe a user's feedback for actions taken by the system, but we do not know how that user would have reacted to a different action. The gold standard for evaluating such metrics of a user-interacting system is the online A/B experiment (a.k.a. randomized controlled experiment), which can be expensive in terms of both time and engineering resources. Offline evaluation/optimization (sometimes referred to as off-policy learning in the literature) thus becomes critical, aiming to evaluate the same metrics without running (many) expensive A/B experiments on live users. One approach to offline evaluation is to build a user model that simulates user behavior (clicks, purchases, etc.) under various contexts, and then evaluate metrics of a system with this simulator. While straightforward and common in practice, such model-based approaches are only as reliable as the user model they are built on. Furthermore, it is often difficult to know a priori whether a user model is good enough to be trusted.
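To make the model-based approach concrete, here is a minimal sketch in Python. It assumes a deliberately simple, context-free user model (one click rate per action); the function names `fit_user_model` and `model_based_value` and the log format are hypothetical and chosen only for illustration, not taken from the tutorial.

```python
def fit_user_model(logs, actions):
    """Fit a toy user model: the empirical click rate of each action.

    logs is a list of (action, reward) pairs recorded by the live system.
    Actions that were never shown get a predicted click rate of 0.0.
    """
    clicks = {a: 0.0 for a in actions}
    shows = {a: 0 for a in actions}
    for action, reward in logs:
        clicks[action] += reward
        shows[action] += 1
    return {a: (clicks[a] / shows[a] if shows[a] else 0.0) for a in actions}


def model_based_value(user_model, target_policy):
    """Expected reward of a policy according to the fitted user model.

    target_policy maps each action to the probability of choosing it.
    """
    return sum(prob * user_model[a] for a, prob in target_policy.items())


if __name__ == "__main__":
    # Hypothetical logged data: action "a" clicked once in two impressions,
    # action "b" clicked in both of its impressions.
    logs = [("a", 1.0), ("a", 0.0), ("b", 1.0), ("b", 1.0)]
    model = fit_user_model(logs, ["a", "b"])
    print(model_based_value(model, {"a": 0.25, "b": 0.75}))
```

Any error in the fitted model carries straight through to the estimate, which is the reliability concern raised in the abstract.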
Recent years have seen growing interest in another solution to the offline evaluation problem. Using statistical techniques such as importance sampling and doubly robust estimation, this approach can give unbiased estimates of metrics for a wide range of problems. It enjoys other benefits as well. For example, it often allows data scientists to obtain a confidence interval for the estimate to quantify the amount of uncertainty; it does not require building user models, so it is more robust and easier to apply. All these benefits make the approach particularly attractive for a wide range of problems. Successful applications have been reported in the last few years by some of the industrial leaders.
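The two techniques named above can be sketched as follows. This is an illustrative example under stated assumptions, not the tutorial's own code: the logging policy's action probabilities are assumed to be recorded with each interaction, the confidence interval uses a normal approximation, and the simulated click rates, policies, and all function names are hypothetical.

```python
import math
import random

# Hypothetical simulation setup, for illustration only.
ACTIONS = [0, 1, 2]
TRUE_CTR = {0: 0.05, 1: 0.10, 2: 0.20}   # unknown to the estimators
LOGGING = {0: 0.5, 1: 0.3, 2: 0.2}       # policy that collected the data
TARGET = {0: 0.1, 1: 0.2, 2: 0.7}        # policy we want to evaluate offline


def simulate_logs(n, seed=0):
    """Generate (context, action, reward, logging_prob) tuples."""
    rng = random.Random(seed)
    weights = [LOGGING[a] for a in ACTIONS]
    logs = []
    for _ in range(n):
        action = rng.choices(ACTIONS, weights=weights)[0]
        reward = 1.0 if rng.random() < TRUE_CTR[action] else 0.0
        logs.append((None, action, reward, LOGGING[action]))
    return logs


def target_prob(context, action):
    """Probability that the target policy picks `action` (context unused here)."""
    return TARGET[action]


def ips_estimate(logs, target_prob, z=1.96):
    """Importance sampling (inverse propensity scoring) estimate.

    Each logged reward is reweighted by target_prob / logging_prob.
    Returns the estimate and a normal-approximation confidence interval.
    """
    weighted = [target_prob(x, a) / p * r for (x, a, r, p) in logs]
    n = len(weighted)
    mean = sum(weighted) / n
    var = sum((w - mean) ** 2 for w in weighted) / (n - 1)
    half = z * math.sqrt(var / n)
    return mean, (mean - half, mean + half)


def dr_estimate(logs, target_prob, reward_model, actions):
    """Doubly robust estimate: model prediction plus a reweighted residual."""
    total = 0.0
    for x, a, r, p in logs:
        baseline = sum(target_prob(x, b) * reward_model(x, b) for b in actions)
        correction = target_prob(x, a) / p * (r - reward_model(x, a))
        total += baseline + correction
    return total / len(logs)


if __name__ == "__main__":
    logs = simulate_logs(50000, seed=1)
    print("IPS:", ips_estimate(logs, target_prob))
    # A deliberately crude reward model: predict 0.1 for every action.
    print("DR: ", dr_estimate(logs, target_prob, lambda x, a: 0.1, ACTIONS))
```

In this simulation the true value of the target policy is 0.1 × 0.05 + 0.2 × 0.10 + 0.7 × 0.20 = 0.165. Both estimators should land close to it, even though the reward model passed to the doubly robust estimator is crude, because the logging probabilities are known exactly.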
This tutorial gives a review of the basic theory and representative techniques. Applications of these techniques are illustrated through several case studies done at Microsoft and Yahoo!.

Cited By

  • (2016) Bid-aware Gradient Descent for Unbiased Learning with Censored Data in Display Advertising. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 665-674. DOI: 10.1145/2939672.2939713. Online publication date: 13-Aug-2016
  • Efficient Counterfactual Learning from Bandit Feedback. SSRN Electronic Journal. DOI: 10.2139/ssrn.3300346

      Published In

      WSDM '15: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining
      February 2015
      482 pages
      ISBN:9781450333177
      DOI:10.1145/2684822
      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. advertising
      2. contextual bandits
      3. counterfactual analysis
      4. information retrieval
      5. interactive systems
      6. offline evaluation
      7. recommender systems
      8. web search

      Qualifiers

      • Tutorial

      Conference

      WSDM 2015

      Acceptance Rates

      WSDM '15 paper acceptance rate: 39 of 238 submissions (16%)
      Overall acceptance rate: 498 of 2,863 submissions (17%)
