Accelerated Convergence for Counterfactual Learning to Rank

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2020

Pages 469 - 478

https://doi.org/10.1145/3397271.3401069

Published: 25 July 2020

Abstract

Counterfactual Learning To Rank (LTR) algorithms learn a ranking model from logged user interactions, often collected using a production system. Employing such an offline learning approach has many benefits compared to an online one, but it is challenging as user feedback often contains high levels of bias. Unbiased LTR uses Inverse Propensity Scoring (IPS) to enable unbiased learning from logged user interactions. One of the major difficulties in applying Stochastic Gradient Descent (SGD) approaches to counterfactual learning problems is the large variance introduced by the propensity weights. In this paper we show that the convergence rate of SGD approaches with IPS-weighted gradients suffers from the large variance introduced by the IPS weights: convergence is slow, especially when there are large IPS weights.

To overcome this limitation, we propose a novel learning algorithm, called CounterSample, that has provably better convergence than standard IPS-weighted gradient descent methods. We prove that CounterSample converges faster and complement our theoretical findings with empirical results by performing extensive experimentation in a number of biased LTR scenarios -- across optimizers, batch sizes, and different degrees of position bias.

Supplementary Material

MP4 File (3397271.3401069.mp4)

This presentation is about accelerating the convergence of counterfactual learning to rank (LTR). Counterfactual LTR relies on Inverse Propensity Scoring (IPS) to debias the learning process. However, IPS weights can introduce a large amount of variance, which in turn can slow down learning. To address this problem we make the following contributions in this paper: (a) we show that the convergence rate of IPS-weighted SGD scales poorly with the IPS weights, (b) we introduce CounterSample, a sample-based SGD method that has provably better convergence than IPS-weighted SGD, and (c) we empirically validate these theoretical findings with experiments in a number of settings -- across optimizers, batch sizes, and different severities of position bias.
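
The contrast between IPS-weighted SGD and a sample-based alternative can be illustrated with a small sketch. This is not the paper's reference implementation: the text above does not spell out the sampling scheme, so the sketch assumes that instances are drawn with probability proportional to their IPS weight and that the update is scaled by the mean weight. The function names and the toy data are hypothetical. Under these assumptions both estimators have the same expected gradient, while their second moments differ.

```python
import numpy as np


def ips_weighted_gradient(grads, weights, rng):
    """IPS-weighted SGD estimate: pick i uniformly, scale its gradient by w_i."""
    i = rng.integers(len(weights))
    return weights[i] * grads[i]


def weight_sampled_gradient(grads, weights, rng):
    """Assumed sample-based estimate: pick i with probability w_i / sum(w),
    then scale the unweighted gradient by the mean weight."""
    probs = weights / weights.sum()
    i = rng.choice(len(weights), p=probs)
    return weights.mean() * grads[i]


def estimator_moments(grads, weights):
    """Exact expected gradient and second moments of both estimators.

    Returns (expected_gradient, second_moment_ips, second_moment_sampled),
    where a second moment is E[||estimate||^2].
    """
    sq_norms = (grads ** 2).sum(axis=1)
    # Both estimators share this expectation: (1/n) * sum_i w_i * g_i.
    expected = (weights[:, None] * grads).mean(axis=0)
    # Uniform pick, gradient scaled by w_i.
    second_ips = np.mean(weights ** 2 * sq_norms)
    # Weight-proportional pick, gradient scaled by the mean weight.
    probs = weights / weights.sum()
    second_sampled = np.sum(probs * weights.mean() ** 2 * sq_norms)
    return expected, second_ips, second_sampled


# Toy data (hypothetical): four unit-norm gradients, one large IPS weight.
toy_grads = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
toy_weights = np.array([1.0, 1.0, 1.0, 10.0])
```

On this toy data, `estimator_moments(toy_grads, toy_weights)` gives a second moment of about 25.75 for the IPS-weighted estimate and about 10.56 for the weight-sampled one, with the same expected gradient. The gap is a property of this example (equal gradient norms, one large weight) and should not be read as a general guarantee.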


