
Policy-Aware Unbiased Learning to Rank for Top-k Rankings

Authors: Harrie Oosterhuis, Maarten de Rijke

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 489 - 498

https://doi.org/10.1145/3397271.3401102

Published: 25 July 2020

Abstract

Counterfactual Learning to Rank (LTR) methods optimize ranking systems using logged user interactions that contain interaction biases. Existing methods are only unbiased if users are presented with all relevant items in every ranking. As a result, no existing counterfactual LTR method is unbiased for top-k rankings. We introduce a novel policy-aware counterfactual estimator for LTR metrics that can account for the effect of a stochastic logging policy. We prove that the policy-aware estimator is unbiased if every relevant item has a non-zero probability of appearing in the top-k ranking. Our experimental results show that the performance of our estimator is not affected by the size of k: for any k, the policy-aware estimator reaches the same retrieval performance while learning from top-k feedback as when learning from feedback on the full ranking. Lastly, we introduce novel extensions of traditional LTR methods to perform counterfactual LTR and to optimize top-k metrics. Together, our contributions introduce the first policy-aware unbiased LTR approach that learns from top-k feedback and optimizes top-k metrics. As a result, counterfactual LTR is now applicable to the very prevalent top-k ranking setting in search and recommendation.
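To make the unbiasedness condition in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a position-based examination model, a logging policy given as an explicit distribution over a handful of rankings, and hypothetical names (`policy_aware_propensities`, `ips_reward`, `position_bias`). Under those assumptions, an item's propensity is its examination probability averaged over the rankings the policy can show, with zero examination below the top-k cutoff.

```python
from typing import Dict, Iterable, Sequence, Tuple

Ranking = Tuple[str, ...]


def policy_aware_propensities(
    policy: Dict[Ranking, float],
    position_bias: Sequence[float],
    k: int,
) -> Dict[str, float]:
    """Expected examination probability of each item under a stochastic policy.

    Assumption: an item at 0-based rank r is examined with probability
    position_bias[r] when r < k and with probability 0 otherwise.
    """
    propensities: Dict[str, float] = {}
    for ranking, ranking_prob in policy.items():
        for rank, doc in enumerate(ranking):
            observe_prob = position_bias[rank] if rank < k else 0.0
            propensities[doc] = propensities.get(doc, 0.0) + ranking_prob * observe_prob
    return propensities


def ips_reward(
    clicked: Iterable[str],
    metric_weight: Dict[str, float],
    propensities: Dict[str, float],
) -> float:
    """Inverse-propensity-weighted sum of metric weights over clicked items."""
    total = 0.0
    for doc in clicked:
        if propensities[doc] <= 0.0:
            raise ValueError(f"item {doc!r} can never be observed under this policy")
        total += metric_weight[doc] / propensities[doc]
    return total


# Illustrative data (hypothetical): three items, top-2 display.
position_bias = [1.0, 0.5]
k = 2
stochastic_policy = {("a", "b", "c"): 0.5, ("c", "a", "b"): 0.5}
deterministic_policy = {("a", "b", "c"): 1.0}

stochastic_props = policy_aware_propensities(stochastic_policy, position_bias, k)
deterministic_props = policy_aware_propensities(deterministic_policy, position_bias, k)
```

With the deterministic policy, item `c` never enters the top 2, so its propensity is zero and no reweighting can recover feedback on it. The stochastic policy gives every item a non-zero propensity, which is the condition the abstract states for unbiasedness.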

Supplementary Material

MP4 File (3397271.3401102.mp4)

The pre-recorded presentation for the SIGIR '20 full paper "Policy-Aware Unbiased Learning to Rank for Top-k Rankings" by Harrie Oosterhuis and Maarten de Rijke.




Index Terms

Information systems → Information retrieval → Retrieval models and ranking → Learning to rank


Published In

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2020, 2548 pages

ISBN: 9781450380164

DOI: 10.1145/3397271

General Chairs: Jimmy Huang (York University, Canada), Yi Chang (Jilin University, China), Xueqi Cheng (Chinese Academy of Sciences, China)

Program Chairs: Jaap Kamps (University of Amsterdam, Netherlands), Vanessa Murdock (Amazon, U.S.A.), Ji-Rong Wen (Renmin University of China, China), Yiqun Liu (Tsinghua University, China)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States



Qualifiers

Research-article

Funding Sources

Nederlandse Organisatie voor Wetenschappelijk Onderzoek

Conference

SIGIR '20

Sponsor:

SIGIR

SIGIR '20: The 43rd International ACM SIGIR conference on research and development in Information Retrieval

July 25 - 30, 2020

Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%


