Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3397271.3401102acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
research-article

Policy-Aware Unbiased Learning to Rank for Top-k Rankings

Published: 25 July 2020 Publication History

Abstract

Counterfactual Learning to Rank (LTR) methods optimize ranking systems using logged user interactions that contain interaction biases. Existing methods are only unbiased if users are presented with all relevant items in every ranking. There is currently no existing counterfactual unbiased LTR method for top-k rankings. We introduce a novel policy-aware counterfactual estimator for LTR metrics that can account for the effect of a stochastic logging policy. We prove that the policy-aware estimator is unbiased if every relevant item has a non-zero probability to appear in the top-k ranking. Our experimental results show that the performance of our estimator is not affected by the size of k: for any k, the policy-aware estimator reaches the same retrieval performance while learning from top-k feedback as when learning from feedback on the full ranking. Lastly, we introduce novel extensions of traditional LTR methods to perform counterfactual LTR and to optimize top-k metrics. Together, our contributions introduce the first policy-aware unbiased LTR approach that learns from top-k feedback and optimizes top-k metrics. As a result, counterfactual LTR is now applicable to the very prevalent top-k ranking setting in search and recommendation.

Supplementary Material

MP4 File (3397271.3401102.mp4)
The pre-recorded presentation for the SIGIR'20 full paper:\r\n\r\nPolicy-Aware Unbiased Learning to Rank for Top-k Rankings\r\nHarrie Oosterhuis and Maarten de Rijke

References

[1]
Aman Agarwal, Kenta Takatsu, Ivan Zaitsev, and Thorsten Joachims. 2019 a. A General Framework for Counterfactual Learning-to-Rank. In Proceedings of the 42nd International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM, 5--14.
[2]
Aman Agarwal, Xuanhui Wang, Cheng Li, Michael Bendersky, and Marc Najork. 2019 b. Addressing Trust Bias for Unbiased Learning-to-Rank. In The World Wide Web Conference. ACM, 4--14.
[3]
Aman Agarwal, Ivan Zaitsev, Xuanhui Wang, Cheng Li, Marc Najork, and Thorsten Joachims. 2019 c. Estimating Position Bias without Intrusive Interventions. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. ACM, 474--482.
[4]
Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, and W Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of the 41st International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 385--394.
[5]
Wolf-Tilo Balke, Ulrich Güntzer, and Werner Kießling. 2002. On Real-Time Top k Querying for Mobile Services. In OTM Confederated International Conferences" On the Move to Meaningful Internet Systems". Springer, 125--143.
[6]
Christopher J.C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An Overview. Technical Report MSR-TR-2010--82. Microsoft.
[7]
Fei Cai and Maarten de Rijke. 2016. A Survey of Query Auto Completion in Information Retrieval. Foundations and Trends in Information Retrieval, Vol. 10, 4 (2016), 273--363.
[8]
Ben Carterette and Praveen Chandar. 2018. Offline Comparative Evaluation with Incremental, Minimally-Invasive Online Feedback. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM, 705--714.
[9]
Olivier Chapelle and Yi Chang. 2011. Yahoo! Learning to Rank Challenge Overview. Journal of Machine Learning Research, Vol. 14 (2011), 1--24.
[10]
Paolo Cremonesi, Yehuda Koren, and Roberto Turrin. 2010. Performance of Recommender Algorithms on Top-n Recommendation Tasks. In Proceedings of the fourth ACM conference on Recommender systems. ACM, 39--46.
[11]
Arthur P Dempster, Nan M Laird, and Donald B Rubin. 1977. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society: Series B (Methodological), Vol. 39, 1 (1977), 1--22.
[12]
Katja Hofmann, Anne Schuth, Shimon Whiteson, and Maarten de Rijke. 2013. Reusing Historical Interaction Data for Faster Online Learning to Rank for IR. In Proceedings of the sixth ACM international conference on Web search and data mining. ACM, 183--192.
[13]
Ziniu Hu, Yang Wang, Qu Peng, and Hang Li. 2019. Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm. In The World Wide Web Conference. ACM, 2830--2836.
[14]
Neil Hurley and Mi Zhang. 2011. Novelty and Diversity in Top-n Recommendation--Analysis and Evaluation. ACM Transactions on Internet Technology (TOIT), Vol. 10, 4 (2011), 14.
[15]
Rolf Jagerman, Harrie Oosterhuis, and Maarten de Rijke. 2019. To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions. In Proceedings of the 42nd International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM, 15--24.
[16]
Thorsten Joachims. 2002. Optimizing Search Engines Using Clickthrough Data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 133--142.
[17]
Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately Interpreting Clickthrough Data as Implicit Feedback. In SIGIR. ACM, 154--161.
[18]
Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. ACM, 781--789.
[19]
Junpei Komiyama, Junya Honda, and Hiroshi Nakagawa. 2015. Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays. In Proceedings of the 32Nd International Conference on International Conference on Machine Learning - Volume 37 (ICML'15). JMLR.org, 1152--1161.
[20]
Paul Lagrée, Claire Vernade, and Olivier Cappé. 2016. Multiple-play Bandits in the Position-based Model. In Advances in Neural Information Processing Systems. 1597--1605.
[21]
Shuai Li, Yasin Abbasi-Yadkori, Branislav Kveton, S. Muthukrishnan, Vishwa Vinay, and Zheng Wen. 2018. Offline Evaluation of Ranking Policies with Click Models. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 1685--1694.
[22]
Qiang Liu, Lihong Li, Ziyang Tang, and Dengyong Zhou. 2018. Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation. In Advances in Neural Information Processing Systems. 5356--5366.
[23]
Tie-Yan Liu. 2009. Learning to Rank for Information Retrieval. Foundations and Trends in Information Retrieval, Vol. 3, 3 (2009), 225--331.
[24]
Harrie Oosterhuis and Maarten de Rijke. 2018. Differentiable Unbiased Online Learning to Rank. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM, 1293--1302.
[25]
Harrie Oosterhuis and Maarten de Rijke. 2019. Optimizing Ranking Models in an Online Setting. In Advances in Information Retrieval. Springer International Publishing, Cham, 382--396.
[26]
Tao Qin and Tie-Yan Liu. 2013. Introducing LETOR 4.0 datasets. arXiv preprint arXiv:1306.2597 (2013).
[27]
Tobias Schnabel, Adith Swaminathan, Ashudeep Singh, Navin Chandak, and Thorsten Joachims. 2016. Recommendations As Treatments: Debiasing Learning and Evaluation. In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48 (ICML'16). JMLR.org, 1670--1679.
[28]
Igor Shalyminov, Ondvr ej Duvs ek, and Oliver Lemon. 2018. Neural Response Ranking for Social Conversation: A Data-Efficient Approach. In Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd Int'l Workshop on Search-Oriented Conversational AI. 1--8.
[29]
Adith Swaminathan and Thorsten Joachims. 2015. The Self-Normalized Estimator for Counterfactual Learning. In Advances in Neural Information Processing Systems. 3231--3239.
[30]
Vladimir Vapnik. 2013. The Nature of Statistical Learning Theory. Springer Science & Business Media.
[31]
Akrivi Vlachou, Christos Doulkeridis, and Kjetil Nørvåg. 2011. Monitoring Reverse Top-k Queries over Mobile Devices. In Proceedings of the 10th ACM International Workshop on Data Engineering for Wireless and Mobile Access. ACM, 17--24.
[32]
Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 115--124.
[33]
Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. 2018a. Position Bias Estimation for Unbiased Learning to Rank in Personal Search. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 610--618.
[34]
Xuanhui Wang, Cheng Li, Nadav Golbandi, Michael Bendersky, and Marc Najork. 2018b. The LambdaLoss Framework for Ranking Metric Optimization. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM, 1313--1322.
[35]
Yisong Yue and Thorsten Joachims. 2009. Interactively Optimizing Information Retrieval Systems as a Dueling Bandits Problem. In Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 1201--1208.

Cited By

View all
  • (2024)Meta Learning to Rank for Sparsely Supervised QueriesACM Transactions on Information Systems10.1145/3698876Online publication date: 8-Oct-2024
  • (2024)On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-n RecommendationProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671687(1222-1233)Online publication date: 25-Aug-2024
  • (2024)Counterfactual Ranking Evaluation with Flexible Click ModelsProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657810(1200-1210)Online publication date: 10-Jul-2024
  • Show More Cited By

Index Terms

  1. Policy-Aware Unbiased Learning to Rank for Top-k Rankings

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2020
    2548 pages
    ISBN:9781450380164
    DOI:10.1145/3397271
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 25 July 2020

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. counterfactual learning
    2. counterfactual learning to rank
    3. learning to rank
    4. recommendation
    5. selection bias
    6. top-k ranking

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    SIGIR '20
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)59
    • Downloads (Last 6 weeks)3
    Reflects downloads up to 15 Oct 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Meta Learning to Rank for Sparsely Supervised QueriesACM Transactions on Information Systems10.1145/3698876Online publication date: 8-Oct-2024
    • (2024)On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-n RecommendationProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671687(1222-1233)Online publication date: 25-Aug-2024
    • (2024)Counterfactual Ranking Evaluation with Flexible Click ModelsProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657810(1200-1210)Online publication date: 10-Jul-2024
    • (2024)Fairness-Aware Exposure Allocation via Adaptive RerankingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657794(1504-1513)Online publication date: 10-Jul-2024
    • (2024)Unbiased Learning to Rank: On Recent Advances and Practical ApplicationsProceedings of the 17th ACM International Conference on Web Search and Data Mining10.1145/3616855.3636451(1118-1121)Online publication date: 4-Mar-2024
    • (2024)Full Stage Learning to Rank: A Unified Framework for Multi-Stage SystemsProceedings of the ACM Web Conference 202410.1145/3589334.3645523(3621-3631)Online publication date: 13-May-2024
    • (2024)Mitigating Exploitation Bias in Learning to Rank with an Uncertainty-aware Empirical Bayes ApproachProceedings of the ACM Web Conference 202410.1145/3589334.3645487(1486-1496)Online publication date: 13-May-2024
    • (2023)Recent Advancements in Unbiased Learning to RankProceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation10.1145/3632754.3632942(145-148)Online publication date: 15-Dec-2023
    • (2023)Unbiased Top-$k$ Learning to Rank with Causal Likelihood DecompositionProceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region10.1145/3624918.3625340(129-138)Online publication date: 26-Nov-2023
    • (2023)Deconfounded Causal Collaborative FilteringACM Transactions on Recommender Systems10.1145/36060351:4(1-25)Online publication date: 3-Oct-2023
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media