
Unbiased Learning to Rank: Online or Offline?

Published: 17 February 2021

Abstract

How to obtain an unbiased ranking model by learning to rank with biased user feedback is an important research question for IR. Existing work on unbiased learning to rank (ULTR) can be broadly categorized into two groups: studies on unbiased learning algorithms with logged data, namely offline unbiased learning, and studies on unbiased parameter estimation with real-time user interactions, namely online learning to rank. While their definitions of unbiasedness differ, these two types of ULTR algorithms share the same goal: to find the best models that rank documents based on their intrinsic relevance or utility. However, most studies on offline and online unbiased learning to rank are carried out in parallel, without detailed comparisons of their background theories and empirical performance. In this article, we formalize the task of unbiased learning to rank and show that existing algorithms for offline unbiased learning and online learning to rank are two sides of the same coin. We evaluate eight state-of-the-art ULTR algorithms and find that many of them can be used in both offline settings and online environments with minor or no modifications. Further, we analyze how different offline and online learning paradigms affect the theoretical foundation and empirical effectiveness of each algorithm on both synthetic and real search data. Our findings provide important insights and guidelines for choosing and deploying ULTR algorithms in practice.
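To make "biased user feedback" concrete, the sketch below shows one common way offline methods correct logged clicks: inverse propensity weighting under a position-based click model. This is only an illustration and is not taken from the article. The click model, the examination propensities, the relevance values, and the function names are all assumptions chosen for the example.

```python
# Illustrative sketch only. Assumes a position-based click model in which
# P(click) = P(examined at rank) * P(relevant). All numbers are made up.

def expected_click_rate(relevance, examination):
    """Expected click probability under the assumed position-based model."""
    return relevance * examination


def ipw_estimate(click_rate, examination):
    """Inverse propensity weighting: divide clicks by the examination propensity."""
    return click_rate / examination


# Hypothetical examination propensities for ranks 1..3 (lower ranks are seen less).
examination_by_rank = [1.0, 0.5, 0.25]
# Hypothetical true relevance of the documents logged at ranks 1..3.
relevance_by_rank = [0.4, 0.7, 0.6]

# Raw click rates mix relevance with position, so they favor the top rank.
naive = [
    expected_click_rate(r, e)
    for r, e in zip(relevance_by_rank, examination_by_rank)
]

# Reweighting by the propensity recovers relevance in expectation.
corrected = [ipw_estimate(c, e) for c, e in zip(naive, examination_by_rank)]

best_by_naive = max(range(3), key=lambda i: naive[i])
best_by_corrected = max(range(3), key=lambda i: corrected[i])

if __name__ == "__main__":
    print("naive click rates:", naive)
    print("IPW-corrected:", corrected)
```

In this toy setting the raw click rates rank the document at position 1 highest even though the document at position 2 is more relevant, while the reweighted estimates restore the relevance ordering. Real systems have to estimate the propensities rather than assume them.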



    Information

    Published In

    ACM Transactions on Information Systems  Volume 39, Issue 2
    April 2021
    391 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/3444752

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 February 2021
    Accepted: 01 November 2020
    Revised: 01 October 2020
    Received: 01 July 2020
    Published in TOIS Volume 39, Issue 2


    Author Tags

    1. Learning to rank
    2. online learning
    3. unbiased learning

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • School of Computing, University of Utah

