DOI: 10.1145/3269206.3271686
Differentiable Unbiased Online Learning to Rank

Published: 17 October 2018
  • Abstract

    Online Learning to Rank (OLTR) methods optimize rankers based on user interactions. State-of-the-art OLTR methods are built specifically for linear models. Their approaches do not extend well to non-linear models such as neural networks. We introduce an entirely novel approach to OLTR that constructs a weighted differentiable pairwise loss after each interaction: Pairwise Differentiable Gradient Descent (PDGD). PDGD breaks away from the traditional approach that relies on interleaving or multileaving and extensive sampling of models to estimate gradients. Instead, its gradient is based on inferring preferences between document pairs from user clicks, and it can optimize any differentiable model. We prove that the gradient of PDGD is unbiased w.r.t. user document pair preferences. Our experiments on the largest publicly available Learning to Rank (LTR) datasets show considerable and significant improvements under all levels of interaction noise. PDGD outperforms existing OLTR methods in terms of both learning speed and final convergence. Furthermore, unlike previous OLTR methods, PDGD also allows non-linear models to be optimized effectively. Our results show that using a neural network leads to even better performance at convergence than a linear model. In summary, PDGD is an efficient and unbiased OLTR approach that provides a better user experience than previously possible.
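    To make the idea in the abstract concrete, the following is a minimal sketch of a PDGD-style update for a linear scorer. It is an illustration, not the paper's exact method: the click-to-preference inference rule shown here is one common scheme, the learning rate is arbitrary, and the paper's de-biasing pair weight is omitted (effectively set to 1).

```python
import numpy as np

def infer_preferences(ranking, clicks):
    """Infer document-pair preferences from clicks.

    A clicked document is taken to be preferred over every unclicked
    document ranked above it and the first unclicked document ranked
    directly below it (a simplified inference scheme).
    Returns a list of (preferred, non_preferred) document indices.
    """
    prefs = []
    for pos, d in enumerate(ranking):
        if d not in clicks:
            continue
        for e in ranking[:pos]:           # unclicked documents above the click
            if e not in clicks:
                prefs.append((d, e))
        for e in ranking[pos + 1:]:       # first unclicked document below it
            if e not in clicks:
                prefs.append((d, e))
                break
    return prefs

def pdgd_step(w, X, ranking, clicks, lr=0.1):
    """One pairwise gradient step for a linear scorer s(d) = w . x_d.

    For each inferred pair (d > e), the preference probability is a
    softmax over the two scores, and the gradient of its log raises the
    score of d relative to e. The paper additionally weights each pair
    to make the gradient unbiased; that weight is omitted here.
    """
    scores = X @ w
    grad = np.zeros_like(w)
    for d, e in infer_preferences(ranking, clicks):
        p = np.exp(scores[d]) / (np.exp(scores[d]) + np.exp(scores[e]))
        # gradient of log P(d > e) w.r.t. w for a linear model
        grad += (1.0 - p) * (X[d] - X[e])
    return w + lr * grad
```

    For example, with three documents displayed as [0, 1, 2] and a click on document 1, the inferred preferences are (1 > 0) and (1 > 2), and one step moves document 1's score above the other two.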




      Published In

      CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
      October 2018
      2362 pages
      ISBN:9781450360142
      DOI:10.1145/3269206

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. gradient descent
      2. learning to rank
      3. online learning

      Qualifiers

      • Research-article

      Funding Sources

      • Netherlands Organisation for Scientific Research (NWO)

      Conference

      CIKM '18

      Acceptance Rates

      CIKM '18 paper acceptance rate: 147 of 826 submissions (18%)
      Overall acceptance rate: 1,861 of 8,427 submissions (22%)


      Article Metrics

      • Downloads (last 12 months): 39
      • Downloads (last 6 weeks): 3
      Reflects downloads up to 10 Aug 2024.


      Cited By

      • (2024) Optimizing Learning-to-Rank Models for Ex-Post Fair Relevance. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1525-1534. DOI: 10.1145/3626772.3657751.
      • (2024) Unbiased Learning to Rank: On Recent Advances and Practical Applications. Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 1118-1121. DOI: 10.1145/3616855.3636451.
      • (2024) Mitigating Exploitation Bias in Learning to Rank with an Uncertainty-aware Empirical Bayes Approach. Proceedings of the ACM on Web Conference 2024, 1486-1496. DOI: 10.1145/3589334.3645487.
      • (2024) LT2R: Learning to Online Learning to Rank for Web Search. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 4733-4746. DOI: 10.1109/ICDE60146.2024.00360.
      • (2024) Privacy Preserved Federated Learning for Online Ranking System (OLTR) for 6G Internet Technology. Wireless Personal Communications. DOI: 10.1007/s11277-024-11206-z.
      • (2024) How to Forget Clients in Federated Online Learning to Rank? Advances in Information Retrieval, 105-121. DOI: 10.1007/978-3-031-56063-7_7.
      • (2024) Learning-to-Rank with Nested Feedback. Advances in Information Retrieval, 306-315. DOI: 10.1007/978-3-031-56063-7_22.
      • (2023) Recent Advancements in Unbiased Learning to Rank. Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, 145-148. DOI: 10.1145/3632754.3632942.
      • (2023) Efficient Exploration and Exploitation for Sequential Music Recommendation. ACM Transactions on Recommender Systems 2(4), 1-23. DOI: 10.1145/3625827.
      • (2023) Towards Sequential Counterfactual Learning to Rank. Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 122-128. DOI: 10.1145/3624918.3625325.
