DOI: 10.1145/3269206.3271686
Differentiable Unbiased Online Learning to Rank

Published: 17 October 2018
  • Abstract

    Online Learning to Rank (OLTR) methods optimize rankers based on user interactions. State-of-the-art OLTR methods are built specifically for linear models. Their approaches do not extend well to non-linear models such as neural networks. We introduce an entirely novel approach to OLTR that constructs a weighted differentiable pairwise loss after each interaction: Pairwise Differentiable Gradient Descent (PDGD). PDGD breaks away from the traditional approach that relies on interleaving or multileaving and extensive sampling of models to estimate gradients. Instead, its gradient is based on inferring preferences between document pairs from user clicks, and it can optimize any differentiable model. We prove that the gradient of PDGD is unbiased w.r.t. user document pair preferences. Our experiments on the largest publicly available Learning to Rank (LTR) datasets show considerable and significant improvements under all levels of interaction noise. PDGD outperforms existing OLTR methods in terms of both learning speed and final convergence. Furthermore, unlike previous OLTR methods, PDGD also allows non-linear models to be optimized effectively. Our results show that using a neural network leads to even better performance at convergence than a linear model. In summary, PDGD is an efficient and unbiased OLTR approach that provides a better user experience than previously possible.
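    To make the idea in the abstract concrete, the following is a minimal sketch of a PDGD-style update for a linear scorer. It is an illustration, not the paper's exact method: the click-to-preference inference rule shown here is one common scheme, the learning rate is arbitrary, and the paper's de-biasing pair weight is omitted (effectively set to 1).

```python
import numpy as np

def infer_preferences(ranking, clicks):
    """Infer document-pair preferences from clicks.

    A clicked document is taken to be preferred over every unclicked
    document ranked above it and the first unclicked document ranked
    directly below it (a simplified inference scheme).
    Returns a list of (preferred, non_preferred) document indices.
    """
    prefs = []
    for pos, d in enumerate(ranking):
        if d not in clicks:
            continue
        for e in ranking[:pos]:           # unclicked documents above the click
            if e not in clicks:
                prefs.append((d, e))
        for e in ranking[pos + 1:]:       # first unclicked document below it
            if e not in clicks:
                prefs.append((d, e))
                break
    return prefs

def pdgd_step(w, X, ranking, clicks, lr=0.1):
    """One pairwise gradient step for a linear scorer s(d) = w . x_d.

    For each inferred pair (d > e), the preference probability is a
    softmax over the two scores, and the gradient of its log raises the
    score of d relative to e. The paper additionally weights each pair
    to make the gradient unbiased; that weight is omitted here.
    """
    scores = X @ w
    grad = np.zeros_like(w)
    for d, e in infer_preferences(ranking, clicks):
        p = np.exp(scores[d]) / (np.exp(scores[d]) + np.exp(scores[e]))
        # gradient of log P(d > e) w.r.t. w for a linear model
        grad += (1.0 - p) * (X[d] - X[e])
    return w + lr * grad
```

    For example, with three documents displayed as [0, 1, 2] and a click on document 1, the inferred preferences are (1 > 0) and (1 > 2), and one step moves document 1's score above the other two.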




      Published In

      CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
      October 2018
      2362 pages
      ISBN:9781450360142
      DOI:10.1145/3269206

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. gradient descent
      2. learning to rank
      3. online learning

      Qualifiers

      • Research-article

      Funding Sources

      • Netherlands Organisation for Scientific Research (NWO)

      Conference

      CIKM '18

      Acceptance Rates

      CIKM '18 paper acceptance rate: 147 of 826 submissions (18%)
      Overall acceptance rate: 1,861 of 8,427 submissions (22%)


      Article Metrics

      • Downloads (last 12 months): 39
      • Downloads (last 6 weeks): 3
      Reflects downloads up to 10 Aug 2024.


      Cited By

      • (2024) Optimizing Learning-to-Rank Models for Ex-Post Fair Relevance. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1525-1534. DOI: 10.1145/3626772.3657751.
      • (2024) Unbiased Learning to Rank: On Recent Advances and Practical Applications. Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 1118-1121. DOI: 10.1145/3616855.3636451.
      • (2024) Mitigating Exploitation Bias in Learning to Rank with an Uncertainty-aware Empirical Bayes Approach. Proceedings of the ACM on Web Conference 2024, 1486-1496. DOI: 10.1145/3589334.3645487.
      • (2024) LT2R: Learning to Online Learning to Rank for Web Search. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 4733-4746. DOI: 10.1109/ICDE60146.2024.00360.
      • (2024) Privacy Preserved Federated Learning for Online Ranking System (OLTR) for 6G Internet Technology. Wireless Personal Communications. DOI: 10.1007/s11277-024-11206-z.
      • (2024) How to Forget Clients in Federated Online Learning to Rank? Advances in Information Retrieval, 105-121. DOI: 10.1007/978-3-031-56063-7_7.
      • (2024) Learning-to-Rank with Nested Feedback. Advances in Information Retrieval, 306-315. DOI: 10.1007/978-3-031-56063-7_22.
      • (2023) Recent Advancements in Unbiased Learning to Rank. Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, 145-148. DOI: 10.1145/3632754.3632942.
      • (2023) Efficient Exploration and Exploitation for Sequential Music Recommendation. ACM Transactions on Recommender Systems 2(4), 1-23. DOI: 10.1145/3625827.
      • (2023) Towards Sequential Counterfactual Learning to Rank. Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 122-128. DOI: 10.1145/3624918.3625325.
