DOI: 10.1145/2556195.2556256

Relative confidence sampling for efficient on-line ranker evaluation

Published: 24 February 2014
    Abstract

    A key challenge in information retrieval is that of on-line ranker evaluation: determining which one of a finite set of rankers performs the best in expectation on the basis of user clicks on presented document lists. When the presented lists are constructed using interleaved comparison methods, which interleave lists proposed by two different candidate rankers, then the problem of minimizing the total regret accumulated while evaluating the rankers can be formalized as a K-armed dueling bandits problem. In this paper, we propose a new method called relative confidence sampling (RCS) that aims to reduce cumulative regret by being less conservative than existing methods in eliminating rankers from contention. In addition, we present an empirical comparison between RCS and two state-of-the-art methods, relative upper confidence bound and SAVAGE. The results demonstrate that RCS can substantially outperform these alternatives on several large learning to rank datasets.
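
    To make the setting concrete, the sketch below simulates the dueling-bandit loop the abstract describes: at each step a policy picks two rankers, an interleaved comparison (reduced here to a biased coin flip) declares a winner, and regret accrues whenever either chosen ranker is worse than the best one. The champion/challenger rule shown is a Thompson-sampling-style simplification in the spirit of RCS, not the paper's exact algorithm; the preference matrix p and the Beta(wins + 1, losses + 1) posteriors are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    K = 5
    # p[i, j]: probability that ranker i beats ranker j in an interleaved
    # comparison; known here only to simulate outcomes and measure regret.
    p = np.full((K, K), 0.5)
    for i in range(K):
        for j in range(i + 1, K):
            p[i, j] = 0.5 + 0.08 * (j - i)  # ranker 0 is best by construction
            p[j, i] = 1.0 - p[i, j]

    wins = np.zeros((K, K))  # wins[i, j]: number of times i has beaten j

    T = 10_000
    regret = 0.0
    for t in range(T):
        # Sample a plausible preference matrix from Beta posteriors over the
        # pairwise win probabilities (the Thompson-sampling step).
        theta = rng.beta(wins + 1, wins.T + 1)
        np.fill_diagonal(theta, 0.5)
        # Champion: a ranker that beats every other ranker under the sample;
        # if none exists, fall back to a uniformly random ranker.
        beats_all = (theta >= 0.5).all(axis=1)
        c = int(np.argmax(beats_all)) if beats_all.any() else int(rng.integers(K))
        # Challenger: the ranker whose sampled win rate against the champion
        # is highest (possibly the champion itself, in which case no regret
        # is incurred).
        d = int(np.argmax(theta[:, c]))
        # Interleaved comparison, reduced to a coin flip with bias p[c, d].
        if rng.random() < p[c, d]:
            wins[c, d] += 1
        else:
            wins[d, c] += 1
        # Dueling-bandit regret: average sub-optimality of the pair shown.
        regret += 0.5 * ((p[0, c] - 0.5) + (p[0, d] - 0.5))

    print(f"cumulative regret after {T} comparisons: {regret:.1f}")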




      Published In

      WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
      February 2014
      712 pages
      ISBN: 9781450323512
      DOI: 10.1145/2556195
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


      Publisher

      Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. evaluation
      2. implicit feedback
      3. on-line learning

      Qualifiers

      • Research-article

      Conference

      WSDM 2014

      Acceptance Rates

      WSDM '14 paper acceptance rate: 64 of 355 submissions, 18%
      Overall acceptance rate: 498 of 2,863 submissions, 17%



      Article Metrics

      • Downloads (last 12 months): 16
      • Downloads (last 6 weeks): 0
      Reflects downloads up to 10 Aug 2024


      Cited By

      • (2023) Principled reinforcement learning with human feedback from pairwise or K-wise comparisons. Proceedings of the 40th International Conference on Machine Learning, 43037-43067. DOI: 10.5555/3618408.3620222. Online publication date: 23-Jul-2023.
      • (2022) Dirichlet–Luce choice model for learning from interactions. User Modeling and User-Adapted Interaction, 32(4), 611-648. DOI: 10.1007/s11257-022-09331-0. Online publication date: 4-Jun-2022.
      • (2020) MergeDTS. ACM Transactions on Information Systems, 38(4), 1-28. DOI: 10.1145/3411753. Online publication date: 10-Sep-2020.
      • (2020) Dueling bandit problems. Probability in the Engineering and Informational Sciences, 36(2), 264-275. DOI: 10.1017/S0269964820000601. Online publication date: 20-Nov-2020.
      • (2020) Counterfactual Online Learning to Rank. Advances in Information Retrieval, 415-430. DOI: 10.1007/978-3-030-45439-5_28. Online publication date: 8-Apr-2020.
      • (2019) Bandit algorithms in information retrieval evaluation and ranking. Journal of Physics: Conference Series, 1339, 012005. DOI: 10.1088/1742-6596/1339/1/012005. Online publication date: 16-Dec-2019.
      • (2019) Optimizing Ranking Models in an Online Setting. Advances in Information Retrieval, 382-396. DOI: 10.1007/978-3-030-15712-8_25. Online publication date: 7-Apr-2019.
      • (2018) Advancements in dueling bandits. Proceedings of the 27th International Joint Conference on Artificial Intelligence, 5502-5510. DOI: 10.5555/3304652.3304790. Online publication date: 13-Jul-2018.
      • (2018) Differentiable Unbiased Online Learning to Rank. Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 1293-1302. DOI: 10.1145/3269206.3271686. Online publication date: 17-Oct-2018.
      • (2016) Double Thompson sampling for dueling bandits. Proceedings of the 30th International Conference on Neural Information Processing Systems, 649-657. DOI: 10.5555/3157096.3157169. Online publication date: 5-Dec-2016.
