DOI: 10.1145/3673791.3698422
Open access

How do Ties Affect the Uncertainty in Rank-Biased Overlap?

Published: 08 December 2024

Abstract

Rank-Biased Overlap (RBO) is a popular measure of the similarity between two rankings. A key characteristic of RBO is that it can be computed even when the rankings are not fully seen and only a prefix is known, but this introduces uncertainty in the computation. In such cases, one would normally compute the point estimate RBO_EXT, as well as bounds representing the best and worst cases; their difference is thus a residual quantifying the amount of uncertainty. Another source of uncertainty is the presence of tied items, because their actual relative order is unknown. Current approaches to this issue similarly provide a point estimate by considering the average RBO score over all the permutations of the ties, such as RBO_a. However, there is currently no approach to quantify and bound the uncertainty due to ties, just as there is for the uncertainty due to unseen items. In this paper we fill this gap and provide algorithmic solutions to the problem of finding the arrangements of tied items that yield the lowest and highest possible RBO scores, naturally leading to total bounds and residuals. We also show that the current RBO_a estimate only equals the average RBO over permutations when the rankings have the same length, so we generalize it to rankings of different lengths. In summary, this work provides a full account of the uncertainty in RBO, allowing practitioners to make more sensible decisions on the grounds of rank similarity. The main realization is that residuals can actually be much larger once we account for both sources of uncertainty. To illustrate this, we present empirical results using both synthetic and TREC data, demonstrating that a realistic picture of the residual of RBO can only be obtained by considering both sources of uncertainty.
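To illustrate how the prefix-based bounds mentioned in the abstract arise, the sketch below computes the classic lower bound RBO_MIN and the extrapolated point estimate RBO_EXT from the original RBO formulation (Webber, Moffat and Zobel, 2010) for two tie-free prefixes of equal length; their gap reflects the residual due to unseen items. This is a minimal illustration only: the function name rbo_bounds, its interface, and the equal-length, no-ties assumptions are ours, and it does not implement the tie-aware bounds contributed by this paper.

```python
from math import log

def rbo_bounds(S, T, p=0.9):
    """Lower bound (RBO_MIN) and extrapolated point estimate (RBO_EXT) of
    Rank-Biased Overlap for two tie-free ranking prefixes of equal length k.
    Illustrative sketch only; not the paper's tie-aware algorithm."""
    assert len(S) == len(T), "this sketch assumes equal-length prefixes"
    k = len(S)
    seen_s, seen_t = set(), set()
    X = []                      # X[d-1] = |S[:d] & T[:d]|, the overlap at depth d
    overlap = 0
    for d in range(1, k + 1):
        s_item, t_item = S[d - 1], T[d - 1]
        if s_item == t_item:
            overlap += 1
        else:
            if s_item in seen_t:
                overlap += 1
            if t_item in seen_s:
                overlap += 1
        seen_s.add(s_item)
        seen_t.add(t_item)
        X.append(overlap)
    x_k = X[-1]
    # RBO_MIN: assume the unseen tails contribute no further overlap.
    rbo_min = (1 - p) / p * (
        sum(p ** d * (X[d - 1] - x_k) / d for d in range(1, k + 1))
        - x_k * log(1 - p)
    )
    # RBO_EXT: extrapolate assuming the agreement observed at depth k persists.
    rbo_ext = (x_k / k) * p ** k + (1 - p) / p * sum(
        p ** d * X[d - 1] / d for d in range(1, k + 1)
    )
    return rbo_min, rbo_ext
```

For instance, rbo_bounds(list("abcdef"), list("abcfed"), p=0.9) returns a (lower bound, point estimate) pair; the paper's contribution is to extend this kind of bounding to the unknown ordering of tied items, and it argues that, once ties are also accounted for, the corresponding residuals can be considerably larger.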



Published In

SIGIR-AP 2024: Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region
December 2024
328 pages
ISBN: 9798400707247
DOI: 10.1145/3673791
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. bounds
  2. rank correlation
  3. rank similarity
  4. rank-biased overlap
  5. ties
  6. uncertainty

Qualifiers

  • Research-article

Conference

SIGIR-AP 2024
