A similarity measure for indefinite rankings

Authors:

William Webber,

Alistair Moffat,

Justin Zobel

ACM Transactions on Information Systems (TOIS), Volume 28, Issue 4

Article No.: 20, Pages 1 - 38

https://doi.org/10.1145/1852102.1852106

Published: 23 November 2010

Abstract

Ranked lists are encountered in research and daily life, and it is often of interest to compare these lists even when they are incomplete or have only some members in common. An example is document rankings returned for the same query by different search engines. A measure of the similarity between incomplete rankings should handle nonconjointness, weight high ranks more heavily than low, and be monotonic with increasing depth of evaluation; but no measure satisfying all these criteria currently exists. In this article, we propose a new measure having these qualities, namely rank-biased overlap (RBO). The RBO measure is based on a simple probabilistic user model. It provides monotonicity by calculating, at a given depth of evaluation, a base score that is nondecreasing with additional evaluation, and a maximum score that is nonincreasing. An extrapolated score can be calculated between these bounds if a point estimate is required. RBO has a parameter that determines the strength of the weighting to top ranks. We extend RBO to handle tied ranks and rankings of different lengths. Finally, we give examples of the use of the measure in comparing the results produced by public search engines and in assessing retrieval systems in the laboratory.
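The base, extrapolated, and maximum scores described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: RBO is taken to be a geometrically weighted average of prefix agreement, (1 - p) * sum over d of p^(d-1) * A_d, where A_d is the fraction of the top-d items shared by both lists; the two rankings are tie-free and are compared to the same depth; and the bounds are the simple ones obtained by assuming zero or full agreement at unseen depths, which may be looser than the bounds derived in the article. The function name `rbo_bounds` and the default `p=0.9` are hypothetical.

```python
def rbo_bounds(s, t, p=0.9):
    """Return (base, extrapolated, maximum) RBO-style scores.

    Illustrative sketch only. Assumes RBO = (1 - p) * sum_d p**(d-1) * A_d,
    where A_d is the overlap of the two top-d prefixes divided by d, and
    that neither ranking contains ties or repeated items.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")

    k = min(len(s), len(t))        # evaluate both lists to the same depth
    seen_s, seen_t = set(), set()  # items seen so far in each prefix
    overlap = 0                    # size of the prefix intersection
    weighted = 0.0                 # running sum of p**(d-1) * A_d
    agreement = 0.0                # A_d at the current depth

    for d in range(1, k + 1):
        a, b = s[d - 1], t[d - 1]
        if a == b:
            overlap += 1
        else:
            if a in seen_t:
                overlap += 1
            if b in seen_s:
                overlap += 1
        seen_s.add(a)
        seen_t.add(b)
        agreement = overlap / d
        weighted += p ** (d - 1) * agreement

    base = (1.0 - p) * weighted    # unseen depths contribute nothing
    tail = p ** k                  # total weight of depths beyond k
    extrapolated = base + agreement * tail  # unseen depths agree at A_k
    maximum = base + tail          # unseen depths agree completely
    return base, extrapolated, maximum


if __name__ == "__main__":
    print(rbo_bounds(["a", "b", "c"], ["b", "a", "d"], p=0.5))
```

Under these assumptions, evaluating one rank deeper can only add to the base score, while the maximum changes by (1 - p) * p^k * (A_{k+1} - 1), which is never positive. This mirrors the monotonicity the abstract describes. A smaller `p` concentrates more of the weight on the top ranks.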

Index Terms

A similarity measure for indefinite rankings
1. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results
2. Mathematics of computing
  1. Probability and statistics
    1. Statistical paradigms
      1. Exploratory data analysis

Information & Contributors

Information

Published In

ACM Transactions on Information Systems Volume 28, Issue 4

November 2010

204 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/1852102

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 November 2010

Accepted: 01 March 2010

Revised: 01 October 2009

Received: 01 March 2009

Published in TOIS Volume 28, Issue 4

Qualifiers

Research-article
Research
Refereed

Bibliometrics & Citations

Article Metrics

Total Citations: 466
Total Downloads: 2,949
Downloads (Last 12 months): 427
Downloads (Last 6 weeks): 48

Reflects downloads up to 12 Aug 2024
