
Comparing and aggregating rankings with ties

Authors: Mohammad Mahdian, D. Sivakumar, and Erik Vee

PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

June 2004

Pages 47 - 58

https://doi.org/10.1145/1055558.1055568

Published: 14 June 2004

Abstract

Rank aggregation has recently been proposed as a useful abstraction that has several applications, including meta-search, synthesizing rank functions from multiple indices, similarity search, and classification. In database applications (catalog searches, fielded searches, parametric searches, etc.), the rankings are produced by sorting an underlying database according to various fields. Typically, there are a number of fields that each have very few distinct values, and hence the corresponding rankings have many ties in them. Known methods for rank aggregation are poorly suited to this context, and the difficulties can be traced back to the fact that we do not have sound mathematical principles to compare two partial rankings, that is, rankings that allow ties.

In this work, we provide a comprehensive picture of how to compare partial rankings. We propose several metrics to compare partial rankings, present algorithms that efficiently compute them, and prove that they are within constant multiples of each other. Based on these concepts, we formulate aggregation problems for partial rankings, and develop a highly efficient algorithm to compute the top few elements of a near-optimal aggregation of multiple partial rankings. In a model of access that is suitable for databases, our algorithm reads essentially as few elements of each partial ranking as are necessary to determine the winner(s).
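To make the idea of comparing rankings with ties concrete, here is a minimal sketch of two distance measures on partial rankings. It rests on assumptions that the abstract does not spell out: a partial ranking is modelled as a list of buckets (sets of tied items, best first), each item takes the average position of its bucket, and a pair that is tied in exactly one of the two rankings costs a penalty `p`. The function names and the default `p = 0.5` are illustrative choices, not necessarily the definitions used in the paper.

```python
from itertools import combinations


def bucket_positions(buckets):
    """Map each item to the average 1-based position of its bucket."""
    pos = {}
    seen = 0  # number of items placed in earlier buckets
    for bucket in buckets:
        avg = seen + (len(bucket) + 1) / 2
        for item in bucket:
            pos[item] = avg
        seen += len(bucket)
    return pos


def footrule_distance(r1, r2):
    """Footrule-style distance: L1 distance between position vectors."""
    p1, p2 = bucket_positions(r1), bucket_positions(r2)
    return sum(abs(p1[x] - p2[x]) for x in p1)


def kendall_distance(r1, r2, p=0.5):
    """Kendall-style distance with penalty p for pairs tied in one ranking."""
    p1, p2 = bucket_positions(r1), bucket_positions(r2)
    total = 0.0
    for a, b in combinations(sorted(p1), 2):
        d1 = p1[a] - p1[b]
        d2 = p2[a] - p2[b]
        if d1 * d2 < 0:
            total += 1  # strictly ordered in both, in opposite directions
        elif (d1 == 0) != (d2 == 0):
            total += p  # tied in exactly one of the two rankings
    return total


# Two hypothetical partial rankings over the same four items.
r1 = [{"a"}, {"b", "c"}, {"d"}]
r2 = [{"b"}, {"a", "d"}, {"c"}]

print(footrule_distance(r1, r2))  # 6.0
print(kendall_distance(r1, r2))   # 3.0
```

Both functions assume the two rankings cover exactly the same items. Raising or lowering `p` changes how strongly a tie in one ranking counts as disagreement with a strict order in the other.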

Published In

PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

June 2004, 350 pages

ISBN: 158113858X

DOI: 10.1145/1055558

Conference Chair: Catriel Beeri, Hebrew University of Jerusalem

Copyright © 2004 ACM.


Publisher: Association for Computing Machinery, New York, NY, United States


Conference

SIGMOD/PODS04: International Conference on Management of Data and Symposium on Principles of Database Systems

June 14 - 16, 2004, Paris, France

Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%
