Article

Cluster analysis of heterogeneous rank data

Authors: Ludwig M. Busse, Peter Orbanz, Joachim M. Buhmann

ICML '07: Proceedings of the 24th international conference on Machine learning

Pages 113 - 120

https://doi.org/10.1145/1273496.1273511

Published: 20 June 2007

Abstract

Cluster analysis of ranking data, which occurs in consumer questionnaires, voting forms or other inquiries of preferences, attempts to identify typical groups of rank choices. Empirically measured rankings are often incomplete, i.e. different numbers of filled rank positions cause heterogeneity in the data. We propose a mixture approach for clustering of heterogeneous rank data. Rankings of different lengths can be described and compared by means of a single probabilistic model. A maximum entropy approach avoids hidden assumptions about missing rank positions. Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model. Experiments on both synthetic data and real-world data demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.
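As a rough illustration of how rankings of different lengths can be scored by one probabilistic model, the Python sketch below uses a Mallows-type distance-based model and treats an incomplete ranking as the set of all full rankings that share its filled top positions. It then computes the E-step responsibilities of a mixture. This is a toy under stated assumptions, not necessarily the maximum entropy construction or the estimators derived in the paper: the Kendall distance, the brute-force summation, and all names (`kendall_distance`, `partial_prob`, `responsibilities`, `lam`) are illustrative choices, and the enumeration is only feasible for a handful of items.

```python
from itertools import permutations
from math import exp


def kendall_distance(a, b):
    """Count item pairs that full rankings a and b order differently."""
    pos_b = {item: i for i, item in enumerate(b)}
    mapped = [pos_b[item] for item in a]
    return sum(
        1
        for i in range(len(mapped))
        for j in range(i + 1, len(mapped))
        if mapped[i] > mapped[j]
    )


def partial_prob(ranking, center, lam, items):
    """Probability of a (possibly incomplete) top-t ranking.

    Assumption: an incomplete ranking stands for every full ranking that
    starts with the observed positions, so its probability is the summed
    Mallows mass of those completions. Brute force over all permutations.
    """
    t = len(ranking)
    observed = tuple(ranking)
    total = 0.0  # normalizer over all full rankings
    mass = 0.0   # mass of full rankings consistent with the observation
    for p in permutations(items):
        weight = exp(-lam * kendall_distance(p, center))
        total += weight
        if p[:t] == observed:
            mass += weight
    return mass / total


def responsibilities(ranking, centers, lams, weights, items):
    """E-step: posterior probability of each mixture component."""
    joint = [
        w * partial_prob(ranking, c, lam, items)
        for c, lam, w in zip(centers, lams, weights)
    ]
    norm = sum(joint)
    return [j / norm for j in joint]


if __name__ == "__main__":
    items = (1, 2, 3, 4)
    centers = [(1, 2, 3, 4), (4, 3, 2, 1)]
    # A respondent who filled in only the first two rank positions.
    print(responsibilities((1, 2), centers, [1.0, 1.0], [0.5, 0.5], items))
```

Because full and incomplete rankings pass through the same `partial_prob` function, both kinds of observation can contribute to the same mixture fit, which is the property the abstract emphasizes.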

Cluster analysis of heterogeneous rank data
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning

Published In

ICML '07: Proceedings of the 24th international conference on Machine learning

June 2007, 1233 pages

ISBN: 9781595937933

DOI: 10.1145/1273496

Editor: Zoubin Ghahramani, University of Cambridge, United Kingdom

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

ICML '07 & ILP '07: The 24th Annual International Conference on Machine Learning held in conjunction with the 2007 International Conference on Inductive Logic Programming

June 20 - 24, 2007

Corvallis, Oregon, USA

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Article Metrics

Total Citations: 55
Total Downloads: 757
Downloads (Last 12 months): 56
Downloads (Last 6 weeks): 7

Cited By

Hüllermeier, E., & Słowiński, R. (2024). Preference learning and multiple criteria decision aiding: differences, commonalities, and synergies—part II. 4OR. Online publication date: 30-Jan-2024.
https://doi.org/10.1007/s10288-023-00561-5
Boehmer, N., Faliszewski, P., Kraiczy, S., Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., & Scarlett, J. (2023). Properties of the mallows model depending on the number of alternatives. Proceedings of the 40th International Conference on Machine Learning, 2689-2711. Online publication date: 23-Jul-2023.
https://dl.acm.org/doi/10.5555/3618408.3618521
Gormley, I., Murphy, T., & Raftery, A. (2023). Model-Based Clustering. Annual Review of Statistics and Its Application, 10:1, 573-595. Online publication date: 10-Mar-2023.
https://doi.org/10.1146/annurev-statistics-033121-115326
Crispino, M., Mollica, C., Astuti, V., & Tardella, L. (2023). Efficient and accurate inference for mixtures of Mallows models with Spearman distance. Statistics and Computing, 33:5. Online publication date: 5-Jul-2023.
https://dl.acm.org/doi/10.1007/s11222-023-10266-8
Mao, C., & Wu, Y. (2022). Learning mixtures of permutations: Groups of pairwise comparisons and combinatorial method of moments. The Annals of Statistics, 50:4. Online publication date: 1-Aug-2022.
https://doi.org/10.1214/22-AOS2185
Kateri, M., & Nikolov, N. (2022). A generalized Mallows model based on ϕ-divergence measures. Journal of Multivariate Analysis, 190:C. Online publication date: 1-Jul-2022.
https://dl.acm.org/doi/10.1016/j.jmva.2022.104958
Nikolov, N., & Stoimenova, E. (2021). Rank Data Clustering Based on Lee Distance. Advanced Computing in Industrial Mathematics, 303-312. Online publication date: 4-Apr-2021.
https://doi.org/10.1007/978-3-030-71616-5_27
Ping, H., Stoyanovich, J., & Kimelfeld, B. (2020). Supporting hard queries over probabilistic preferences. Proceedings of the VLDB Endowment, 13:7, 1134-1146. Online publication date: 26-Mar-2020.
https://dl.acm.org/doi/10.14778/3384345.3384359
Salehi‐Abari, A., & Larson, K. (2020). Group recommendation with noisy subjective preferences. Computational Intelligence, 37:1, 210-225. Online publication date: 3-Sep-2020.
https://doi.org/10.1111/coin.12398
Gupta, U., Bhattacherjee, V., & Bishnu, P. (2020). Clustering on Ranked Data for Campaign Selection. IEEE Access, 8, 162421-162431. Online publication date: 2020.
https://doi.org/10.1109/ACCESS.2020.3019394
