
An Exponential Model for Infinite Rankings

Published: 01 December 2010

Abstract

This paper presents a statistical model for expressing preferences through rankings when the number of alternatives (items to rank) is large. A human ranker will then typically rank only the most preferred items, and may not even examine the whole set of items or know how many there are. Similarly, a user presented with the ranked output of a search engine will consider only the highest-ranked items. We model such situations by introducing a stagewise ranking model that operates with finite ordered lists, called top-t orderings, over an infinite space of items. We give algorithms to estimate this model from data, and demonstrate that it has sufficient statistics and is thus an exponential family model with continuous and discrete parameters. We describe its conjugate prior and other statistical properties. Then, we extend the estimation problem to multimodal data by introducing an Exponential-Blurring-Mean-Shift nonparametric clustering algorithm. The experiments highlight the properties of our model and demonstrate that infinite models over permutations can be simple, elegant and practical.
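
The abstract gives no formulas, so the following is only an illustrative sketch of what a stagewise model over top-t orderings might look like, not the paper's exact specification. It assumes (hypothetically) that the central ordering is the natural order 1, 2, 3, ..., that a single dispersion parameter `theta > 0` is shared by all stages, and that at each stage the ranker skips `s` of the not-yet-ranked items with probability proportional to `exp(-theta * s)`. The function name `sample_top_t` is invented for this example.

```python
import math
import random


def sample_top_t(t, theta, rng):
    """Draw a top-t ordering over the infinite item set {1, 2, 3, ...}.

    Assumption (not from the source): at every stage the number of skipped
    items s in {0, 1, 2, ...} has P(s) proportional to exp(-theta * s).
    """
    if theta <= 0:
        raise ValueError("theta must be positive so the skip distribution is proper")
    chosen = []
    taken = set()
    for _ in range(t):
        # Inverse-CDF draw of the geometric skip count: P(s >= k) = exp(-theta * k).
        u = 1.0 - rng.random()  # u lies in (0, 1], so log(u) is finite
        skips = int(-math.log(u) / theta)
        # Walk the central ordering, ignoring items that are already ranked.
        item = 0
        while True:
            item += 1
            if item in taken:
                continue
            if skips == 0:
                break
            skips -= 1
        chosen.append(item)
        taken.add(item)
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    # Small theta: the list wanders far from the central ordering.
    print(sample_top_t(5, 0.3, rng))
    # Large theta: the list follows the central ordering closely.
    print(sample_top_t(5, 5.0, rng))
```

Only finitely many items are ever inspected, even though the item space is infinite, which is the situation the abstract describes: the ranker never needs to know how many items exist.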

Publisher: JMLR.org
Published in JMLR Volume 11

Cited By

  • (2023) Efficient and accurate inference for mixtures of Mallows models with Spearman distance. Statistics and Computing, 33:5. DOI: 10.1007/s11222-023-10266-8. Online publication date: 5-Jul-2023.
  • (2022) On Fitness Landscape Analysis of Permutation Problems: From Distance Metrics to Mutation Operator Selection. Mobile Networks and Applications, 28:2 (507-517). DOI: 10.1007/s11036-022-02060-z. Online publication date: 8-Nov-2022.
  • (2021) Identity testing for Mallows model. Proceedings of the 35th International Conference on Neural Information Processing Systems (23179-23190). DOI: 10.5555/3540261.3542036. Online publication date: 6-Dec-2021.
  • (2021) Private and non-private uniformity testing for ranking data. Proceedings of the 35th International Conference on Neural Information Processing Systems (9480-9492). DOI: 10.5555/3540261.3540987. Online publication date: 6-Dec-2021.
  • (2019) Analysis of ranking data. WIREs Computational Statistics, 11:6. DOI: 10.1002/wics.1483. Online publication date: 10-Oct-2019.
  • (2018) Mallows models for top-k lists. Proceedings of the 32nd International Conference on Neural Information Processing Systems (4387-4397). DOI: 10.5555/3327345.3327351. Online publication date: 3-Dec-2018.
  • (2017) Probabilistic preference learning with the mallows rank model. The Journal of Machine Learning Research, 18:1 (5796-5844). DOI: 10.5555/3122009.3242015. Online publication date: 1-Jan-2017.
  • (2016) The Permutation in a Haystack Problem and the Calculus of Search Landscapes. IEEE Transactions on Evolutionary Computation, 20:3 (434-446). DOI: 10.1109/TEVC.2015.2477284. Online publication date: 26-May-2016.
