Abstract
Several different distances and indices are in use for comparing clusterings. We prove that the Misclassification Error distance, the Hamming distance (equivalent to the unadjusted Rand index), and the χ² distance between partitions are equivalent in the neighborhood of 0. In other words, if two partitions are very similar, then each distance defines upper and lower bounds on the others. The proofs are geometric and rely on the concavity of the distances. The geometric intuitions themselves advance the understanding of the space of all clusterings. To our knowledge, this is the first result of its kind.
In practice, distances are frequently used to compare two clusterings of a set of observations. The motivation for this work, however, lies in the theoretical study of data clustering. Distances between partitions are involved in constructing new methods for cluster validation, determining the number of clusters, and analyzing clustering algorithms. From a probability theory point of view, the present results apply to any pair of finite-valued random variables, and provide simple yet tight upper and lower bounds on the χ² measure of (in)dependence that are valid when the two variables are strongly dependent.
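To make the three distances concrete, the sketch below computes each one for two label sequences over the same observations. The abstract does not state the formulas, so the definitions used here are assumptions based on the standard forms: the Misclassification Error is one minus the best achievable overlap under a one-to-one matching of clusters, the Hamming distance is taken as the fraction of unordered point pairs on which the two clusterings disagree (one minus the unadjusted Rand index), and the χ² distance is taken as min(K, K') minus the χ² coefficient Σ p_kk'² / (p_k p'_k'). The function names are hypothetical, and the matching is found by brute force, which is only practical for a handful of clusters.

```python
from collections import Counter
from itertools import combinations, permutations


def contingency(a, b):
    """Joint and marginal relative frequencies of two label sequences."""
    n = len(a)
    joint = {k: v / n for k, v in Counter(zip(a, b)).items()}
    pa = {k: v / n for k, v in Counter(a).items()}
    pb = {k: v / n for k, v in Counter(b).items()}
    return joint, pa, pb


def misclassification_error(a, b):
    """1 - max over one-to-one cluster matchings of the matched mass.

    Assumed definition; brute force over matchings, so small K only.
    """
    joint, pa, pb = contingency(a, b)
    rows, cols = list(pa), list(pb)
    if len(rows) > len(cols):
        # Match the smaller label set into the larger one.
        rows, cols = cols, rows
        joint = {(j, i): p for (i, j), p in joint.items()}
    best = max(
        sum(joint.get((r, c), 0.0) for r, c in zip(rows, perm))
        for perm in permutations(cols, len(rows))
    )
    return 1.0 - best


def hamming_distance(a, b):
    """Fraction of unordered point pairs on which the clusterings disagree.

    Assumed normalization: equals 1 - (unadjusted) Rand index.
    Requires at least two observations.
    """
    pairs = list(combinations(range(len(a)), 2))
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)


def chi2_distance(a, b):
    """min(K, K') - sum_{k,k'} p_kk'^2 / (p_k * p'_k') (assumed definition)."""
    joint, pa, pb = contingency(a, b)
    chi2 = sum(p * p / (pa[i] * pb[j]) for (i, j), p in joint.items())
    return min(len(pa), len(pb)) - chi2


if __name__ == "__main__":
    a = [0, 0, 0, 1, 1, 1]
    b = [0, 0, 1, 1, 1, 1]  # one observation moved to the other cluster
    print(misclassification_error(a, b))
    print(hamming_distance(a, b))
    print(chi2_distance(a, b))
```

Under these assumed definitions, all three functions return 0 when the two clusterings are identical up to a relabeling of the clusters, which is the neighborhood in which the abstract's equivalence result applies.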
Additional information
Editor: Carla Brodley.
About this article
Cite this article
Meilă, M. Local equivalences of distances between clusterings—a geometric perspective. Mach Learn 86, 369–389 (2012). https://doi.org/10.1007/s10994-011-5267-2
DOI: https://doi.org/10.1007/s10994-011-5267-2