Abstract
The Correlation Clustering problem has been introduced recently [5] as a model for clustering data when a binary relationship between data points is known. More precisely, for each pair of points we have two scores measuring respectively the similarity and dissimilarity of the two points, and we would like to compute an optimal partition where the value of a partition is obtained by summing up scores of pairs involving points from a same cluster and scores of pairs involving points from different clusters. A closely related problem is Consensus Clustering, where we are given a set of partitions and we would like to obtain a partition that best summarizes the input partitions. The latter problem is a restricted case of Correlation Clustering. In this paper we prove that Min Consensus Clustering is APX-hard even for three input partitions, answering an open question, while Max Consensus Clustering admits a PTAS on instances with a bounded number of input partitions. We exhibit a combinatorial and practical \({4}\over{5}\)-approximation algorithm based on a greedy technique for Max Consensus Clustering on three partitions. Moreover, we prove that a PTAS exists for Max Correlation Clustering when the maximum ratio between two scores is at most a constant.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Alimonti, P., Kann, V.: Some APX-completeness results for cubic graphs. Theoretical Computer Science 237(1–2), 123–134 (2000)
Arora, S., Karger, D., Karpinski, M.: Polynomial time approximation schemes for dense instances of \(\mathcal{NP}\)-hard problems. Journal of Computer and System Sciences 58, 193–210 (2000)
Ailon, N., Charikar, M., Newman, A.: Aggregating Inconsistent Information: Ranking and Clustering. In: Proc. 37th Symposium on Theory of Computing (STOC 2005), pp. 684–693 (2005)
Ausiello, G., Crescenzi, P., Gambosi, V., Kann, G., Marchetti-Spaccamela, A., Protasi, M.: Complexity and Approximation: Combinatorial optimization problems and their approximability properties. Springer, Heidelberg (1999)
Bansal, N., Blum, A., Chawla, S.: Correlation clustering. Machine Learning 56(1-3), 89–113 (2004)
Charikar, M., Guruswami, V., Wirth, A.: Clustering with qualitative information. In: Proc. 44th Symp. Foundations of Computer Science (FOCS), pp. 524–533 (2003)
Demaine, E.D., Immorlica, N.: Correlation clustering with partial information. In: Arora, S., Jansen, K., Rolim, J.D.P., Sahai, A. (eds.) RANDOM 2003 and APPROX 2003. LNCS, vol. 2764, pp. 1–13. Springer, Heidelberg (2003)
Emanuel, D., Fiat, A.: Correlation clustering – minimizing disagreements on arbitrary weighted graphs. In: Di Battista, G., Zwick, U. (eds.) ESA 2003. LNCS, vol. 2832, pp. 208–220. Springer, Heidelberg (2003)
Filkov, V., Skiena, S.: Integrating microarray data by consensus clustering. In: Proc. 15th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 418–425 (2003)
Filkov, V., Skiena, S.: Heterogeneous data integration with the consensus clustering formalism. In: Rahm, E. (ed.) DILS 2004. LNCS (LNBI), vol. 2994, pp. 110–123. Springer, Heidelberg (2004)
Grötschel, M., Wakabayashi, Y.: A cutting plane algorithm for a clustering problem. Mathematical Programming 45, 52–96 (1989)
Krivanek, M., Moravek, J.: Hard problems in hierarchical-tree clustering. Acta Informatica 23, 311–323 (1986)
Swamy, C.: Correlation clustering: maximizing agreements via semidefinite programming. In: Proc. 15th Symp. on Discrete Algorithms (SODA), pp. 526–527 (2004)
Wakabayashi, Y.: The complexity of computing medians of relations. Resenhas 3(3), 323–349 (1998)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bonizzoni, P., Della Vedova, G., Dondi, R., Jiang, T. (2005). Correlation Clustering and Consensus Clustering. In: Deng, X., Du, DZ. (eds) Algorithms and Computation. ISAAC 2005. Lecture Notes in Computer Science, vol 3827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11602613_24
Download citation
DOI: https://doi.org/10.1007/11602613_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30935-2
Online ISBN: 978-3-540-32426-3
eBook Packages: Computer ScienceComputer Science (R0)