Abstract
Clustering ensembles have emerged as a powerful method for improving both the robustness and the stability of unsupervised classification solutions. However, finding a consensus clustering from multiple partitions is a difficult problem that can be approached from graph-based, combinatorial or statistical perspectives. A consensus scheme via the k-modes algorithm is proposed in this paper. A combined partition is found as a solution to the corresponding categorical data clustering problem using the k-modes algorithm. This study compares the performance of the k-modes consensus algorithm with other fusion approaches for clustering ensembles. Experimental results demonstrate the effectiveness of the proposed method.
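The consensus step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: each object is represented by the vector of labels it received in the base partitions, and those vectors are clustered as categorical data with k-modes. The function names, the Hamming dissimilarity, and the deterministic initialization (the first k distinct label vectors) are assumptions made for the example.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions where two label tuples disagree."""
    return sum(x != y for x, y in zip(a, b))


def k_modes_consensus(partitions, k, max_iter=100):
    """Cluster objects described by their labels in each base partition.

    partitions: list of H label lists, each of length N.
    Returns a list of N consensus labels in range(k).
    """
    # Row i holds the H categorical labels assigned to object i.
    rows = list(zip(*partitions))
    # Assumed initialization: the first k distinct rows, in order of appearance.
    modes = list(dict.fromkeys(rows))[:k]
    if len(modes) < k:
        raise ValueError("fewer than k distinct label vectors")
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under Hamming distance
        # (ties go to the lowest mode index).
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes are too
        labels = new_labels
        # Update step: each mode takes the most frequent label per feature.
        for j in range(k):
            members = [r for r, l in zip(rows, labels) if l == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = tuple(
                    Counter(col).most_common(1)[0][0] for col in zip(*members)
                )
    return labels


if __name__ == "__main__":
    base_partitions = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],  # same grouping, labels permuted
        [0, 0, 1, 1, 1, 1],  # one object placed differently
    ]
    print(k_modes_consensus(base_partitions, k=2))
```

Because every base partition is treated as a separate categorical attribute, the sketch does not need to align cluster labels across partitions: only agreement within each attribute enters the dissimilarity.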
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luo, H., Kong, F., Li, Y. (2006). Combining Multiple Clusterings Via k-Modes Algorithm. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_34
DOI: https://doi.org/10.1007/11811305_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science, Computer Science (R0)