Abstract
Determining the right number of clusters is a central problem in cluster validation. This work introduces a validation strategy based on cluster stability. The proposed stability index is built on information measures and captures how some of these measures vary across the clustering solutions obtained from different sample sets of the same problem. Experiments on synthetic and real databases show that the index is effective when the clustering algorithm relies on a data structure model suited to the problem.
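The abstract does not state the formula of the stability index, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes variation of information (a standard information-theoretic distance between two partitions) as the information measure, and it assumes every labeling covers the same set of points, for example the points shared by two sample sets. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(labels):
    """Shannon entropy (in nats) of the cluster-size distribution."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two labelings of the same points."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def variation_of_information(a, b):
    """VI(a, b) = H(a) + H(b) - 2 I(a; b); zero when the partitions coincide."""
    return entropy(a) + entropy(b) - 2.0 * mutual_information(a, b)


def mean_pairwise_vi(labelings):
    """Average VI over all pairs of labelings (assumed instability score).

    Lower values mean the clustering solutions agree more, i.e. are more stable.
    """
    pairs = list(combinations(labelings, 2))
    return sum(variation_of_information(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, one would cluster several sample sets for each candidate number of clusters, compute `mean_pairwise_vi` over the resulting labelings of the shared points, and prefer the number of clusters with the lowest score.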
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pascual, D., Pla, F., Sánchez, J.S. (2008). Cluster Stability Assessment Based on Theoretic Information Measures. In: Ruiz-Shulcloper, J., Kropatsch, W.G. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2008. Lecture Notes in Computer Science, vol 5197. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85920-8_27
DOI: https://doi.org/10.1007/978-3-540-85920-8_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85919-2
Online ISBN: 978-3-540-85920-8
eBook Packages: Computer Science (R0)