Abstract
Fuzzy c-means (FCM) is one of the most popular algorithms for cluster analysis, and it and its variants have been widely studied. This paper develops a novel generalized version, called double indices-induced FCM (DI-FCM), from a different perspective. DI-FCM introduces a power exponent r into the constraints of the objective function, which generalizes the fuzziness index m and yields a new criterion for selecting an appropriate value of m. Furthermore, viewed in terms of entropy, the power exponent r facilitates the introduction of entropy-based constraints into fuzzy clustering algorithms. As an application, DI-FCM is integrated with a fuzzy subspace clustering (FSC) algorithm to obtain a new algorithm for high-dimensional data, called double indices-induced fuzzy subspace clustering (DI-FSC). DI-FSC replaces the commonly used Euclidean distance with a feature-weighted distance, so that its objective function contains two fuzzy matrices. A convergence proof of DI-FSC is established by applying Zangwill’s convergence theorem. Experiments on both artificial and real data show the effectiveness of the proposed algorithm.
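For orientation, the following is a minimal sketch of the standard FCM baseline that DI-FCM generalizes. It is not DI-FCM itself: the abstract does not state the DI-FCM update equations, so the power exponent r is deliberately left out, and the usual constraint that each point's memberships sum to 1 is used instead. The function name `fcm` and its parameters (`c`, `m`, `iters`, `tol`, `seed`) are illustrative assumptions, not names from the paper.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; a baseline sketch, not the paper's DI-FCM.

    points: list of equal-length numeric tuples; c: number of clusters;
    m > 1: fuzziness index. Returns (u, centers) with u[i][j] the
    membership of point j in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalized so that sum_i u[i][j] = 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        s = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= s

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ij^m.
        centers = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[j] * points[j][k] for j in range(n)) / total
                 for k in range(dim)]
            )

        # Membership update: u_ij proportional to (1 / d_ij^2)^(1 / (m - 1)).
        shift = 0.0
        for j in range(n):
            d2 = [
                max(sum((points[j][k] - centers[i][k]) ** 2
                        for k in range(dim)), 1e-12)  # guard against d = 0
                for i in range(c)
            ]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            s = sum(inv)
            for i in range(c):
                new = inv[i] / s
                shift = max(shift, abs(new - u[i][j]))
                u[i][j] = new

        # Stop once no membership moved by more than tol.
        if shift < tol:
            break
    return u, centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    memberships, cluster_centers = fcm(data, c=2)
    print(cluster_centers)
```

In this baseline the fuzziness index m is the only exponent; according to the abstract, DI-FCM adds a second exponent r through the constraints, which is what changes the membership update relative to the one shown here.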
Acknowledgments
This work is supported by the Hong Kong Polytechnic University Grants (Grant Nos. Z-08R and G-U296), the National Science Foundation of China (Grants 61170122 and 61272210), the Natural Science Foundation of Jiangsu Province (Grants BK2011003 and BK2011417), the Jiangsu 333 Expert Engineering Grant (BRA2011142), the Fundamental Research Funds for the Central Universities (Grant JUSRP21128), and the Opening Project of Jiangsu Engineering R&D Center for Information (Grant SR-2011-01).
Cite this article
Wang, J., Chung, Fl., Wang, S. et al. Double indices-induced FCM clustering and its integration with fuzzy subspace clustering. Pattern Anal Applic 17, 549–566 (2014). https://doi.org/10.1007/s10044-013-0341-y
DOI: https://doi.org/10.1007/s10044-013-0341-y