A novel hierarchical-clustering-combination scheme based on fuzzy-similarity relations

Published: 01 February 2010

Abstract

Clustering-combination methods have received considerable attention in recent years, and many ensemble-based clustering methods have been introduced. However, clustering-combination techniques have been limited to "flat" clustering combination, and the combination of hierarchical clusterings has yet to be addressed. In this paper, we formalize the concept of hierarchical-clustering combination and introduce an algorithmic framework in which multiple hierarchical clusterings can be easily combined. In this framework, the similarity-based description matrices of the input hierarchical clusterings are aggregated into a transitive consensus matrix from which the final hierarchy is formed. Empirical evaluation on popular publicly available datasets confirms the superiority of the combined hierarchical clustering produced by our method over standard (single) hierarchical-clustering methods.
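The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the descriptor or the aggregator, so the sketch assumes that each input hierarchy is described by its cophenetic distance matrix, that similarities are obtained by rescaling to [0, 1], and that the matrices are combined by an element-wise mean. The function names and the two toy dendrograms are hypothetical. The min-transitive (max-min) closure is computed with a Floyd-Warshall-style update, and alpha-cuts of the resulting fuzzy-equivalence relation give the levels of the consensus hierarchy.

```python
def to_similarity(dist):
    """Rescale a cophenetic distance matrix to similarities in [0, 1] (assumed descriptor)."""
    top = max(max(row) for row in dist) or 1.0
    return [[1.0 - d / top for d in row] for row in dist]


def aggregate_mean(mats):
    """Element-wise mean of several similarity matrices (assumed aggregator)."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(n)]
            for i in range(n)]


def min_transitive_closure(sim):
    """Max-min closure: c[i][j] becomes the best 'weakest link' over all paths i -> j."""
    n = len(sim)
    c = [row[:] for row in sim]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = min(c[i][k], c[k][j])
                if via_k > c[i][j]:
                    c[i][j] = via_k
    return c


def alpha_cut(closure, alpha):
    """Partition induced by thresholding a fuzzy-equivalence relation at alpha."""
    seen, clusters = set(), []
    for i in range(len(closure)):
        if i not in seen:
            group = [j for j in range(len(closure)) if closure[i][j] >= alpha]
            seen.update(group)
            clusters.append(group)
    return clusters


# Two hypothetical dendrograms over four items, given as cophenetic distances.
d1 = [[0, 1, 4, 4], [1, 0, 4, 4], [4, 4, 0, 2], [4, 4, 2, 0]]
d2 = [[0, 2, 2, 4], [2, 0, 1, 4], [2, 1, 0, 4], [4, 4, 4, 0]]

consensus = aggregate_mean([to_similarity(d1), to_similarity(d2)])
closure = min_transitive_closure(consensus)

# Lowering alpha walks up the consensus hierarchy.
for alpha in (0.5, 0.3, 0.25):
    print(alpha, alpha_cut(closure, alpha))
```

Because the closure is min-transitive, every alpha-cut is a proper partition and the partitions are nested as alpha decreases, which is what allows a single dendrogram to be read off the consensus matrix.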

Published In

IEEE Transactions on Fuzzy Systems, Volume 18, Issue 1
February 2010
233 pages

Publisher

IEEE Press

Publication History

Published: 01 February 2010
Accepted: 05 September 2009
Revised: 08 March 2009
Received: 13 October 2008

Author Tags

  1. clustering combination
  2. dendrogram descriptor
  3. fuzzy-equivalence relation
  4. hierarchical clustering
  5. min-transitive closure
  6. ultrametric property
