Cluster ensembles - a knowledge reuse framework for combining multiple partitions

Authors:

Alexander Strehl,

Joydeep Ghosh

The Journal of Machine Learning Research, Volume 3

Pages 583 - 617

https://doi.org/10.1162/153244303321897735

Published: 01 March 2003

Abstract

This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. We first identify several application scenarios for the resultant 'knowledge reuse' framework that we call cluster ensembles. The cluster ensemble problem is then formalized as a combinatorial optimization problem in terms of shared mutual information. In addition to a direct maximization approach, we propose three effective and efficient techniques for obtaining high-quality combiners (consensus functions). The first combiner induces a similarity measure from the partitionings and then reclusters the objects. The second combiner is based on hypergraph partitioning. The third one collapses groups of clusters into meta-clusters which then compete for each object to determine the combined clustering. Due to the low computational costs of our techniques, it is quite feasible to use a supra-consensus function that evaluates all three approaches against the objective function and picks the best solution for a given situation. We evaluate the effectiveness of cluster ensembles in three qualitatively different application scenarios: (i) where the original clusters were formed based on non-identical sets of features, (ii) where the original clustering algorithms worked on non-identical sets of objects, and (iii) where a common data-set is used and the main purpose of combining multiple clusterings is to improve the quality and robustness of the solution. Promising results are obtained in all three situations for synthetic as well as real data-sets.

Published In

The Journal of Machine Learning Research Volume 3, Issue

3/1/2003

1437 pages

ISSN: 1532-4435

EISSN: 1533-7928

Publisher

JMLR.org
