Cluster ensembles --- a knowledge reuse framework for combining multiple partitions

Published: 01 March 2003

Abstract

This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. We first identify several application scenarios for the resultant 'knowledge reuse' framework that we call cluster ensembles. The cluster ensemble problem is then formalized as a combinatorial optimization problem in terms of shared mutual information. In addition to a direct maximization approach, we propose three effective and efficient techniques for obtaining high-quality combiners (consensus functions). The first combiner induces a similarity measure from the partitionings and then reclusters the objects. The second combiner is based on hypergraph partitioning. The third one collapses groups of clusters into meta-clusters which then compete for each object to determine the combined clustering. Due to the low computational costs of our techniques, it is quite feasible to use a supra-consensus function that evaluates all three approaches against the objective function and picks the best solution for a given situation. We evaluate the effectiveness of cluster ensembles in three qualitatively different application scenarios: (i) where the original clusters were formed based on non-identical sets of features, (ii) where the original clustering algorithms worked on non-identical sets of objects, and (iii) where a common data-set is used and the main purpose of combining multiple clusterings is to improve the quality and robustness of the solution. Promising results are obtained in all three situations for synthetic as well as real data-sets.
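The objective and the first combiner described above can be illustrated with a short sketch. It is not the authors' implementation. It assumes that "shared mutual information" is scored as the average normalized mutual information (NMI) between a candidate labeling and each input partitioning, with NMI normalized by the geometric mean of the two entropies, and that the induced similarity between two objects is the fraction of partitionings that place them in the same cluster. The function names (`nmi`, `average_nmi`, `co_association`, `pick_best`) are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (in nats) of a cluster labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two labelings of the same objects."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (n_xy / n) * log(n * n_xy / (count_a[x] * count_b[y]))
        for (x, y), n_xy in count_ab.items()
    )


def nmi(a, b):
    """NMI normalized by sqrt(H(a) * H(b)); this normalization is an assumption."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        return 0.0  # a single-cluster labeling shares no information
    return mutual_information(a, b) / sqrt(h_a * h_b)


def average_nmi(candidate, partitionings):
    """Objective sketch: mean NMI between a candidate and every input partitioning."""
    return sum(nmi(candidate, p) for p in partitionings) / len(partitionings)


def co_association(partitionings):
    """Similarity sketch: share of partitionings that co-cluster objects i and j."""
    n, r = len(partitionings[0]), len(partitionings)
    return [
        [sum(p[i] == p[j] for p in partitionings) / r for j in range(n)]
        for i in range(n)
    ]


def pick_best(candidates, partitionings):
    """Supra-consensus sketch: keep the candidate with the highest objective."""
    return max(candidates, key=lambda c: average_nmi(c, partitionings))
```

Because NMI depends only on which objects are grouped together, relabeling clusters (for example, swapping labels 0 and 1) does not change the score. The similarity matrix returned by `co_association` could then be passed to any similarity-based clustering algorithm to recluster the objects.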


Published In

The Journal of Machine Learning Research, Volume 3
3/1/2003
1437 pages
ISSN: 1532-4435
EISSN: 1533-7928

Publisher

JMLR.org

Publication History

Published in JMLR Volume 3

Author Tags

  1. cluster analysis
  2. clustering
  3. consensus functions
  4. ensemble
  5. knowledge reuse
  6. multi-learner systems
  7. mutual information
  8. partitioning
  9. unsupervised learning

Qualifiers

  • Article

Cited By

  • (2025) Clustering on Attributed Graphs: From Single-view to Multi-view. ACM Computing Surveys 57:7 (1-36). DOI: 10.1145/3714407. Online publication date: 10-Feb-2025
  • (2025) Fair Clustering Ensemble With Equal Cluster Capacity. IEEE Transactions on Pattern Analysis and Machine Intelligence 47:3 (1729-1746). DOI: 10.1109/TPAMI.2024.3507857. Online publication date: 1-Mar-2025
  • (2025) Clustering Ensemble Based on Fuzzy Matrix Self-Enhancement. IEEE Transactions on Knowledge and Data Engineering 37:1 (148-161). DOI: 10.1109/TKDE.2024.3489553. Online publication date: 1-Jan-2025
  • (2025) CCEGAN. Information Sciences: an International Journal 692:C. DOI: 10.1016/j.ins.2024.121663. Online publication date: 1-Feb-2025
  • (2025) Multilinear algebra methods for higher-dimensional graphs. Applied Numerical Mathematics 208:PA (390-407). DOI: 10.1016/j.apnum.2023.11.009. Online publication date: 1-Feb-2025
  • (2025) Learning Dynamic Batch-Graph Representation for Deep Representation Learning. International Journal of Computer Vision 133:1 (84-105). DOI: 10.1007/s11263-024-02175-8. Online publication date: 1-Jan-2025
  • (2025) Grid Jigsaw Representation with CLIP: a new perspective on image clustering. Multimedia Systems 31:2. DOI: 10.1007/s00530-025-01703-x. Online publication date: 1-Apr-2025
  • (2024) Consensus Clustering for Simulation-Based Reputation Measurement for Online Services. International Journal of Gaming and Computer-Mediated Simulations 16:1 (1-18). DOI: 10.4018/IJGCMS.361997. Online publication date: 16-Aug-2024
  • (2024) Clustering students’ text form feedback data: comparison of eight vector space models. Proceedings of the 2024 7th International Conference on Big Data and Education (57-64). DOI: 10.1145/3704289.3704294. Online publication date: 24-Sep-2024
  • (2024) Fast and Scalable Incomplete Multi-View Clustering with Duality Optimal Graph Filtering. Proceedings of the 32nd ACM International Conference on Multimedia (8893-8902). DOI: 10.1145/3664647.3681346. Online publication date: 28-Oct-2024
