DOI: 10.1109/ICDE.2011.5767925
Article

Consensus spectral clustering in near-linear time

Published: 11 April 2011

Abstract

This paper addresses the scalability of spectral analysis, which is widely used in data management applications. Spectral analysis techniques offer powerful clustering capability but suffer from high computational complexity. In most previous research, the computational bottleneck of spectral analysis is the construction of the pairwise similarity matrix among objects, which costs at least O(n^2), where n is the number of data points. In this paper, we propose a novel estimator of the similarity matrix based on a K-means accumulative consensus matrix, which is intrinsically sparse. The computational cost of the accumulative consensus matrix is O(n log n). We further develop a Non-negative Matrix Factorization approach to derive the clustering assignment. The overall complexity of our approach remains O(n log n). To validate our method, we (1) theoretically show the locality-preserving and convergence properties of the similarity estimator, (2) evaluate it on a large number of real-world datasets and compare the results with other state-of-the-art spectral analysis methods, and (3) apply it to large-scale data clustering problems. The results show that our approach uses much less computational time than other state-of-the-art clustering methods while providing comparable clustering quality. We also successfully apply our approach to a dataset of 5 million data points on a single machine in reasonable time. Our techniques open a new direction for high-quality large-scale data analysis.
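
To make the idea of a K-means consensus similarity concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not specify how the accumulative consensus matrix is built, so this version simply runs a toy Lloyd's k-means several times and estimates the similarity of two points as the fraction of runs in which they share a cluster. The names `kmeans_labels`, `consensus_runs`, and `similarity` are hypothetical, and the sketch makes no attempt to reach the O(n log n) cost stated above.

```python
import random


def kmeans_labels(points, k, rng, iters=10):
    """Toy Lloyd's k-means; returns one cluster label per point."""
    # Initial centers: k input points sampled without replacement.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_runs(points, k, runs, seed=0):
    """Run k-means `runs` times with different random initializations."""
    rng = random.Random(seed)
    return [kmeans_labels(points, k, rng) for _ in range(runs)]


def similarity(all_labels, i, j):
    """Estimated similarity: fraction of runs where i and j share a cluster."""
    return sum(lab[i] == lab[j] for lab in all_labels) / len(all_labels)


# Assumed toy data: two well-separated groups of three 2-D points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
all_labels = consensus_runs(points, k=2, runs=5)
```

Storing only the per-run label vectors takes memory proportional to n times the number of runs, and any single similarity entry can be computed on demand with `similarity(all_labels, i, j)` without materializing an n-by-n matrix.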

      Published In

      Guide Proceedings
      ICDE '11: Proceedings of the 2011 IEEE 27th International Conference on Data Engineering
      April 2011
      1457 pages
      ISBN: 9781424489596

      Publisher

      IEEE Computer Society

      United States

      Cited By

      • (2021) A Layout-Based Classification Method for Visualizing Time-Varying Graphs. ACM Transactions on Knowledge Discovery from Data 15:4 (1-24). DOI: 10.1145/3441301. Online publication date: 26-Mar-2021
      • (2019) Adversarial graph embedding for ensemble clustering. Proceedings of the 28th International Joint Conference on Artificial Intelligence (3562-3568). DOI: 10.5555/3367471.3367536. Online publication date: 10-Aug-2019
      • (2017) Fast kernel spectral clustering. Neurocomputing 268:C (27-33). DOI: 10.1016/j.neucom.2016.12.085. Online publication date: 13-Dec-2017
      • (2016) Infinite Ensemble for Image Clustering. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (1745-1754). DOI: 10.1145/2939672.2939813. Online publication date: 13-Aug-2016
      • (2014) A Framework for Hierarchical Ensemble Clustering. ACM Transactions on Knowledge Discovery from Data 9:2 (1-23). DOI: 10.1145/2611380. Online publication date: 23-Sep-2014
