Local Spectral Clustering for Overlapping Community Detection

Published: 10 January 2018

    Abstract

    Large graphs arise in a number of contexts, and understanding their structure and extracting information from them is an important research area. Early algorithms for mining communities focused on global graph structure and often ran in time proportional to the size of the entire graph. As we explore networks with millions of vertices and look for communities with only hundreds of members, it becomes important to shift attention from macroscopic to microscopic structure in large networks. A growing body of work adopts local expansion methods to identify communities from a few exemplary seed members.
    In this article, we propose a novel approach for finding overlapping communities, called Lemon (Local Expansion via Minimum One Norm). Given a few known seeds, the algorithm finds the community by performing a local spectral diffusion. The core idea of Lemon is to use short random walks to approximate an invariant subspace near a seed set, which we refer to as the local spectra. The local spectra can be viewed as a low-dimensional embedding that captures the nodes’ closeness in the local network structure. We show that Lemon’s performance in detecting communities is competitive with state-of-the-art methods. Moreover, its running time scales with the size of the community rather than that of the entire graph. The algorithm is easy to implement and highly parallelizable. We further provide a theoretical analysis of the local spectral properties, bounding the measure of tightness of the extracted community using the eigenvalues of the graph Laplacian.
    We thoroughly evaluate our approach on both synthetic and real-world datasets across different domains, and analyze the empirical variations that arise when the method is applied to inherently different networks in practice. In addition, we provide heuristics on how the quality and quantity of the seed set affect performance.
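
    The abstract names the ingredients of the method (short random walks from a seed set, a low-dimensional subspace, and a minimum one-norm criterion) without spelling out the procedure. The Python sketch below is one plausible reading of that idea, not the authors' implementation. The function names `local_spectral_scores` and `two_cliques`, the parameters `walk_steps` and `subspace_dim`, the use of a self-loop normalized adjacency matrix, and the exact linear program are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def local_spectral_scores(adj, seeds, walk_steps=3, subspace_dim=3):
    """Score nodes by closeness to `seeds` (illustrative sketch only).

    adj          : dense symmetric 0/1 adjacency matrix, shape (n, n)
    seeds        : indices of known community members
    walk_steps   : number of short random-walk steps (assumed parameter)
    subspace_dim : dimension of the approximated subspace (assumed parameter)
    """
    n = adj.shape[0]
    seeds = np.asarray(seeds)

    # Assumption: symmetric normalized adjacency with self-loops added.
    a = adj + np.eye(n)
    deg = a.sum(axis=1)
    norm_adj = a / np.sqrt(np.outer(deg, deg))

    # Start from the uniform distribution on the seeds and collect a few
    # consecutive walk vectors, then orthonormalize them.
    p = np.zeros(n)
    p[seeds] = 1.0 / len(seeds)
    walk_vectors = [p]
    for _ in range(subspace_dim - 1):
        walk_vectors.append(norm_adj @ walk_vectors[-1])
    basis, _ = np.linalg.qr(np.column_stack(walk_vectors))

    # A few more walk steps, re-orthonormalizing after each one.
    for _ in range(walk_steps):
        basis, _ = np.linalg.qr(norm_adj @ basis)

    # Minimum one-norm vector y = basis @ x with y >= 0 and y[seeds] >= 1.
    # Because y >= 0, its one-norm is sum(y) = (column sums of basis) @ x.
    lower = np.zeros(n)
    lower[seeds] = 1.0
    result = linprog(
        c=basis.sum(axis=0),
        A_ub=-basis,            # -basis @ x <= -lower  <=>  basis @ x >= lower
        b_ub=-lower,
        bounds=[(None, None)] * subspace_dim,  # x itself is unconstrained
        method="highs",
    )
    if not result.success:
        raise RuntimeError(result.message)
    return basis @ result.x


def two_cliques(size=5):
    """Toy graph: two cliques of `size` nodes joined by a single edge."""
    n = 2 * size
    adj = np.zeros((n, n))
    adj[:size, :size] = 1.0
    adj[size:, size:] = 1.0
    np.fill_diagonal(adj, 0.0)
    adj[size - 1, size] = adj[size, size - 1] = 1.0
    return adj


adj = two_cliques()
scores = local_spectral_scores(adj, seeds=[0, 1])
ranking = np.argsort(-scores)  # nodes ordered from highest to lowest score
print(ranking)
```

    In this sketch, nodes with the largest scores are the candidates for membership in the seeds' community. The sketch works on a dense matrix for the whole graph, so it does not reproduce the property that running time scales with community size; a local variant would presumably restrict the computation to a neighborhood of the seeds and use sparse matrices.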

    Published In

    ACM Transactions on Knowledge Discovery from Data, Volume 12, Issue 2
    Survey Papers and Regular Papers
    April 2018
    376 pages
    ISSN: 1556-4681
    EISSN: 1556-472X
    DOI: 10.1145/3178544
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 10 January 2018
    Accepted: 01 June 2017
    Revised: 01 February 2017
    Received: 01 September 2015
    Published in TKDD Volume 12, Issue 2

    Author Tags

    1. Community detection
    2. graph diffusion
    3. local spectral clustering
    4. random walk
    5. seed set expansion

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • US Army Research Office
    • National Science Foundation of China

    Cited By

    • (2024) Local Community Detection in Multiple Private Networks. ACM Transactions on Knowledge Discovery from Data 18:5 (1-21). DOI: 10.1145/3644078. Online publication date: 10-Feb-2024
    • (2024) LSADEN: Local Spatial-aware Community Detection in Evolving Geo-social Networks. IEEE Transactions on Knowledge and Data Engineering (1-16). DOI: 10.1109/TKDE.2023.3348975. Online publication date: 2024
    • (2024) Overlapping Community Detection Based on Weak Equiconcept. IEEE Access 12 (42147-42162). DOI: 10.1109/ACCESS.2024.3374882. Online publication date: 2024
    • (2024) Community Detection in Multiplex Networks Based on Orthogonal Nonnegative Matrix Tri-Factorization. IEEE Access 12 (6423-6436). DOI: 10.1109/ACCESS.2024.3351709. Online publication date: 2024
    • (2024) A comprehensive review of community detection in graphs. Neurocomputing 600 (128169). DOI: 10.1016/j.neucom.2024.128169. Online publication date: Oct-2024
    • (2024) WSNMF: Weighted Symmetric Nonnegative Matrix Factorization for attributed graph clustering. Neurocomputing 566 (127041). DOI: 10.1016/j.neucom.2023.127041. Online publication date: Jan-2024
    • (2024) Integrating heterogeneous structures and community semantics for unsupervised community detection in heterogeneous networks. Expert Systems with Applications 238 (121821). DOI: 10.1016/j.eswa.2023.121821. Online publication date: Mar-2024
    • (2024) Community detection algorithm for social network based on node intimacy and graph embedding model. Engineering Applications of Artificial Intelligence 132 (107947). DOI: 10.1016/j.engappai.2024.107947. Online publication date: Jun-2024
    • (2023) A local community detection algorithm based on potential community exploration. Frontiers in Physics 11. DOI: 10.3389/fphy.2023.1114296. Online publication date: 6-Feb-2023
    • (2023) Multiresolution Local Spectral Attributed Community Search. ACM Transactions on the Web 18:1 (1-28). DOI: 10.1145/3624580. Online publication date: 19-Sep-2023
