Use of Local Group Information to Identify Communities in Networks

Published: 01 April 2015

Abstract

The recent interest in networks has inspired a broad range of work on algorithms and techniques to characterize, identify, and extract communities from networks. Such efforts are complicated by a lack of consensus on what a “community” truly is, and these disagreements have led to a wide variety of mathematical formulations for describing communities. Often, these mathematical formulations, such as modularity and conductance, have been founded on the general principle that communities, like a G(n, p) graph, are “round,” with connections throughout the entire community, and so algorithms were developed to optimize such mathematical measures. More recently, a variety of algorithms have been developed that, rather than expecting connectivity through the entire community, seek out very small groups of well-connected nodes and then connect these groups into larger communities. In this article, we examine seven real networks, each containing external annotation that allows us to identify “annotated communities.” A study of these annotated communities gives insight into why the second category of community detection algorithms may be more successful than the first category. We then present a flexible algorithm template that is based on the idea of joining together small sets of nodes. In this template, we first identify very small, tightly connected “subcommunities” of nodes, each corresponding to a single node’s “perception” of the network around it. We then create a new network in which each node represents such a subcommunity, and then identify communities in this new network. Because each node can appear in multiple subcommunities, this method allows us to detect overlapping communities. When evaluated on real data, our template outperforms many other state-of-the-art algorithms.
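
The three steps of the template (find small subcommunities around each node, build a new network whose nodes are those subcommunities, then find communities in the new network) can be illustrated with a minimal Python sketch. The abstract does not specify how each step is carried out, so every concrete choice here is an assumption made for illustration only: a node's "perception" is taken to be the connected pieces of its neighbourhood with the node added back, two subcommunities are linked when their Jaccard similarity reaches a hypothetical `threshold`, and communities in the new network are simply its connected components. The function names are likewise hypothetical and are not the authors' implementation.

```python
from itertools import combinations


def components(nodes, adj):
    """Connected components of the subgraph of `adj` induced by `nodes`."""
    nodes = set(nodes)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in adj[x] & nodes:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        result.append(comp)
    return result


def local_subcommunities(adj):
    """Step 1 (assumed rule): split each node's neighbourhood into its
    connected pieces and add the node back to each piece."""
    subs = set()
    for v, nbrs in adj.items():
        for comp in components(nbrs, adj):
            subs.add(frozenset(comp | {v}))
    return sorted(subs, key=sorted)  # deterministic order


def detect_communities(adj, threshold=0.5):
    subs = local_subcommunities(adj)
    # Step 2: new network in which node i stands for subcommunity subs[i].
    # Assumed linking rule: Jaccard similarity of at least `threshold`.
    meta = {i: set() for i in range(len(subs))}
    for i, j in combinations(range(len(subs)), 2):
        a, b = subs[i], subs[j]
        if len(a & b) / len(a | b) >= threshold:
            meta[i].add(j)
            meta[j].add(i)
    # Step 3 (assumed): communities of the new network are its connected
    # components, mapped back to the original nodes they contain.
    return [set().union(*(subs[i] for i in comp))
            for comp in components(meta, meta)]


# Toy graph: two triangles, {0, 1, 2} and {2, 3, 4}, sharing node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
communities = detect_communities(adj)
```

On this toy graph, node 2 belongs to both recovered communities, which shows how overlap arises from a node appearing in more than one subcommunity. Lowering `threshold` makes the second-stage network denser and merges more subcommunities into larger communities.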


    Information

    Published In

    ACM Transactions on Knowledge Discovery from Data, Volume 9, Issue 3
    TKDD Special Issue (SIGKDD'13)
    April 2015
    313 pages
    ISSN: 1556-4681
    EISSN: 1556-472X
    DOI: 10.1145/2737800
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 April 2015
    Accepted: 01 September 2014
    Revised: 01 June 2014
    Received: 01 October 2012
    Published in TKDD Volume 9, Issue 3

    Author Tags

    1. Social networks
    2. communities

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • AFOSR
