DOI: 10.1145/3336191.3371839
research-article
Open access

The Power of Pivoting for Exact Clique Counting

Published: 22 January 2020

    Abstract

    Clique counting is a fundamental task in network analysis, and even the simplest setting of 3-cliques (triangles) has been the center of much recent research. Counting k-cliques for larger k is algorithmically challenging because the search space of large cliques blows up exponentially. Yet a number of recent applications (especially community detection and clustering) use larger clique counts. Moreover, one often desires local counts: the number of k-cliques per vertex or per edge. Our main contribution is Pivoter, an algorithm that exactly counts the number of k-cliques for all values of k. It is surprisingly effective in practice and obtains clique counts for graphs that were beyond the reach of previous work. For example, Pivoter gets all clique counts in a social network with 100M edges within two hours on a commodity machine, whereas previous parallel algorithms do not terminate in days. Pivoter can also feasibly get local per-vertex and per-edge k-clique counts (for all k) for many public data sets with tens of millions of edges. To the best of our knowledge, this is the first algorithm that achieves such results. The main insight is the construction of a Succinct Clique Tree (SCT) that stores a compressed, unique representation of all cliques in an input graph. It is built using a technique called pivoting, a classic approach by Bron-Kerbosch to reduce the recursion tree of backtracking algorithms for maximal cliques. Remarkably, the SCT can be built without actually enumerating all cliques, and it provides a succinct data structure from which exact clique statistics (k-clique counts, local counts) can be read off efficiently.
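The pivoting idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `count_cliques`, the adjacency-dictionary input, and the decision to start the recursion from the whole vertex set (rather than from any vertex ordering) are assumptions made for brevity. Each recursive call either adds the pivot as an optional vertex or holds a non-neighbor of the pivot as a mandatory vertex. A leaf with `held` mandatory and `pivots` optional vertices stands for every clique formed by the held vertices plus any subset of the pivots, so it contributes C(pivots, i) cliques of size held + i without listing them.

```python
from math import comb


def count_cliques(adj):
    """Count cliques of every size in an undirected simple graph.

    adj maps each vertex to the set of its neighbours (hypothetical input
    format chosen for this sketch). Returns a list `counts` where counts[k]
    is the number of k-cliques; counts[0] is 1 for the empty clique.
    """
    counts = [0] * (len(adj) + 1)

    def recurse(cand, held, pivots):
        if not cand:
            # Leaf: held vertices are mandatory, pivot vertices are optional.
            for i in range(pivots + 1):
                counts[held + i] += comb(pivots, i)
            return
        # Pick the candidate with the most neighbours inside cand as pivot.
        pivot = max(cand, key=lambda u: len(adj[u] & cand))
        # Branch 1: cliques lying entirely inside the pivot's neighbourhood,
        # with or without the pivot itself.
        recurse(adj[pivot] & cand, held, pivots + 1)
        # Branch 2: cliques containing at least one non-neighbour of the
        # pivot; hold the first such vertex and exclude earlier ones so
        # that every clique is generated exactly once.
        excluded = set()
        for v in cand - adj[pivot] - {pivot}:
            recurse((adj[v] & cand) - excluded, held + 1, pivots)
            excluded.add(v)

    recurse(set(adj), 0, 0)
    return counts


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(count_cliques(example))
```

On a complete graph the recursion never takes the second branch, so a single root-to-leaf path with all vertices as pivots accounts for all 2^n cliques. That is the sense in which the tree is succinct. The sketch uses plain recursion and Python sets, so it is suitable only for small graphs.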


    Published In

    WSDM '20: Proceedings of the 13th International Conference on Web Search and Data Mining
    January 2020
    950 pages
    ISBN: 9781450368223
    DOI: 10.1145/3336191
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Badges

    • Best Paper

    Author Tags

    1. clique counting
    2. local clique counting
    3. social network analysis

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%
