Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3485447.3512167acmconferencesArticle/Chapter ViewAbstractPublication PageswebconfConference Proceedingsconference-collections
research-article

Lightning Fast and Space Efficient k-clique Counting

Published: 25 April 2022 Publication History
  • Get Citation Alerts
  • Abstract

    K-clique counting is a fundamental problem in network analysis which has attracted much attention in recent years. Computing the count of k-cliques in a graph for a large k (e.g., k = 8) is often intractable as the number of k-cliques increases exponentially w.r.t. (with respect to) k. Existing exact k-clique counting algorithms are often hard to handle large dense graphs, while sampling-based solutions either require a huge number of samples or consume very high storage space to achieve a satisfactory accuracy. To overcome these limitations, we propose a new framework to estimate the number of k-cliques which integrates both the exact k-clique counting technique and two novel color-based sampling techniques. The key insight of our framework is that we only apply the exact algorithm to compute the k-clique counts in the sparse regions of a graph, and use the proposed sampling-based techniques to estimate the number of k-cliques in the dense regions of the graph. Specifically, we develop two novel dynamic programming based k-color set sampling techniques to efficiently estimate the k-clique counts, where a k-color set contains k nodes with k different colors. Since a k-color set is often a good approximation of a k-clique in the dense regions of a graph, our sampling-based solutions are extremely efficient and accurate. Moreover, the proposed sampling techniques are space efficient which use near-linear space w.r.t. graph size. We conduct extensive experiments to evaluate our algorithms using 8 real-life graphs. The results show that our best algorithm is at least one order of magnitude faster than the state-of-the-art sampling-based solutions (with the same relative error 0.1%) and can be up to three orders of magnitude faster than the state-of-the-art exact algorithm on large graphs.

    References

    [1]
    Mohammad Almasri, Izzat El Hajj, Rakesh Nagi, Jinjun Xiong, and Wen-Mei W. Hwu. 2021. K-Clique Counting on GPUs. CoRR abs/2104.13209(2021).
    [2]
    Noga Alon, Raphael Yuster, and Uri Zwick. 1994. Color-coding: a new method for finding simple paths, cycles and other small subgraphs within large graphs. In STOC.
    [3]
    Balabhaskar Balasundaram and Sergiy Butenko. 2006. Graph Domination, Coloring and Cliques in Telecommunications. In Handbook of Optimization in Telecommunications. Springer, 865–890.
    [4]
    Vladimir Batagelj and Matjaz Zaversnik. 2003. An O(m) Algorithm for Cores Decomposition of Networks. CoRR cs.DS/0310049(2003).
    [5]
    Luca Becchetti, Paolo Boldi, Carlos Castillo, and Aristides Gionis. 2008. Efficient semi-streaming algorithms for local triangle counting in massive graphs. In KDD.
    [6]
    Austin R. Benson, David F. Gleich, and Jure Leskovec. 2016. Higher-order organization of complex networks. Science 353, 6295 (2016).
    [7]
    J. W. Berry, B. Hendrickson, R. A. Laviolette, and C. A. Phillips. 2011. Tolerating the Community Detection Resolution Limit with Edge Weighting. Physical Review E Statistical Nonlinear & Soft Matter Physics 83, 5(2011), 056119.
    [8]
    Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, and Alessandro Panconesi. 2018. Motif Counting Beyond Five Nodes. ACM Trans. Knowl. Discov. Data 12, 4 (2018), 48:1–48:25.
    [9]
    Marco Bressan, Stefano Leucci, and Alessandro Panconesi. 2019. Motivo: Fast Motif Counting via Succinct Color Coding and Adaptive Sampling. Proc. VLDB Endow. 12, 11 (2019), 1651–1663.
    [10]
    S Ronald Burt. 2004. Structural holes and good ideas. Amer. J. Sociology 110, 2 (2004), 349–399.
    [11]
    Keren Censor-Hillel, Yi-Jun Chang, François Le Gall, and Dean Leitersdorf. 2021. Tight Distributed Listing of Cliques. In SODA.
    [12]
    Lijun Chang and Lu Qin. 2019. Cohesive Subgraph Computation Over Large Sparse Graphs. In ICDE.
    [13]
    Norishige Chiba and Takao Nishizeki. 1985. Arboricity and Subgraph Listing Algorithms. SIAM J. Comput. 14, 1 (1985), 210–223.
    [14]
    Shumo Chu and James Cheng. 2011. Triangle listing in massive networks and its applications. In KDD.
    [15]
    Maximilien Danisch, Oana Balalau, and Mauro Sozio. 2018. Listing k-cliques in Sparse Real-World Graphs. In WWW.
    [16]
    Talya Eden, Dana Ron, and C. Seshadhri. 2018. On approximating the number of k-cliques in sublinear time. In STOC.
    [17]
    Talya Eden, Dana Ron, and C. Seshadhri. 2020. Faster sublinear approximation of the number of k-cliques in low-arboricity graphs. In SODA.
    [18]
    Katherine Faust. 2010. A puzzle concerning triads in social networks: Graph constraints and the triad census. Soc. Networks 32, 3 (2010), 221–233.
    [19]
    Irene Finocchi, Marco Finocchi, and Emanuele G. Fusco. 2015. Clique Counting in MapReduce: Algorithms and Experiments. ACM J. Exp. Algorithmics 20 (2015), 1.7:1–1.7:20.
    [20]
    Lukas Gianinazzi, Maciej Besta, Yannick Schaffner, and Torsten Hoefler. 2021. Parallel Algorithms for Finding Large Cliques in Sparse Graphs. In SPAA.
    [21]
    William Hasenplaugh, Tim Kaler, Tao B. Schardl, and Charles E. Leiserson. 2014. Ordering heuristics for parallel graph coloring. In SPAA.
    [22]
    Lin Hu, Lei Zou, and Yu Liu. 2021. Accelerating Triangle Counting on GPU. In SIGMOD.
    [23]
    Shweta Jain and C. Seshadhri. 2017. A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem. In WWW.
    [24]
    Shweta Jain and C. Seshadhri. 2020. The Power of Pivoting for Exact Clique Counting. In WSDM.
    [25]
    Shweta Jain and C. Seshadhri. 2020. Provably and Efficiently Approximating Near-cliques using the Turán Shadow: PEANUTS. In WWW ’20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020. 1966–1976.
    [26]
    Madhav Jha, C. Seshadhri, and Ali Pinar. 2015. Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts. In WWW.
    [27]
    Matthieu Latapy. 2008. Main-memory triangle computations for very large (sparse (power-law)) graphs. Theor. Comput. Sci. 407, 1-3 (2008), 458–473.
    [28]
    Ronghua Li, Sen Gao, Lu Qin, Guoren Wang, Weihua Yang, and Jeffrey Xu Yu. 2020. Ordering Heuristics for k-clique Listing. Proc. VLDB Endow. 13, 11 (2020), 2536–2548.
    [29]
    Kazuhisa Makino and Takeaki Uno. 2004. New Algorithms for Enumerating All Maximal Cliques. In 9th Scandinavian Workshop on Algorithm Theory.
    [30]
    David W. Matula and Leland L. Beck. 1983. Smallest-Last Ordering and clustering and Graph Coloring Algorithms. J. ACM 30, 3 (1983), 417–427.
    [31]
    R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D Chklovskii, and U. Alon. 2010. Network Motifs: Simple Building Blocks of Complex Networks. Science 298, 5594 (2010), 763–764.
    [32]
    Mark Ortmann and Ulrik Brandes. 2014. Triangle Listing Algorithms: Back from the Diversion. In ALENEX.
    [33]
    Noujan Pashanasangi and C. Seshadhri. 2020. Efficiently Counting Vertex Orbits of All 5-vertex Subgraphs, by EVOKE. In WSDM.
    [34]
    Ali Pinar, C. Seshadhri, and Vaidyanathan Vishal. 2017. ESCAPE: Efficiently Counting All 5-Vertex Subgraphs. In WWW.
    [35]
    Natasa Przulj, Derek G. Corneil, and Igor Jurisica. 2004. Modeling interactome: scale-free or geometric?Bioinform. 20, 18 (2004), 3508–3515.
    [36]
    Mahmudur Rahman, Mansurul Alam Bhuiyan, and Mohammad Al Hasan. 2014. Graft: An Efficient Graphlet Counting Method for Large Graph Analysis. IEEE Trans. Knowl. Data Eng. 26, 10 (2014), 2466–2478.
    [37]
    Ahmet Erdem Sariyüce, C. Seshadhri, Ali Pinar, and Ümit V. Çatalyürek. 2015. Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions. In WWW.
    [38]
    Comandur Seshadhri and Srikanta Tirthapura. 2019. Scalable Subgraph Counting: The Methods Behind The Madness. In WWW.
    [39]
    Bintao Sun, Maximilien Danisch, T.-H. Hubert Chan, and Mauro Sozio. 2020. KClist++: A Simple Algorithm for Finding k-Clique Densest Subgraphs in Large Graphs. Proc. VLDB Endow. 13, 10 (2020), 1628–1640.
    [40]
    Ancy Sarah Tom, Narayanan Sundaram, Nesreen K. Ahmed, Shaden Smith, Stijn Eyerman, Midhunchandra Kodiyath, Ibrahim Hur, Fabrizio Petrini, and George Karypis. 2017. Exploring optimizations on shared-memory platforms for parallel triangle counting algorithms. In HPEC.
    [41]
    Etsuji Tomita, Akira Tanaka, and Haruhisa Takahashi. 2006. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci. 363, 1 (2006), 28–42.
    [42]
    Charalampos E. Tsourakakis. 2015. The K-clique Densest Subgraph Problem. In WWW.
    [43]
    Charalampos E. Tsourakakis, U Kang, Gary L. Miller, and Christos Faloutsos. 2009. DOULION: counting triangles in massive graphs with a coin. In KDD.
    [44]
    Pinghui Wang, Junzhou Zhao, Xiangliang Zhang, Zhenguo Li, Jiefeng Cheng, John C. S. Lui, Don Towsley, Jing Tao, and Xiaohong Guan. 2018. MOSS-5: A Fast Method of Approximating Counts of 5-Node Graphlets in Large Graphs. IEEE Trans. Knowl. Data Eng. 30, 1 (2018), 73–86.
    [45]
    Hao Yin, Austin R. Benson, and Jure Leskovec. 2017. Higher-order clustering in networks. Physical Review E 97, 5 (2017), 052306.
    [46]
    Long Yuan, Lu Qin, Xuemin Lin, Lijun Chang, and Wenjie Zhang. 2017. Effective and Efficient Dynamic Graph Coloring. Proc. VLDB Endow. 11, 3 (2017), 338–351.

    Cited By

    View all
    • (2024)Cardinality Estimation of Subgraph Matching: A Filtering-Sampling ApproachProceedings of the VLDB Endowment10.14778/3654621.365463517:7(1697-1709)Online publication date: 30-May-2024
    • (2024)DeepDenseExpert Systems with Applications: An International Journal10.1016/j.eswa.2023.121816238:PBOnline publication date: 27-Feb-2024
    • (2023)Triangular Stability Maximization by Influence Spread over Social NetworksProceedings of the VLDB Endowment10.14778/3611479.361149016:11(2818-2831)Online publication date: 1-Jul-2023

    Index Terms

    1. Lightning Fast and Space Efficient k-clique Counting
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Information & Contributors

        Information

        Published In

        cover image ACM Conferences
        WWW '22: Proceedings of the ACM Web Conference 2022
        April 2022
        3764 pages
        ISBN:9781450390965
        DOI:10.1145/3485447
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Sponsors

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 25 April 2022

        Permissions

        Request permissions for this article.

        Check for updates

        Author Tags

        1. cohesive subgraphs
        2. graph sampling
        3. k-clique counting

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        WWW '22
        Sponsor:
        WWW '22: The ACM Web Conference 2022
        April 25 - 29, 2022
        Virtual Event, Lyon, France

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

        Contributors

        Other Metrics

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months)150
        • Downloads (Last 6 weeks)12
        Reflects downloads up to 27 Jul 2024

        Other Metrics

        Citations

        Cited By

        View all
        • (2024)Cardinality Estimation of Subgraph Matching: A Filtering-Sampling ApproachProceedings of the VLDB Endowment10.14778/3654621.365463517:7(1697-1709)Online publication date: 30-May-2024
        • (2024)DeepDenseExpert Systems with Applications: An International Journal10.1016/j.eswa.2023.121816238:PBOnline publication date: 27-Feb-2024
        • (2023)Triangular Stability Maximization by Influence Spread over Social NetworksProceedings of the VLDB Endowment10.14778/3611479.361149016:11(2818-2831)Online publication date: 1-Jul-2023

        View Options

        Get Access

        Login options

        View options

        PDF

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        HTML Format

        View this article in HTML Format.

        HTML Format

        Media

        Figures

        Other

        Tables

        Share

        Share

        Share this Publication link

        Share on social media