Counting triangles in real-world networks using projections

Published: 01 March 2011

Abstract

Triangle counting is an important problem in graph mining. Two frequently used metrics in complex network analysis that require the count of triangles are the clustering coefficients and the transitivity ratio of the graph. Triangles have been used successfully in several real-world applications, such as detection of spamming activity, uncovering the hidden thematic structure of the web, and link recommendation in online social networks. Furthermore, the count of triangles is a frequently used network statistic in exponential random graph models. However, counting the number of triangles in a graph is computationally expensive. In this paper, we propose the EigenTriangle and EigenTriangleLocal algorithms to estimate the number of triangles in a graph. The efficiency of our algorithms is based on the special spectral properties of real-world networks, which allow us to accurately approximate the number of triangles. We verify the efficacy of our method experimentally in almost 160 experiments using several web graphs and social, co-authorship, information, and Internet networks, where we obtain significant speedups with respect to a straightforward triangle counting algorithm. Furthermore, we propose an algorithm based on Fast SVD which allows us to apply the core idea of the EigenTriangle algorithm to graphs which do not fit in main memory. The main idea is a simple node-sampling process in which node $i$ is selected with probability $\frac{d_i}{2m}$, where $d_i$ is the degree of node $i$ and $m$ is the total number of edges in the graph. Our theoretical contributions also include a theorem that gives a closed formula for the number of triangles in Kronecker graphs, a network model that mimics several properties of real-world networks.
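The spectral idea in the abstract can be illustrated with a minimal sketch. It relies on the standard identity that a simple undirected graph with adjacency matrix $A$ has $\mathrm{trace}(A^3)/6$ triangles, which equals one sixth of the sum of the cubed eigenvalues of $A$. Keeping only the few eigenvalues of largest magnitude is assumed here to be the kind of approximation the abstract refers to; the function names are hypothetical, and the dense `numpy.linalg.eigvalsh` call is a stand-in for whatever iterative eigensolver the paper actually uses. The last function only computes the sampling probabilities $\frac{d_i}{2m}$ stated in the abstract; how the samples feed into the Fast SVD step is not described there and is not shown.

```python
import numpy as np


def triangles_exact(adj):
    """Exact triangle count of a simple undirected graph: trace(A^3) / 6."""
    a = np.asarray(adj, dtype=float)
    return int(round(np.trace(a @ a @ a) / 6.0))


def triangles_spectral(adj, k):
    """Estimate the triangle count from the k eigenvalues of largest magnitude.

    Assumption: a dense eigendecomposition is used for illustration only;
    a large sparse graph would need an iterative solver instead.
    """
    a = np.asarray(adj, dtype=float)
    eigenvalues = np.linalg.eigvalsh(a)  # real eigenvalues of a symmetric matrix
    order = np.argsort(-np.abs(eigenvalues))  # indices sorted by decreasing |lambda|
    top = eigenvalues[order[:k]]
    return float(np.sum(top ** 3) / 6.0)


def degree_sampling_probabilities(adj):
    """Probability d_i / (2m) for each node i, as stated in the abstract."""
    a = np.asarray(adj, dtype=float)
    degrees = a.sum(axis=1)
    return degrees / degrees.sum()  # the degrees sum to 2m


# Toy input (hypothetical): the complete graph on 4 nodes, which has 4 triangles.
k4 = np.ones((4, 4)) - np.eye(4)
exact = triangles_exact(k4)
estimate_top1 = triangles_spectral(k4, 1)
estimate_all = triangles_spectral(k4, 4)
probabilities = degree_sampling_probabilities(k4)
```

On this toy graph the eigenvalues are 3, -1, -1, -1, so the top eigenvalue alone gives 27/6 = 4.5 and all four eigenvalues recover the exact count of 4. Every node has degree 3 and there are 6 edges, so each sampling probability is 3/12 = 0.25.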


    Published In

    Knowledge and Information Systems  Volume 26, Issue 3
    March 2011
    169 pages

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Author Tags

    1. Algorithms
    2. Network analysis
    3. SVD
    4. Triangles

    Qualifiers

    • Article
