research-article

Sublinear Time Estimation of Degree Distribution Moments: The Arboricity Connection

Published: 01 January 2019

Abstract

We revisit the classic problem of estimating the moments of the degree distribution of an undirected simple graph. Consider an undirected simple graph $G=(V,E)$ with $n$ (nonisolated) vertices, and define (for $s > 0$) $M_s= \sum_{v \in V} d^s_v$. Our aim is to estimate $M_s$ within a multiplicative error of $(1+\varepsilon)$ (for a given approximation parameter $\varepsilon>0$) in sublinear time. We consider the sparse-graph model that allows access to uniform random vertices, queries for the degree of any vertex, and queries for a neighbor of any vertex. For the case of $s=1$ (the average degree), $O^*(\sqrt{n})$ queries suffice for any constant $\varepsilon$ [U. Feige, SIAM J. Comput., 35 (2006), pp. 964--984], [O. Goldreich and D. Ron, Random Structures Algorithms, 32 (2008), pp. 473--493]. (We use the $O^*$ notation to suppress dependencies in $\log n$ and $1/\varepsilon$.) Gonen, Ron, and Shavitt [SIAM J. Discrete Math., 25 (2011), pp. 1365--1411] extended this result to all integral $s > 0$ by designing an algorithm that performs $O^*(n^{1-1/(s+1)})$ queries. (Strictly speaking, their algorithm approximates the number of star-subgraphs of a given size, but a slight modification gives an algorithm for moments.) We design a new, significantly simpler algorithm for this problem. In the worst case, it exactly matches the bounds of Gonen, Ron, and Shavitt and has a much simpler proof. More importantly, the running time of this algorithm is connected to the arboricity of $G$. This is (essentially) the maximum density of an induced subgraph. For the family of graphs with arboricity at most $\alpha$, it has a query complexity of $O^*\big(\frac{n \cdot \alpha^{1/s}}{M_s^{1/s}} + \min\big\{\frac{m}{M_s^{1/s}},\frac{m \cdot n^{s-1}}{M_s}\big\}\big)$ which is always upper bounded by $O^*\big(\frac{n\alpha}{M_s^{1/s}}\big)$. 
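To see what the upper bound gives for small $s$, a short worked step helps. It uses only the abstract's assumption that every vertex is nonisolated, so $d_v \ge 1$ for all $v$; the chain of inequalities is a reader's sketch rather than a statement taken from the paper.

$$
M_s = \sum_{v \in V} d_v^s \;\ge\; n
\quad\Longrightarrow\quad
\frac{n\alpha}{M_s^{1/s}} \;\le\; \frac{n\alpha}{n^{1/s}} \;=\; \alpha \cdot n^{1-1/s}.
$$

For $s=1$ the right-hand side is $\alpha$, and for $s=2$ it is $\alpha\sqrt{n}$, compared with the worst-case $O^*(n^{1-1/(s+1)})$ bounds of $O^*(\sqrt{n})$ and $O^*(n^{2/3})$, respectively. The variance of the degree distribution is expressed through the first two moments as $\frac{1}{n}M_2 - \big(\frac{1}{n}M_1\big)^2$, which is presumably why the $s=2$ bound governs its cost.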
Thus, for the class of constant-arboricity graphs (which includes, among others, all minor-closed families and preferential attachment graphs), we can estimate the average degree in $O^*(1)$ queries, and we can estimate the variance of the degree distribution in $O^*(\sqrt{n})$ queries. This is a major improvement over the previous worst-case bounds.
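The query model and the quantity $M_s$ can be made concrete with a small Python sketch. The abstract does not spell out the algorithm, so this is not the authors' procedure: it is a plain unbiased estimator built on the identity $M_s = \sum_{\{u,v\} \in E} (d_u^{s-1} + d_v^{s-1})$, charging each edge to its lower-degree endpoint (ties broken by vertex id). The names `GraphOracle`, `exact_moment`, and `estimate_moment` are hypothetical, and the number of samples is left to the caller rather than derived from the stated bounds.

```python
import random
from typing import Dict, List


class GraphOracle:
    """Hypothetical query access in the sparse-graph model:
    uniform random vertex, degree query, and i-th neighbor query."""

    def __init__(self, adj: Dict[int, List[int]]):
        self._adj = adj
        self._vertices = list(adj)
        self.n = len(self._vertices)
        self.queries = 0  # counts every query issued

    def random_vertex(self, rng: random.Random) -> int:
        self.queries += 1
        return rng.choice(self._vertices)

    def degree(self, v: int) -> int:
        self.queries += 1
        return len(self._adj[v])

    def neighbor(self, v: int, i: int) -> int:
        self.queries += 1
        return self._adj[v][i]


def exact_moment(adj: Dict[int, List[int]], s: float) -> float:
    """M_s = sum over vertices of d_v^s, computed by reading the whole graph."""
    return sum(len(nbrs) ** s for nbrs in adj.values())


def estimate_moment(oracle: GraphOracle, s: float, samples: int,
                    rng: random.Random) -> float:
    """Unbiased estimate of M_s from `samples` (vertex, random neighbor) pairs.

    Assumes every vertex has degree at least 1, as in the abstract.
    """
    total = 0.0
    for _ in range(samples):
        v = oracle.random_vertex(rng)
        dv = oracle.degree(v)
        u = oracle.neighbor(v, rng.randrange(dv))
        du = oracle.degree(u)
        # Count the edge only from its lower endpoint in (degree, id) order.
        if (dv, v) < (du, u):
            total += dv * (dv ** (s - 1) + du ** (s - 1))
    return oracle.n * total / samples
```

On a star with 49 leaves, for example, `exact_moment(adj, 2)` is $49^2 + 49 = 2450$, and the estimator concentrates quickly because only leaf samples contribute and each contributes the same value. Each sample costs four queries in this sketch.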

References

[1]
M. Aliakbarpour, A. S. Biswas, T. Gouleakis, J. Peebles, R. Rubinfeld, and A. Yodpinyanee, Sublinear-time algorithms for counting star subgraphs via edge sampling, Algorithmica, 80 (2018), pp. 668--697.
[2]
N. Alon and S. Gutner, Linear time algorithms for finding a dominating set of fixed size in degenerated graphs, in Proceedings of the Annual International Conference Computing and Combinatorics (COCOON), 2008, pp. 394--405.
[3]
N. Alon, Y. Matias, and M. Szegedy, The space complexity of approximating the frequency moments, J. Comput. System Sci., 58 (1999), pp. 137--147.
[4]
A.-L. Barabási and R. Albert, Emergence of scaling in random networks, Science, 286 (1999), pp. 509--512.
[5]
J. W. Berry, L. A. Fostvedt, D. J. Nordman, C. A. Phillips, C. Seshadhri, and A. G. Wilson, Why do simple algorithms for triangle enumeration work in the real world?, Internet Math., 11 (2015), pp. 555--571.
[6]
Z. Bi, C. Faloutsos, and F. Korn, The “DGX” distribution for mining massive, skewed data, in Proceedings of the International Conference on Knowledge Discovery and Data Mining (SIGKDD), ACM, 2001, pp. 17--26.
[7]
P. Bickel, A. Chen, and E. Levina, The method of moments and degree distributions for network models, Ann. Statist., 39 (2011), pp. 2280--2301.
[8]
P. Brach, M. Cygan, J. Łącki, and P. Sankowski, Algorithmic complexity of power law networks, in Proceedings of the 27th Annual Symposium on Discrete Algorithms (SODA), SIAM, 2016, pp. 1306--1325, https://doi.org/10.1137/1.9781611974331.ch91.
[9]
A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener, Graph structure in the Web, Computer Networks, 33 (2000), pp. 309--320.
[10]
B. Chazelle, R. Rubinfeld, and L. Trevisan, Approximating the minimum spanning tree weight in sublinear time, SIAM J. Comput., 34 (2005), pp. 1370--1379, https://doi.org/10.1137/S0097539702403244.
[11]
N. Chiba and T. Nishizeki, Arboricity and subgraph listing algorithms, SIAM J. Comput., 14 (1985), pp. 210--223, https://doi.org/10.1137/0214017.
[12]
F. Chierichetti, A. Dasgupta, R. Kumar, S. Lattanzi, and T. Sarlós, On sampling nodes in a network, in Proceedings of the 25th International Conference on World Wide Web (WWW), 2016, pp. 471--481.
[13]
A. Clauset, C. R. Shalizi, and M. E. J. Newman, Power-law distributions in empirical data, SIAM Rev., 51 (2009), pp. 661--703, https://doi.org/10.1137/070710111.
[14]
A. Czumaj, F. Ergün, L. Fortnow, A. Magen, I. Newman, R. Rubinfeld, and C. Sohler, Approximating the weight of the Euclidean minimum spanning tree in sublinear time, SIAM J. Comput., 35 (2005), pp. 91--109, https://doi.org/10.1137/S0097539703435297.
[15]
A. Czumaj and C. Sohler, Estimating the weight of metric minimum spanning trees in sublinear time, SIAM J. Comput., 39 (2009), pp. 904--922, https://doi.org/10.1137/060672121.
[16]
A. Dasgupta, R. Kumar, and T. Sarlós, On estimating the average degree, in Proceedings of the 23rd International Conference on World Wide Web (WWW), 2014, pp. 795--806.
[17]
R. Diestel, Graph Theory, 4th ed., Springer, 2010.
[18]
T. Eden, A. Levi, D. Ron, and C. Seshadhri, Approximately counting triangles in sublinear time, in Proceedings of the 56th Annual Symposium on Foundations of Computer Science (FOCS), IEEE, 2015, pp. 614--633.
[19]
T. Eden, D. Ron, and C. Seshadhri, Sublinear Time Estimation of Degree Distribution Moments: The Arboricity Connection, preprint, https://arxiv.org/abs/1604.03661v1, 2016.
[20]
T. Eden and W. Rosenbaum, Lower Bounds for Approximating Graph Parameters via Communication Complexity, preprint, https://arxiv.org/abs/1709.04262, 2017.
[21]
D. Eppstein, M. Löffler, and D. Strash, Listing all maximal cliques in sparse graphs in near-optimal time, in 21st International Symposium on Algorithms and Computation (ISAAC), Springer, 2010, pp. 403--413.
[22]
M. Faloutsos, P. Faloutsos, and C. Faloutsos, On power-law relationships of the Internet topology, in Proceedings of Computer Communication Review (SIGCOMM), ACM, 1999, pp. 251--262.
[23]
U. Feige, On sums of independent random variables with unbounded variance and estimating the average degree in a graph, SIAM J. Comput., 35 (2006), pp. 964--984, https://doi.org/10.1137/S0097539704447304.
[24]
O. Goldreich and D. Ron, Approximating average parameters of graphs, Random Structures Algorithms, 32 (2008), pp. 473--493.
[25]
M. Gonen, D. Ron, and Y. Shavitt, Counting stars and other small subgraphs in sublinear-time, SIAM J. Discrete Math., 25 (2011), pp. 1365--1411, https://doi.org/10.1137/100783066.
[26]
A. Hassidim, J. A. Kelner, H. N. Nguyen, and K. Onak, Local graph partitions for approximation and testing, in Proceedings of the 50th Annual Symposium on Foundations of Computer Science (FOCS), IEEE, 2009, pp. 22--31.
[27]
S. Marko and D. Ron, Approximating the distance to properties in bounded-degree and general sparse graphs, ACM Trans. Algorithms, 5 (2009), 22.
[28]
D. Matula and L. Beck, Smallest-last ordering and clustering and graph coloring algorithms, J. ACM, 30 (1983), pp. 417--427.
[29]
C. St. J. Nash-Williams, Edge-disjoint spanning trees of finite graphs, J. London Math. Soc., 36 (1961), pp. 445--450.
[30]
C. St. J. Nash-Williams, Decomposition of finite graphs into forests, J. London Math. Soc., 39 (1964), p. 12.
[31]
J. Nešetřil and P. Ossona de Mendez, Sparsity, Springer, 2010.
[32]
H. N. Nguyen and K. Onak, Constant-time approximation algorithms via local improvements, in Proceedings of the 49th Annual Symposium on Foundations of Computer Science (FOCS), IEEE, 2008, pp. 327--336.
[33]
K. Onak, D. Ron, M. Rosen, and R. Rubinfeld, A near-optimal sublinear-time algorithm for approximating the minimum vertex cover size, in Proceedings of the 23rd Annual Symposium on Discrete Algorithms (SODA), SIAM, 2012, pp. 1123--1131, https://doi.org/10.1137/1.9781611973099.88.
[34]
M. Parnas and D. Ron, Approximating the minimum vertex cover in sublinear time and a connection to distributed algorithms, Theoret. Comput. Sci., 381 (2007), pp. 183--196.
[35]
D. Pennock, G. Flake, S. Lawrence, E. Glover, and C. L. Giles, Winners don't take all: Characterizing the competition for links on the web, Proc. Natl. Acad. Sci. USA, 99 (2002), pp. 5207--5211.
[36]
A. Sala, L. Cao, C. Wilson, R. Zablit, H. Zheng, and B. Y. Zhao, Measurement-calibrated graph models for social network experiments, in Proceedings of the 19th International Conference on World Wide Web (WWW), ACM, 2010, pp. 861--870.
[37]
O. Simpson, C. Seshadhri, and A. McGregor, Catching the head, tail, and everything in between: A streaming algorithm for the degree distribution, in Proceedings of the 2015 IEEE International Conference on Data Mining (ICDM), IEEE, 2015, pp. 979--984.
[38]
Wikipedia contributors, Arboricity, Wikipedia, https://en.wikipedia.org/w/index.php?title=Arboricity&oldid=802150732, 2017 [accessed 26 Nov. 2018].
[39]
Wikipedia contributors, Degeneracy (graph theory), Wikipedia, https://en.wikipedia.org/w/index.php?title=Degeneracy_(graph_theory)&oldid=860270084, 2018 [accessed 26 Nov. 2018].
[40]
Y. Yoshida, M. Yamamoto, and H. Ito, An improved constant-time approximation algorithm for maximum-matchings, in Proceedings of the 41st Annual Symposium on the Theory of Computing (STOC), ACM, 2009, pp. 225--234.

Published In

SIAM Journal on Discrete Mathematics, Volume 33, Issue 4
2019
710 pages
ISSN: 0895-4801
DOI: 10.1137/sjdmec.33.4

Publisher

Society for Industrial and Applied Mathematics, United States

Author Tags

1. sublinear time algorithms
2. degeneracy
3. degree distribution
4. approximation algorithms
5. arboricity
6. degree moments

MSC Codes

1. 68Q25
2. 68W20
3. 68W25
