REPT: A streaming algorithm of approximating global and local triangle counts in parallel

P Wang, P Jia, Y Qi, Y Sun, J Tao… - 2019 IEEE 35th …, 2019 - ieeexplore.ieee.org
P Wang, P Jia, Y Qi, Y Sun, J Tao, X Guan
2019 IEEE 35th International Conference on Data Engineering (ICDE), 2019ieeexplore.ieee.org
Recently, considerable efforts have been devoted to approximately computing the global
and local (ie, incident to each node) triangle counts of a large graph stream represented as
a sequence of edges. Existing approximate triangle counting algorithms rely on sampling
techniques to reduce the computational cost. However, their estimation errors are
significantly determined by the covariance between sampled triangles. Moreover, little
attention has been paid to developing parallel one-pass streaming algorithms that can be …
Recently, considerable efforts have been devoted to approximately computing the global and local (i.e., incident to each node) triangle counts of a large graph stream represented as a sequence of edges. Existing approximate triangle counting algorithms rely on sampling techniques to reduce the computational cost. However, their estimation errors are significantly determined by the covariance between sampled triangles. Moreover, little attention has been paid to developing parallel one-pass streaming algorithms that can be used to fast and approximately count triangles on a multi-core machine or a cluster of machines. To solve these problems, we develop a novel parallel method REPT to significantly reduce the covariance (even completely eliminate the covariance for some cases) between sampled triangles. We theoretically prove that REPT is more accurate than parallelizing existing triangle count estimation algorithms in a direct manner. In addition, we also conduct extensive experiments on a variety of real-world graphs, and the results demonstrate that our method REPT is several times more accurate than state-of-the-art methods.
ieeexplore.ieee.org