Abstract
Graphs are an effective way to represent and organize data, and graph analysis is a promising killer application for AI systems. However, recently emerging extremely large graphs (with trillions of vertices and edges) exceed the capacity of small- and medium-scale clusters and thus require supercomputers for efficient graph processing. Graph500 is the de facto standard for benchmarking supercomputers’ graph processing performance, and connected component (CC) computation is an important basic algorithm for Graph500’s BFS and SSSP tests. However, current CC algorithms are inefficient on supercomputers, and computing CC quickly is expensive and challenging. In this paper, we propose VPC, an efficient method that prunes connected components using vector-based path compression. It includes the following innovations: (i) the data structure of the traversal algorithm is customized with a two-dimensional adjacency vector; (ii) vector-based path compression is proposed for the union-find algorithm; (iii) a parallel VPC is proposed, customized for Tianhe. Experimental results show that the two-dimensional adjacency vector performs better than other data structures, and that vector-based path compression is used to implement the union-find algorithm. When the scale is 26, the performance of our algorithm is 1.38\(\times\), 1.69\(\times\) and 2.57\(\times\) that of other algorithms. When the union-find algorithm is used to compute connected components, its performance is 5.14\(\times\) and 5.01\(\times\) that of BFS and DFS, respectively.
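For readers unfamiliar with the underlying technique, the following is a minimal sketch of textbook union-find with path compression, storing parent pointers in a flat array and using them to label connected components. It is not the authors' VPC implementation: the function names `find` and `connected_components`, the edge-list input, and the rule that the smaller root index wins a merge are all assumptions made for illustration.

```python
from typing import List, Tuple


def find(parent: List[int], x: int) -> int:
    """Return the root of x and compress the path that was walked."""
    # First pass: follow parent pointers up to the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every vertex on the path directly at the root.
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def connected_components(num_vertices: int,
                         edges: List[Tuple[int, int]]) -> List[int]:
    """Label each vertex with the root of its component (hypothetical helper)."""
    # Every vertex starts as its own single-vertex component.
    parent = list(range(num_vertices))
    for u, v in edges:
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u == root_v:
            continue  # u and v are already in the same component
        # Illustrative merge rule: the smaller root index becomes the new root.
        if root_u < root_v:
            parent[root_v] = root_u
        else:
            parent[root_u] = root_v
    # A final find per vertex flattens any remaining chains.
    return [find(parent, v) for v in range(num_vertices)]


labels = connected_components(6, [(0, 1), (1, 2), (3, 4)])
print(labels)
```

With this merge rule, each label is the smallest vertex index in its component, so vertices 0, 1 and 2 share one label, 3 and 4 share another, and the isolated vertex 5 keeps its own.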
Change history
15 November 2021
A Correction to this paper has been published: https://doi.org/10.1007/s42514-021-00085-6
Acknowledgements
This work was supported by the National Numerical Wind Tunnel Project (NNW2019ZT6-B21), the National Key Research and Development Program of China (2018YFB0204301), the Hunan Natural Science Foundation of China (2020JJ4669), and the Foundation of Parallel and Distributed Processing Laboratory (6142110190206).
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Bai, H., Gan, X., Xu, T. et al. VPC: Pruning connected components using vector-based path compression for Graph500. CCF Trans. HPC 3, 271–285 (2021). https://doi.org/10.1007/s42514-021-00070-z
DOI: https://doi.org/10.1007/s42514-021-00070-z