research-article

HotGraph: Efficient Asynchronous Processing for Real-World Graphs

Authors:

Bing Bing Zhou

IEEE Transactions on Computers, Volume 66, Issue 5

Pages 799 - 809

https://doi.org/10.1109/TC.2016.2624289

Published: 01 May 2017

Abstract

For large-scale graph analysis on a single PC, asynchronous processing methods are known to converge more quickly than synchronous ones because they propagate vertex state more efficiently. However, current asynchronous methods remain far from optimal at propagating state across graph partitions. This creates a bottleneck for cross-partition state updates and slows the convergence of the processing task. To tackle this problem, we propose a new method, named HotGraph, that speeds up graph processing by extracting a backbone structure, called the hot graph, that spans all the partitions of the original graph. With this approach, most of the cross-partition state propagation in traditional solutions now takes place within only a few hot graph partitions, thus removing the cross-partition bottleneck. We also develop a partition scheduling algorithm that maximizes the hot graph's effectiveness by keeping it in memory and giving it the highest processing priority whenever possible. A forward and backward sweeping execution strategy is then proposed to further accelerate convergence. Experimental results show that HotGraph can reduce the number of vertex state updates processed by 51.5 percent, compared with state-of-the-art schemes. Applying our optimizations further reduces this number by 72.6 percent and the execution time by 80.8 percent.
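The convergence argument in the abstract can be illustrated with a small, self-contained sketch. It is not HotGraph's implementation: it has no partitions, no hot graph extraction, and no scheduler. It only contrasts three update orders for single-source shortest paths on a toy in-memory graph: synchronous (each pass reads a frozen snapshot), asynchronous (each pass reads live state), and asynchronous with alternating forward and backward passes. The function name `sssp_sweeps`, the mode names, and the example graph are all assumptions made for illustration.

```python
import math

def sssp_sweeps(n, edges, source, mode):
    """Label-correcting shortest paths; returns (dist, passes until no change).

    mode is one of (hypothetical names, not from the paper):
      "sync"  - every pass reads a frozen snapshot of the previous pass
      "async" - every pass reads live state, vertices visited in id order
      "sweep" - like "async", but odd-numbered passes visit ids in reverse
    """
    incoming = [[] for _ in range(n)]
    for u, v, w in edges:
        incoming[v].append((u, w))

    dist = [math.inf] * n
    dist[source] = 0.0
    passes = 0
    while True:
        # Synchronous mode reads a copy; asynchronous modes read dist itself,
        # so an update made earlier in the pass is visible later in the pass.
        read = list(dist) if mode == "sync" else dist
        order = range(n)
        if mode == "sweep" and passes % 2 == 1:
            order = reversed(range(n))  # backward pass
        changed = False
        for v in order:
            best = min((read[u] + w for u, w in incoming[v]), default=math.inf)
            if best < dist[v]:
                dist[v] = best
                changed = True
        passes += 1
        if not changed:
            return dist, passes

# Toy graph: a chain that first runs with increasing ids (0 -> 1 -> 2),
# then jumps to 5 and runs with decreasing ids (5 -> 4 -> 3).
EDGES = [(0, 1, 1), (1, 2, 1), (2, 5, 1), (5, 4, 1), (4, 3, 1)]

if __name__ == "__main__":
    for mode in ("sync", "async", "sweep"):
        dist, passes = sssp_sweeps(6, EDGES, 0, mode)
        print(mode, passes, dist)
```

On this toy graph all three modes reach the same distances, but the synchronous order needs the most passes and the alternating order the fewest, because state can travel along a whole run of increasing (or decreasing) vertex ids within a single pass. The counts depend entirely on the graph and vertex numbering chosen here and say nothing about the paper's measured results.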

Published In

IEEE Transactions on Computers Volume 66, Issue 5

May 2017

182 pages

ISSN: 0018-9340

Copyright © 2017.

Publisher

IEEE Computer Society

United States

Cited By

Liao X, Zhao W, Jin H, Yao P, Huang Y, Wang Q, Zhao J, Zheng L, Zhang Y, Shao Z (2024). Towards High-Performance Graph Processing: From a Hardware/Software Co-Design Perspective. Journal of Computer Science and Technology 39:2, pp. 245-266. Online publication date: 1-Mar-2024.
https://dl.acm.org/doi/10.1007/s11390-024-4150-0

Yao F, Tao Q, Yu W, Zhang Y, Gong S, Wang Q, Yu G, Zhou J (2023). RAGraph: A Region-Aware Framework for Geo-Distributed Graph Processing. Proceedings of the VLDB Endowment 17:3, pp. 264-277. Online publication date: 1-Nov-2023.
https://dl.acm.org/doi/10.14778/3632093.3632094

Zhao J, Zhang Y, He L, Li Q, Zhang X, Jiang X, Yu H, Liao X, Jin H, Gu L, Liu H, He B, Zhang J, Song X, Wang L, Zhou J (2023). GraphTune: An Efficient Dependency-Aware Substrate to Alleviate Irregularity in Concurrent Graph Processing. ACM Transactions on Architecture and Code Optimization 20:3, pp. 1-24. Online publication date: 19-Jul-2023.
https://dl.acm.org/doi/10.1145/3600091

Lei G, Guo W, Zheng Z, Wang J (2022). A Preliminary Performance Evaluation of Breadth-first Search on a Configurable Processor. Proceedings of the 4th International Conference on Big Data Engineering, pp. 113-121. Online publication date: 26-May-2022.
https://dl.acm.org/doi/10.1145/3538950.3538965

Zhang Y, Peng D, Liao X, Jin H, Liu H, Gu L, He B (2021). LargeGraph. ACM Transactions on Architecture and Code Optimization 18:4, pp. 1-24. Online publication date: 29-Sep-2021.
https://dl.acm.org/doi/10.1145/3477603

Zhao J, Zhang Y, Liao X, He L, He B, Jin H, Liu H, de Supinski B, Hall M, Gamblin T (2021). LCCG. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-14. Online publication date: 14-Nov-2021.
https://dl.acm.org/doi/10.1145/3458817.3480854

Zhang Y, Liao X, Gu L, Jin H, Hu K, Liu H, He B (2020). AsynGraph. ACM Transactions on Architecture and Code Optimization 17:4, pp. 1-21. Online publication date: 30-Sep-2020.
https://dl.acm.org/doi/10.1145/3416495

Gong S, Zhang Y, Yu G (2020). Accelerating Large-Scale Prioritized Graph Computations by Hotness Balanced Partition. IEEE Transactions on Parallel and Distributed Systems 32:4, pp. 746-759. Online publication date: 12-Nov-2020.
https://dl.acm.org/doi/10.1109/TPDS.2020.3032709

Chen R, Shi J, Chen Y, Zang B, Guan H, Chen H (2019). PowerLyra. ACM Transactions on Parallel Computing 5:3, pp. 1-39. Online publication date: 22-Jan-2019.
https://dl.acm.org/doi/10.1145/3298989

Zhang Y, Liao X, Jin H, He B, Liu H, Gu L, Bahar I, Herlihy M, Witchel E, Lebeck A (2019). DiGraph. Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 601-614. Online publication date: 4-Apr-2019.
https://dl.acm.org/doi/10.1145/3297858.3304029
