research-article

Adaptive Asynchronous Parallelization of Graph Algorithms

Authors:

Ruiqi Xu

SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data

Pages 1141 - 1156

https://doi.org/10.1145/3183713.3196918

Published: 27 May 2018

Abstract

This paper proposes an Adaptive Asynchronous Parallel (AAP) model for graph computations. As opposed to the Bulk Synchronous Parallel (BSP) and Asynchronous Parallel (AP) models, AAP reduces both stragglers and stale computations by dynamically adjusting the relative progress of workers. We show that BSP, AP and the Stale Synchronous Parallel (SSP) model are special cases of AAP. Better yet, AAP optimizes parallel processing by adaptively switching among these models at different stages of a single execution. Moreover, employing the programming model of GRAPE, AAP aims to parallelize existing sequential algorithms based on fixpoint computation with partial and incremental evaluation. Under a monotone condition, AAP is guaranteed to converge to correct answers if the sequential algorithms are correct. Furthermore, we show that AAP can optimally simulate MapReduce, PRAM, BSP, AP and SSP. Using real-life and synthetic graphs, we experimentally verify that AAP outperforms BSP, AP and SSP for a variety of graph computations.
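The statement that BSP, AP and SSP are special cases of a single model can be illustrated with a small sketch. The abstract does not describe the mechanism, so what follows is an assumption made for illustration only, not the paper's definition: each worker compares how many rounds it is ahead of the slowest worker against a bound, and the three classic models differ only in that bound. The function name `delay`, its parameters, and the policy strings are all hypothetical.

```python
import math


def delay(policy, my_round, min_round, staleness=0):
    """Decide whether a worker may start its next round (illustrative only).

    Returns 0 for "start now" and math.inf for "wait until slower
    workers catch up". All names are hypothetical, not from the paper.

    my_round  -- round the calling worker has reached
    min_round -- round reached by the slowest worker
    staleness -- allowed lead in rounds, used only by the SSP policy
    """
    if policy == "AP":
        bound = math.inf   # asynchronous: a worker never waits
    elif policy == "BSP":
        bound = 0          # bulk synchronous: no worker may run ahead
    elif policy == "SSP":
        bound = staleness  # stale synchronous: bounded lead
    else:
        raise ValueError(f"unknown policy: {policy}")

    lead = my_round - min_round
    return math.inf if lead > bound else 0


if __name__ == "__main__":
    for policy in ("AP", "BSP", "SSP"):
        print(policy, delay(policy, my_round=3, min_round=1, staleness=2))
```

Under this reading, an adaptive scheme would vary the bound per worker at run time instead of fixing it in advance. How AAP actually chooses and adjusts it is specified in the paper, not in this sketch.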


Index Terms

Adaptive Asynchronous Parallelization of Graph Algorithms
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Parallel and distributed DBMSs

Published In

SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data

May 2018

1874 pages

ISBN: 9781450347037

DOI: 10.1145/3183713

General Chairs:
Gautam Das (University of Texas at Arlington, USA),
Christopher Jermaine (Rice University, USA),
Philip Bernstein (Microsoft Research, USA)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

973 Program
European Research Council
National Natural Science Foundation of China
Engineering and Physical Sciences Research Council
Beijing Advanced Innovation Center for Big Data and Brain Computing

Conference

SIGMOD/PODS '18

Sponsor:

SIGMOD

SIGMOD/PODS '18: International Conference on Management of Data

June 10 - 15, 2018

Houston, TX, USA

Acceptance Rates

SIGMOD '18 paper acceptance rate: 90 of 461 submissions (20%)

Overall acceptance rate: 785 of 4,003 submissions (20%)

Article Metrics

Total Citations: 18
Total Downloads: 762
Downloads (last 12 months): 13
Downloads (last 6 weeks): 2

Reflects downloads up to 03 Oct 2024

Cited By

Zhou Y, Gong S, Yao F, Chen H, Yu S, Liu P, Zhang Y, Yu G, Yu J (2024). Fast Iterative Graph Computing with Updated Neighbor States. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 2449-2462. Online publication date: 13-May-2024.
https://doi.org/10.1109/ICDE60146.2024.00193
Xu R, Wang Y, Xiao X (2024). Graph Computation with Adaptive Granularity. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 2123-2136. Online publication date: 13-May-2024.
https://doi.org/10.1109/ICDE60146.2024.00169
Wang Q, Ai X, Zhang Y, Chen J, Yu G (2023). HyTGraph: GPU-Accelerated Graph Processing with Hybrid Transfer Management. 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 558-571. Online publication date: Apr-2023.
https://doi.org/10.1109/ICDE55515.2023.00049
Meng K, Geng L, Li X, Tao Q, Yu W, Zhou J (2023). Efficient Multi-GPU Graph Processing with Remote Work Stealing. 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 191-204. Online publication date: Apr-2023.
https://doi.org/10.1109/ICDE55515.2023.00022
Fan W (2022). Big graphs. Proceedings of the VLDB Endowment 15:12, pages 3782-3797. Online publication date: Aug-2022.
https://doi.org/10.14778/3554821.3554899
Wu J, Wang J, Zaniolo C, Ives Z, Bonifati A, El Abbadi A (2022). Optimizing Parallel Recursive Datalog Evaluation on Multicore Machines. Proceedings of the 2022 International Conference on Management of Data, pages 1433-1446. Online publication date: 11-Jun-2022.
https://doi.org/10.1145/3514221.3517853
Tang Z, He M, Fu Z, Yang L (2021). IncGraph: An Improved Distributed Incremental Graph Computing Model and Framework based on Spark GraphX. IEEE Transactions on Knowledge and Data Engineering, pages 1-1. Online publication date: 2021.
https://doi.org/10.1109/TKDE.2020.3014150
Song Z, Gu Y, Wang Z, Yu G (2021). DRPS: efficient disk-resident parameter servers for distributed machine learning. Frontiers of Computer Science 16:4. Online publication date: 3-Dec-2021.
https://doi.org/10.1007/s11704-021-0445-2
Fan W, Jin R, Liu M, Lu P, Luo X, Xu R, Yin Q, Yu W, Zhou J, Maier D, Pottinger R, Doan A, Tan W, Alawini A, Ngo H (2020). Application Driven Graph Partitioning. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pages 1765-1779. Online publication date: 11-Jun-2020.
https://dl.acm.org/doi/10.1145/3318464.3389745
Wang Q, Zhang Y, Wang H, Geng L, Lee R, Zhang X, Yu G, Maier D, Pottinger R, Doan A, Tan W, Alawini A, Ngo H (2020). Automating Incremental and Asynchronous Evaluation for Recursive Aggregate Data Processing. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pages 2439-2454. Online publication date: 11-Jun-2020.
https://dl.acm.org/doi/10.1145/3318464.3389712
