DOI: 10.1145/3183713.3196918
research-article

Adaptive Asynchronous Parallelization of Graph Algorithms

Published: 27 May 2018

Abstract

This paper proposes an Adaptive Asynchronous Parallel (AAP) model for graph computations. In contrast to the Bulk Synchronous Parallel (BSP) and Asynchronous Parallel (AP) models, AAP reduces both stragglers and stale computations by dynamically adjusting the relative progress of workers. We show that BSP, AP, and the Stale Synchronous Parallel (SSP) model are special cases of AAP. Better yet, AAP optimizes parallel processing by adaptively switching among these models at different stages of a single execution. Moreover, employing the programming model of GRAPE, AAP aims to parallelize existing sequential algorithms based on fixpoint computation with partial and incremental evaluation. Under a monotone condition, AAP is guaranteed to converge to correct answers if the sequential algorithms are correct. Furthermore, we show that AAP can optimally simulate MapReduce, PRAM, BSP, AP, and SSP. Using real-life and synthetic graphs, we experimentally verify that AAP outperforms BSP, AP, and SSP for a variety of graph computations.
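
To make the convergence claim concrete, here is a minimal toy sketch in Python. It is not the AAP implementation or the GRAPE API: the function names, the shared `labels` dictionary standing in for message passing, and the random scheduler are all assumptions made for illustration. It runs connected components by minimum-label propagation, a monotone fixpoint computation, over fragments of a graph, and processes the fragments in an arbitrary order with no global barrier.

```python
import random

# Hypothetical toy data: a path 0-1-2-3, an edge 4-5, and an isolated vertex 6,
# with the edges split across three fragments (one per simulated worker).
VERTICES = range(7)
FRAGMENTS = [[(0, 1), (4, 5)], [(1, 2)], [(2, 3)]]


def local_fixpoint(fragment_edges, labels):
    """Evaluate one fragment to a local fixpoint.

    Labels only ever decrease (the monotone condition in this toy).
    Returns the set of vertices whose label changed.
    """
    changed = set()
    updated = True
    while updated:
        updated = False
        for u, v in fragment_edges:
            low = min(labels[u], labels[v])
            for w in (u, v):
                if labels[w] > low:
                    labels[w] = low
                    changed.add(w)
                    updated = True
    return changed


def run_async(fragments, vertices, seed):
    """Process fragments in a random order until no fragment has pending work."""
    rng = random.Random(seed)
    labels = {v: v for v in vertices}      # each vertex starts as its own component
    pending = set(range(len(fragments)))   # fragments that still need evaluation
    while pending:
        i = rng.choice(sorted(pending))    # arbitrary schedule, no barrier
        pending.discard(i)
        changed = local_fixpoint(fragments[i], labels)
        # Re-activate every other fragment that shares a changed vertex;
        # this stands in for sending update messages between workers.
        for j, edges in enumerate(fragments):
            if j != i and any(u in changed or v in changed for u, v in edges):
                pending.add(j)
    return labels
```

Calling `run_async(FRAGMENTS, VERTICES, seed)` with different seeds changes the order in which fragments run, but every vertex ends up labeled with the smallest vertex id in its component. Because labels only decrease and the label set is finite, the loop terminates whatever the schedule.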


    Published In

    SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data
    May 2018
    1874 pages
    ISBN: 9781450347037
    DOI: 10.1145/3183713
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Church-Rosser
    2. graph computations
    3. parallel model
    4. parallelization

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '18

    Acceptance Rates

    SIGMOD '18 Paper Acceptance Rate 90 of 461 submissions, 20%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%


    Cited By

    • (2024) Fast Iterative Graph Computing with Updated Neighbor States. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 2449-2462. DOI: 10.1109/ICDE60146.2024.00193. Online publication date: 13-May-2024.
    • (2024) Graph Computation with Adaptive Granularity. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 2123-2136. DOI: 10.1109/ICDE60146.2024.00169. Online publication date: 13-May-2024.
    • (2023) HyTGraph: GPU-Accelerated Graph Processing with Hybrid Transfer Management. 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 558-571. DOI: 10.1109/ICDE55515.2023.00049. Online publication date: Apr-2023.
    • (2023) Efficient Multi-GPU Graph Processing with Remote Work Stealing. 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 191-204. DOI: 10.1109/ICDE55515.2023.00022. Online publication date: Apr-2023.
    • (2022) Big graphs. Proceedings of the VLDB Endowment, 15:12, pages 3782-3797. DOI: 10.14778/3554821.3554899. Online publication date: Aug-2022.
    • (2022) Optimizing Parallel Recursive Datalog Evaluation on Multicore Machines. Proceedings of the 2022 International Conference on Management of Data, pages 1433-1446. DOI: 10.1145/3514221.3517853. Online publication date: 11-Jun-2022.
    • (2021) IncGraph: An Improved Distributed Incremental Graph Computing Model and Framework based on Spark GraphX. IEEE Transactions on Knowledge and Data Engineering, pages 1-1. DOI: 10.1109/TKDE.2020.3014150. Online publication date: 2021.
    • (2021) DRPS: efficient disk-resident parameter servers for distributed machine learning. Frontiers of Computer Science, 16:4. DOI: 10.1007/s11704-021-0445-2. Online publication date: 3-Dec-2021.
    • (2020) Application Driven Graph Partitioning. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pages 1765-1779. DOI: 10.1145/3318464.3389745. Online publication date: 11-Jun-2020.
    • (2020) Automating Incremental and Asynchronous Evaluation for Recursive Aggregate Data Processing. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pages 2439-2454. DOI: 10.1145/3318464.3389712. Online publication date: 11-Jun-2020.
