DOI: 10.1145/2621934.2621940

Asymmetry in Large-Scale Graph Analysis, Explained

Published: 22 June 2014


Iterative computations are at the core of large-scale graph processing. In these applications, a set of parameters is continuously refined until a fixed point is reached. Such fixed-point iterations often exhibit non-uniform computational behavior: changes propagate at different speeds throughout the parameter set, so that individual parameters become active or inactive during the iterations. If not exploited, this asymmetrical behavior can lead to many redundant computations. Many specialized graph processing systems and APIs run iterative algorithms efficiently by exploiting this asymmetry. However, their functionality is sometimes vaguely defined, and because they differ in programming model and terminology, it is often challenging to derive equivalences between them.
We describe an optimization framework for iterative graph processing that utilizes dataset dependencies. We explain several optimization techniques that exploit the asymmetrical behavior of graph algorithms, and we formally specify the conditions under which an algorithm can use a given technique. We also design template execution plans using a canonical set of dataflow operators, and we evaluate them using real-world datasets and applications. Our experiments show that optimized plans can significantly reduce execution time, often by an order of magnitude. Based on our experiments, we identify a trade-off that can be easily captured and could serve as the basis for automatic optimization of large-scale graph-processing applications.
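To make the notion of asymmetry concrete, the following is a minimal sketch, not one of the paper's template execution plans. It assumes connected components computed by minimum-label propagation as the example algorithm, and all function names (`build_adjacency`, `bulk_components`, `worklist_components`) are hypothetical. The bulk variant re-evaluates every vertex in every iteration, while the worklist variant re-evaluates only vertices that have a neighbor whose label changed in the previous iteration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bulk_components(adj):
    """Re-evaluate every vertex in every iteration until a fixed point."""
    labels = {v: v for v in adj}
    evaluations = 0
    while True:
        new_labels = {}
        for v in adj:
            evaluations += 1
            new_labels[v] = min([labels[v]] + [labels[n] for n in adj[v]])
        if new_labels == labels:  # fixed point reached
            return labels, evaluations
        labels = new_labels


def worklist_components(adj):
    """Re-evaluate only vertices with a neighbor that changed last iteration."""
    labels = {v: v for v in adj}
    active = set(adj)  # every vertex is active in the first iteration
    evaluations = 0
    while active:
        changed = set()
        new_labels = dict(labels)
        for v in active:
            evaluations += 1
            best = min([labels[v]] + [labels[n] for n in adj[v]])
            if best < labels[v]:
                new_labels[v] = best
                changed.add(v)
        labels = new_labels
        # Only neighbors of changed vertices can change in the next iteration.
        active = {n for v in changed for n in adj[v]}
    return labels, evaluations


# A six-vertex path (slow to converge) next to a small star (fast to converge).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (10, 11), (10, 12)]
adj = build_adjacency(edges)
bulk_labels, bulk_evals = bulk_components(adj)
work_labels, work_evals = worklist_components(adj)
print(bulk_labels == work_labels, bulk_evals, work_evals)
```

On this input the star converges after one iteration while the path needs several, so the bulk variant keeps re-evaluating vertices that are already inactive. Both variants reach the same fixed point; only the number of vertex evaluations differs.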



Cited By

  • (2021) An analysis of the graph processing landscape. Journal of Big Data 8:1. DOI: 10.1186/s40537-021-00443-9. Online publication date: 9-Apr-2021
  • (2018) High-Level Programming Abstractions for Distributed Graph Processing. IEEE Transactions on Knowledge and Data Engineering 30:2 (305-324). DOI: 10.1109/TKDE.2017.2762294. Online publication date: 1-Feb-2018





Published In

GRADES'14: Proceedings of Workshop on GRAph Data management Experiences and Systems
June 2014
79 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.



Association for Computing Machinery

New York, NY, United States



  • Tutorial
  • Research
  • Refereed limited



Acceptance Rates

Overall Acceptance Rate 29 of 61 submissions, 48%

