DOI: 10.1145/3087556.3087580

Julienne: A Framework for Parallel Graph Algorithms using Work-efficient Bucketing

Published: 24 July 2017

Abstract

Existing graph-processing frameworks let users develop efficient implementations for many graph problems, but none of them supports efficient bucketing of vertices, which is needed for bucketing-based graph algorithms such as Δ-stepping and approximate set cover. Motivated by the lack of simple, scalable, and efficient implementations of bucketing-based algorithms, we develop the Julienne framework, which extends a recent shared-memory graph processing framework called Ligra with an interface for maintaining a collection of buckets under vertex insertions and bucket deletions.
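To make the idea of "a collection of buckets under vertex insertions and bucket deletions" concrete, here is a minimal sequential sketch in Python. It is not Julienne's interface: the class name `Buckets` and the methods `update` and `next_bucket` are hypothetical names chosen for illustration, and the linear scan for the smallest bucket is not work-efficient.

```python
from collections import defaultdict


class Buckets:
    """Toy sequential bucket collection keyed by integer bucket id (hypothetical API)."""

    def __init__(self, priorities):
        # priorities: dict mapping vertex id -> initial bucket id
        self.bucket_of = dict(priorities)
        self.members = defaultdict(set)
        for v, b in self.bucket_of.items():
            self.members[b].add(v)

    def update(self, v, new_bucket):
        """Insert vertex v into new_bucket, removing it from its old bucket if any."""
        old = self.bucket_of.get(v)
        if old is not None:
            self.members[old].discard(v)
        self.bucket_of[v] = new_bucket
        self.members[new_bucket].add(v)

    def next_bucket(self):
        """Delete the smallest non-empty bucket and return (bucket id, sorted vertices)."""
        non_empty = [b for b, vs in self.members.items() if vs]
        if not non_empty:
            return None
        b = min(non_empty)  # linear scan: simple, but not work-efficient
        vertices = self.members.pop(b)
        for v in vertices:
            del self.bucket_of[v]
        return b, sorted(vertices)
```

A bucketing-based algorithm would repeatedly call `next_bucket`, process the returned vertices, and call `update` on any neighbors whose priority changed.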
We provide a theoretically efficient parallel implementation of our bucketing interface and study several bucketing-based algorithms that make use of it (either bucketing by remaining degree or by distance) to improve performance: the peeling algorithm for k-core (coreness), Δ-stepping, weighted breadth-first search, and approximate set cover. The implementations are all simple and concise (under 100 lines of code). Using our interface, we develop the first work-efficient parallel algorithm for k-core in the literature with nontrivial parallelism.
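As an illustration of bucketing by remaining degree, the following is a minimal sequential sketch of the peeling algorithm for coreness in Python. It is an assumption-laden toy, not the paper's parallel implementation: the function name `coreness` and the adjacency-dict input format are hypothetical, the graph is assumed to be simple and undirected, and each bucket is peeled in a plain loop where a parallel framework would process it as one round.

```python
def coreness(adj):
    """Return {vertex: coreness} for a simple undirected graph.

    adj: dict mapping each vertex to a list of its neighbors (assumed format).
    """
    degree = {v: len(ns) for v, ns in adj.items()}
    max_deg = max(degree.values(), default=0)
    # buckets[d] holds the unpeeled vertices whose remaining degree is d
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in degree.items():
        buckets[d].add(v)

    core = {}
    k = 0
    while k <= max_deg:
        if not buckets[k]:
            k += 1
            continue
        # Peel the entire bucket k as one round
        frontier = buckets[k]
        buckets[k] = set()
        for v in frontier:
            core[v] = k
        for v in frontier:
            for u in adj[v]:
                if u in core:
                    continue  # already peeled
                # Remaining degree drops by one but is clamped at k
                new_d = max(degree[u] - 1, k)
                if new_d != degree[u]:
                    buckets[degree[u]].discard(u)
                    degree[u] = new_d
                    buckets[new_d].add(u)
    return core
```

For example, on a triangle with one pendant vertex attached, the pendant vertex is peeled in bucket 1 and the triangle vertices in bucket 2. Clamping the remaining degree at `k` means a vertex that falls to the current level is picked up in the next round of the same bucket rather than being moved to a lower one.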
We experimentally show that our bucketing implementation scales well and achieves high throughput on both synthetic and real-world workloads. Furthermore, the bucketing-based algorithms written in Julienne achieve up to 43x speedup on 72 cores with hyper-threading over well-tuned sequential baselines, significantly outperform existing work-inefficient implementations in Ligra, and either outperform or are competitive with existing special-purpose parallel codes for the same problem. We experimentally study our implementations on the largest publicly available graphs and show that they scale well in practice, processing real-world graphs with billions of edges in seconds, and hundreds of billions of edges in a few minutes. As far as we know, this is the first time that graphs at this scale have been analyzed in the main memory of a single multicore machine.

Information & Contributors

Information

Published In

SPAA '17: Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures
July 2017
392 pages
ISBN: 9781450345934
DOI: 10.1145/3087556
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. experiments
  2. graph algorithms
  3. parallel programming
  4. shared memory

Qualifiers

  • Research-article

Conference

SPAA '17
Acceptance Rates

SPAA '17 Paper Acceptance Rate 31 of 127 submissions, 24%;
Overall Acceptance Rate 447 of 1,461 submissions, 31%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 231
  • Downloads (last 6 weeks): 30
Reflects downloads up to 06 Feb 2025

Cited By

  • (2024) Efficient Parallel D-Core Decomposition at Scale. Proceedings of the VLDB Endowment 17:10 (2654-2667). DOI: 10.14778/3675034.3675054. Online publication date: 1-Jun-2024
  • (2024) BYO: A Unified Framework for Benchmarking Large-Scale Graph Containers. Proceedings of the VLDB Endowment 17:9 (2307-2320). DOI: 10.14778/3665844.3665859. Online publication date: 1-May-2024
  • (2024) Brief Announcement: Improved Massively Parallel Triangle Counting in O(1) Rounds. Proceedings of the 43rd ACM Symposium on Principles of Distributed Computing (519-522). DOI: 10.1145/3662158.3662819. Online publication date: 17-Jun-2024
  • (2024) uBlade: Efficient Batch Processing for Uncertainty Graph Queries. Proceedings of the ACM on Management of Data 2:3 (1-24). DOI: 10.1145/3654982. Online publication date: 30-May-2024
  • (2024) Differentiating Set Intersections in Maximal Clique Enumeration by Function and Subproblem Size. Proceedings of the 38th ACM International Conference on Supercomputing (150-163). DOI: 10.1145/3650200.3656607. Online publication date: 30-May-2024
  • (2024) DAWN: Matrix Operation-Optimized Algorithm for Shortest Paths Problem on Unweighted Graphs. Proceedings of the 38th ACM International Conference on Supercomputing (1-13). DOI: 10.1145/3650200.3656600. Online publication date: 30-May-2024
  • (2024) Parallel Algorithms for Hierarchical Nucleus Decomposition. Proceedings of the ACM on Management of Data 2:1 (1-27). DOI: 10.1145/3639287. Online publication date: 26-Mar-2024
  • (2024) Parallel k-Core Decomposition with Batched Updates and Asynchronous Reads. Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (286-300). DOI: 10.1145/3627535.3638508. Online publication date: 2-Mar-2024
  • (2024) CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers. Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (348-363). DOI: 10.1145/3627535.3638492. Online publication date: 2-Mar-2024
  • (2024) When Is Parallelism Fearless and Zero-Cost with Rust? Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures (27-40). DOI: 10.1145/3626183.3659966. Online publication date: 17-Jun-2024
