Executing Dynamic Data-Graph Computations Deterministically Using Chromatic Scheduling

Published: 18 July 2016

Abstract

A data-graph computation—popularized by such programming systems as Galois, Pregel, GraphLab, PowerGraph, and GraphChi—is an algorithm that performs local updates on the vertices of a graph. During each round of a data-graph computation, an update function atomically modifies the data associated with a vertex as a function of the vertex’s prior data and that of adjacent vertices. A dynamic data-graph computation updates only an active subset of the vertices during a round, and those updates determine the set of active vertices for the next round.
This article introduces Prism, a chromatic-scheduling algorithm for executing dynamic data-graph computations. Prism uses a vertex coloring of the graph to coordinate updates performed in a round, precluding the need for mutual-exclusion locks or other nondeterministic data synchronization. A multibag data structure is used by Prism to maintain a dynamic set of active vertices as an unordered set partitioned by color. We analyze Prism using work-span analysis. Let G = (V, E) be a degree-Δ graph colored with χ colors, and suppose that Q ⊆ V is the set of active vertices in a round. Define size(Q) = |Q| + ∑_{v ∈ Q} deg(v), which is proportional to the space required to store the vertices of Q using a sparse-graph layout. We show that a P-processor execution of Prism performs updates in Q using O(χ(lg(Q/χ) + lg Δ) + lg P) span and Θ(size(Q) + P) work.
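The scheduling idea in the paragraph above can be illustrated with a minimal serial sketch in Python. This is not Prism's implementation: the function names (greedy_coloring, size, chromatic_round, min_label_update), the dict-of-lists graph layout, and the update-function signature are all assumptions made for illustration. The dictionary by_color only stands in for the multibag. Vertices of one color are pairwise non-adjacent, so the inner loop over a color class could run in parallel without locks; here it runs serially.

```python
from collections import defaultdict


def greedy_coloring(adj):
    """Give each vertex the smallest color not used by an already-colored neighbor."""
    color = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def size(adj, active):
    """size(Q) = |Q| + sum of deg(v) over v in Q, as defined in the abstract."""
    return len(active) + sum(len(adj[v]) for v in active)


def chromatic_round(adj, color, data, active, update):
    """Run one round: visit the active vertices one color class at a time.

    update(v, data, adj) may modify data[v] only and returns the set of
    vertices to activate for the next round (hypothetical signature).
    """
    by_color = defaultdict(list)  # active set partitioned by color
    for v in active:
        by_color[color[v]].append(v)

    next_active = set()
    for c in sorted(by_color):
        # No two vertices in by_color[c] are adjacent, so these updates
        # touch disjoint neighborhoods and could execute in parallel.
        for v in by_color[c]:
            next_active.update(update(v, data, adj))
    return next_active


def min_label_update(v, data, adj):
    """Example update: adopt the smallest label seen in the neighborhood."""
    best = min([data[v]] + [data[u] for u in adj[v]])
    if best < data[v]:
        data[v] = best
        return set(adj[v])  # neighbors must re-check their labels
    return set()


if __name__ == "__main__":
    # A path 0-1-2-3 and a separate edge 4-5.
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
    color = greedy_coloring(adj)
    data = {v: v for v in adj}
    active = set(adj)
    while active:
        active = chromatic_round(adj, color, data, active, min_label_update)
    print(data)
```

Because an update writes only its own vertex and reads only its neighbors, which belong to other color classes, the final labels do not depend on the order in which vertices within a color class are visited.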
These theoretical guarantees are matched by good empirical performance. To isolate the effect of the scheduling algorithm on performance, we modified GraphLab to incorporate Prism and studied seven application benchmarks on a 12-core multicore machine. Prism executes the benchmarks 1.2 to 2.1 times faster than GraphLab’s nondeterministic lock-based scheduler while providing deterministic behavior.
This article also presents Prism-R, a variation of Prism that executes dynamic data-graph computations deterministically even when updates modify global variables with associative operations. Prism-R satisfies the same theoretical bounds as Prism, but its implementation is more involved, incorporating a multivector data structure to maintain a deterministically ordered set of vertices partitioned by color. Despite its additional complexity, Prism-R is only marginally slower than Prism. On the seven application benchmarks studied, Prism-R incurs a 7% geometric mean overhead relative to Prism.
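Why a deterministically ordered vertex set matters for associative updates to global variables can be shown with a small Python sketch. It is an illustration only and does not reproduce the multivector or any other part of Prism-R: the names ordered_contributions and chunked_reduce, and the {color: {vertex: value}} layout, are hypothetical. The sketch combines per-vertex contributions in a fixed (color, vertex) order. Since the operation is associative, splitting that ordered sequence into chunks of any size, as a parallel runtime might, gives the same result.

```python
from functools import reduce


def ordered_contributions(by_color):
    """Flatten {color: {vertex: value}} in ascending (color, vertex) order."""
    return [by_color[c][v] for c in sorted(by_color) for v in sorted(by_color[c])]


def chunked_reduce(values, op, identity, chunk_size):
    """Reduce values in order, one chunk at a time, then combine the partials.

    For an associative op with the given identity, the result is the same
    for every chunk_size >= 1, even when op is not commutative.
    """
    partials = [
        reduce(op, values[i:i + chunk_size], identity)
        for i in range(0, len(values), chunk_size)
    ]
    return reduce(op, partials, identity)


if __name__ == "__main__":
    # String concatenation is associative but not commutative, so any
    # change in ordering would be visible in the output.
    by_color = {1: {3: "d", 1: "b"}, 0: {2: "c", 0: "a"}}
    values = ordered_contributions(by_color)
    for k in (1, 2, 3, 4):
        print(k, chunked_reduce(values, lambda x, y: x + y, "", k))
```

With an unordered set of vertices per color, the sequence handed to the reduction could differ from run to run, and a non-commutative operation would then expose the difference.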

Published In

ACM Transactions on Parallel Computing, Volume 3, Issue 1
Special Issue for SPAA 2014
June 2016
192 pages
ISSN: 2329-4949
EISSN: 2329-4957
DOI: 10.1145/2965648
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 July 2016
Accepted: 01 February 2016
Revised: 01 November 2015
Received: 01 August 2014
Published in TOPC Volume 3, Issue 1

Author Tags

  1. Data-graph computations
  2. chromatic scheduling
  3. determinism
  4. multicore
  5. multithreading
  6. parallel programming
  7. scheduling
  8. work stealing

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • NSF Graduate Research Fellowship
  • National Science Foundation
  • Intel Corporation and Foxconn Technology Group

Cited By

  • (2022) High-performance and balanced parallel graph coloring on multicore platforms. The Journal of Supercomputing 79:6 (6373-6421). DOI: 10.1007/s11227-022-04894-6. Online publication date: 7-Nov-2022
  • (2021) BiPart. Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (161-174). DOI: 10.1145/3437801.3441611. Online publication date: 17-Feb-2021
  • (2019) Graph Coloring on the GPU. 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (231-240). DOI: 10.1109/IPDPSW.2019.00046. Online publication date: May-2019
  • (2018) Brief Announcement. Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures (351-353). DOI: 10.1145/3210377.3210658. Online publication date: 11-Jul-2018
  • (2017) A Multicore Path to Connectomics-on-Demand. ACM SIGPLAN Notices 52:8 (267-281). DOI: 10.1145/3155284.3018766. Online publication date: 26-Jan-2017
  • (2017) A Multicore Path to Connectomics-on-Demand. Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (267-281). DOI: 10.1145/3018743.3018766. Online publication date: 26-Jan-2017
