Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article
Free access

Optimistic parallelism requires abstractions

Published: 01 September 2009 Publication History

Abstract

The problem of writing software for multicore processors is greatly simplified if we could automatically parallelize sequential programs. Although auto-parallelization has been studied for many decades, it has succeeded only in a few application areas such as dense matrix computations. In particular, auto-parallelization of irregular programs, which are organized around large, pointer-based data structures like graphs, has seemed intractable.
The Galois project is taking a fresh look at autoparallelization. Rather than attempt to parallelize all programs no matter how obscurely they are written, we are designing programming abstractions that permit programmers to highlight opportunities for exploiting parallelism in sequential programs, and building a runtime system that uses these hints to execute the program in parallel. In this paper, we describe the design and implementation of a system based on these ideas. Experimental results for two real-world irregular applications, a Delaunay mesh refinement application and a graphics application that performs agglomerative clustering, demonstrate that this approach is promising.

References

[1]
Burke, M., Carini, P., Choi, J.-D. Interprocedural Pointer Alias Analysis. Technical Report IBM RC 21055, IBM Yorktown Heights, 1997.
[2]
Chew, L.P. Guaranteed-quality mesh generation for curved surfaces. In SCG'93: Proceedings of the 9th Annual Symposium on Computational Geometry (1993), 274--280.
[3]
de Galas, J. The quest for more processing power: is the single core CPU doomed? http://www.anandtech.com/cpuchipsets/showdoc.aspx?I=2377, February 2005.
[4]
Diniz, P.C., Rinard, M.C. Commutativity analysis: a new analysis technique for parallelizing compilers. ACM Trans. Prog. Lang. Syst. 19, 6 (1997), 942--991.
[5]
Ghiya, R., Hendren, L. Is it a tree, a dag, or a cyclic graph? A shape analysis for heap-directed pointers c. In POPL, 1996.
[6]
Herlihy, M., Koskinen, E. Transactional boosting: a methodology for highlyconcurrent transactional objects. In Principles and Practices of Parallel Programming (PPoPP), 2008.
[7]
Herlihy, M., Moss, J.E.B. Transactional memory: architectural support for lock-free data structures. In ISCA '93: Proceedings of the 20th Annual International Symposium on Computer Architecture (1993).
[8]
Hudson, B., Miller, G.L., Phillips, T. Sparse parallel Delaunay mesh refinement. In SPAA (2007).
[9]
Jefferson, D.R. Virtual time. ACM Trans. Prog. Lang. Syst. 7, 3 (1985), 404--425.
[10]
Kennedy, K., Allen, J., editors. Optimizing Compilers for Modern Architectures: A Dependence-Based Approach. Morgan Kaufmann, 2001.
[11]
Kulkarni, M., Burtscher, M., Inkulu, R., Pingali, K., Cascaval, C. How much parallelism is there in irregular applications? In Principles and Practices of Parallel Programming (PPoPP), 2009.
[12]
Kulkarni, M., Carribault, P., Pingali, K., Ramanarayanan, G., Walter, B., Bala, K., Chew, L.P. Scheduling strategies for optimistic parallel execution of irregular programs. In Symposium on Parallel Architectures and Algorithms (SPAA) (2008).
[13]
Kulkarni M., Pingali, K., Ramanarayanan, G., Walter, B., Bala, K., Chew, L.P. Optimistic parallelism benefits from data partitioning. SIGARCH Comput. Archit. News 36, 1 (2008), 233--243.
[14]
Kulkarni, M., Pingali, K., Walter, B., Ramanarayanan, G., Bala, K., Chew, L.P. Optimistic parallelism requires abstractions. SIGPLAN Not (Proceedings of PLDI 2007) 42, 6 (2007), 211--222.
[15]
Larus, J., Rajwar, R. Transactional Memory (Synthesis Lectures on Computer Architecture). Morgan&Claypool Publishers, 2007.
[16]
Ni, Y., Menon, V., Adl-Tabatabai, A.-R., Hosking, A.L., Hudson, R., Moss, J.E.B., Saha, B., Shpeisman, T. Open nesting in software transactional memory. In Principles and Practices of Parallel Programming (PPoPP), 2007.
[17]
Pang-Ning Tan, M.S., Kumar, V., editors. Introduction to Data Mining. Pearson Addison Wesley, 2005.
[18]
Ponnusamy, R., Saltz, J., Choudhary, A. Runtime compilation techniques for data partitioning and communication schedule reuse. In Proceedings of the 1993 ACM/IEEE Conference on Supercomputing (1993).
[19]
Rauchwerger, L., Padua, D.A. The LRPD test: Speculative run-time parallelization of loops with privatization and reduction parallelization. IEEE Trans. Parallel Distrib. Syst. 10, 2 (1999), 160--180.
[20]
Sagiv, M., Reps, T., Wilhelm, R. Solving shape-analysis problems in languages with destructive updating. In Proceedings of the 23rd Annual ACM Symposium on the Principles of Programming Languages (st. Petersburg Beach, FL, January 1996).
[21]
Shewchuk, J.R. Triangle: Engineering a 2D quality mesh generator and Delaunay triangulator. In Applied Computational Geometry: Towards Geometric Engineering, volume 1148 of Lecture Notes in Computer Science. May 1996, 203--222.
[22]
Steffan, J.G., Colohan, C.B., Zhai, A., Mowry, T.C. A scalable approach to thread-level speculation. In ISCA '00: Proceedings of the 27th Annual International Symposium on Computer Architecture (2000).
[23]
Walter, B., Fernandez, S., Arbree, A., Bala, K., Donikian, M., Greenberg, D. Lightcuts: a scalable approach to illumination. ACM Trans. Graphics (SIGGRAPH) 24, 3 (July 2005), 1098--1107.
[24]
Zhan, L.R.Y., Torrellas, J. Hardware for speculative run-time parallelization in distributed shared-memory multiprocessors. In HPCA '98: Proceedings of the 4th International Symposium on High-Performance Computer Architecture (1998).

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Communications of the ACM
Communications of the ACM  Volume 52, Issue 9
The Status of the P versus NP Problem
September 2009
139 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/1562164
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 September 2009
Published in CACM Volume 52, Issue 9

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Research-article
  • Popular
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)325
  • Downloads (Last 6 weeks)96
Reflects downloads up to 28 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)A new thread-level speculative automatic parallelization model and library based on duplicate code executionThe Journal of Supercomputing10.1007/s11227-024-05987-0Online publication date: 11-Mar-2024
  • (2023)Sortledton: a Universal Graph Data StructureACM SIGMOD Record10.1145/3604437.360444252:1(17-25)Online publication date: 8-Jun-2023
  • (2022)SortledtonProceedings of the VLDB Endowment10.14778/3514061.351406515:6(1173-1186)Online publication date: 22-Jun-2022
  • (2019)Conflict Abstractions and Shadow Speculation for Optimistic Transactional ObjectsProgramming Languages and Systems10.1007/978-3-030-34175-6_16(313-331)Online publication date: 18-Nov-2019
  • (2017)Using the Xeon Phi Platform to Run Speculatively-Parallelized CodesInternational Journal of Parallel Programming10.1007/s10766-016-0421-x45:2(225-241)Online publication date: 1-Apr-2017
  • (2016)A Survey on Thread-Level Speculation TechniquesACM Computing Surveys10.1145/293836949:2(1-39)Online publication date: 30-Jun-2016
  • (2016)New Data Structures to Handle Speculative Parallelization at RuntimeInternational Journal of Parallel Programming10.1007/s10766-014-0347-044:3(407-426)Online publication date: 1-Jun-2016
  • (2015)Thinking Like a VertexACM Computing Surveys10.1145/281818548:2(1-39)Online publication date: 12-Oct-2015
  • (2013)Avoiding Locks and Atomic Instructions in Shared-Memory Parallel BFS Using Optimistic ParallelizationProceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum10.1109/IPDPSW.2013.241(1628-1637)Online publication date: 20-May-2013
  • (2013)Double Inspection for Run-Time Loop ParallelizationLanguages and Compilers for Parallel Computing10.1007/978-3-642-36036-7_4(46-60)Online publication date: 2013
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Digital Edition

View this article in digital edition.

Digital Edition

Magazine Site

View this article on the magazine site (external)

Magazine Site

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media