Improving the performance of runtime parallelization

Published: 01 July 1993

Abstract

When the dependency pattern among the iterations of a loop cannot be determined statically, compile-time parallelization of the loop is not possible. In these cases, runtime parallelization [8] is the only alternative. The idea is to transform the loop into two code fragments: the inspector and the executor. When the program is run, the inspector examines the iteration dependencies and constructs a parallel schedule. The executor subsequently uses that schedule to carry out the actual computation in parallel.
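The inspector/executor split can be illustrated with a minimal sketch. It assumes a hypothetical source loop of the form `x[writes[i]] = x[reads[i]] + 1`, where the index arrays `reads` and `writes` are known only at run time. The function names and the wavefront scheduling strategy are illustrative assumptions, not the paper's implementation, and the executor here only simulates parallel execution by visiting each wavefront in an arbitrary order.

```python
def inspector(reads, writes):
    """Group iteration indices into wavefronts; iterations in one wavefront are independent."""
    last_write = {}   # element -> wavefront of the latest iteration that wrote it
    last_access = {}  # element -> highest wavefront of any iteration that touched it
    schedule = []
    for i, (r, w) in enumerate(zip(reads, writes)):
        # Flow dependence: run after the last writer of the element we read.
        # Anti/output dependence: run after every earlier access to the element we write.
        wave = max(last_write.get(r, -1), last_access.get(w, -1)) + 1
        if wave == len(schedule):
            schedule.append([])
        schedule[wave].append(i)
        last_write[w] = wave
        last_access[w] = wave
        last_access[r] = max(last_access.get(r, -1), wave)
    return schedule


def executor(schedule, x, reads, writes):
    """Run the wavefronts in order; within a wavefront any order (or true parallelism) is safe."""
    for wavefront in schedule:
        for i in reversed(wavefront):  # reversed only to show that order does not matter
            x[writes[i]] = x[reads[i]] + 1
    return x


def sequential(x, reads, writes):
    """The original loop, kept as a reference for checking the executor."""
    for r, w in zip(reads, writes):
        x[w] = x[r] + 1
    return x


reads, writes = [0, 1, 2, 1], [1, 2, 3, 4]
schedule = inspector(reads, writes)
result = executor(schedule, [10, 0, 0, 0, 0], reads, writes)
```

In this example iterations 1 and 3 both depend only on iteration 0, so they share a wavefront, while iteration 2 must wait for iteration 1.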
In this paper, we show how to reduce the overhead of running the inspector by executing it in parallel. We describe two related approaches. The first, which emphasizes inspector efficiency, achieves nearly linear speedup relative to a sequential execution of the inspector, but produces a schedule that may be less efficient for the executor. The second technique, which emphasizes executor efficiency, does not in general achieve linear speedup of the inspector, but is guaranteed to produce the best achievable schedule. We present these techniques, show that they are correct, and compare their performance to existing techniques using a set of experiments.
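The trade-off between inspector speed and schedule quality can be sketched as follows. This is a hypothetical illustration, not the algorithm from the paper: it assumes the same loop shape `x[writes[i]] = x[reads[i]] + 1`, splits the iterations into contiguous blocks that are inspected independently (so each block could be given to its own processor), and then places each block's wavefronts after those of the preceding block. The names `wavefronts` and `blockwise_inspector` are made up for this sketch.

```python
def wavefronts(reads, writes, start, stop):
    """Level iterations start..stop-1 sequentially, ignoring all iterations before start."""
    last_write, last_access, levels = {}, {}, []
    for i in range(start, stop):
        r, w = reads[i], writes[i]
        level = max(last_write.get(r, -1), last_access.get(w, -1)) + 1
        levels.append(level)
        last_write[w] = level
        last_access[w] = level
        last_access[r] = max(last_access.get(r, -1), level)
    return levels


def blockwise_inspector(reads, writes, num_blocks):
    """Inspect contiguous blocks independently, then stack their wavefronts end to end."""
    n = len(reads)
    bounds = [n * b // num_blocks for b in range(num_blocks + 1)]
    # These calls share no state, so they could run concurrently.
    per_block = [wavefronts(reads, writes, bounds[b], bounds[b + 1])
                 for b in range(num_blocks)]
    levels, offset = [], 0
    for block in per_block:
        # Every iteration of a block is scheduled after the whole previous block,
        # which keeps the schedule valid without looking at cross-block dependencies.
        levels.extend(offset + level for level in block)
        offset += max(block, default=-1) + 1
    return levels


# Four mutually independent iterations: all read element 0, each writes its own element.
reads, writes = [0, 0, 0, 0], [1, 2, 3, 4]
one_block = blockwise_inspector(reads, writes, 1)
two_blocks = blockwise_inspector(reads, writes, 2)
```

With one block the sketch finds that all four iterations can run at once. With two blocks the inspection work is halved per block, but the resulting schedule needs two wavefronts instead of one, which is the kind of executor inefficiency the first approach accepts.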
Because in this paper we are optimizing inspector time, but leaving the executor unchanged, the techniques we present have the most dramatic effect when the inspector must be run for each invocation of the source loop. In a companion paper [3], we explore techniques that build upon those developed here to also improve executor performance.

References

[1]
Dimitri P. Bertsekas and John N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Prentice Hall, Englewood Cliffs, NJ, 1989.
[2]
Edward D. Lazowska, John Zahorjan, G. Scott Graham, and Kenneth C. Sevcik. Quantitative System Performance. Prentice Hall, Englewood Cliffs, NJ, 1984.
[3]
S. Leung and J. Zahorjan. Extending the domain and improving the execution performance of runtime parallelization. Technical Report, in preparation, Department of Computer Science & Engineering, University of Washington, October 1992.
[4]
Raif O. Onvural. Survey of closed queueing networks with blocking. ACM Computing Surveys, 22(2):83-121, June 1990.
[5]
William H. Press, Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, Cambridge, 1986.
[6]
J. Saltz, H. Berryman, and J. Wu. Multiprocessors and runtime compilation. In Proc. International Workshop on Compilers for Parallel Computers, Paris, 1990.
[7]
J. Saltz and R. Mirchandaney. The preprocessed doacross loop. In Proc. 1991 International Conference on Parallel Processing, August 1991.
[8]
J. Saltz, R. Mirchandaney, and K. Crowley. Runtime parallelization and scheduling of loops. IEEE Transactions on Computers, 40(5):603-612, May 1991.

Published In

ACM SIGPLAN Notices  Volume 28, Issue 7
July 1993
259 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/173284
  • PPOPP '93: Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
    August 1993
    259 pages
    ISBN:0897915895
    DOI:10.1145/155332
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Cited By

  • (2013) Double Inspection for Run-Time Loop Parallelization. Languages and Compilers for Parallel Computing, pp. 46-60. DOI: 10.1007/978-3-642-36036-7_4. Online publication date: 2013
  • (2009) Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors. ACM SIGPLAN Notices 44:4, pp. 219-228. DOI: 10.1145/1594835.1504209. Online publication date: 14-Feb-2009
  • (2009) Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors. Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 219-228. DOI: 10.1145/1504176.1504209. Online publication date: 14-Feb-2009
  • (2009) A study of potential parallelism among traces in Java programs. Science of Computer Programming 74:5-6, pp. 296-313. DOI: 10.1016/j.scico.2009.01.004. Online publication date: 1-Mar-2009
  • (2005) The SPNT test: A new technology for run-time speculative parallelization of loops. Languages and Compilers for Parallel Computing, pp. 177-191. DOI: 10.1007/BFb0032691. Online publication date: 9-Jun-2005
  • (2005) Run-time parallelization of irregular DOACROSS loops. Parallel Algorithms for Irregularly Structured Problems, pp. 75-80. DOI: 10.1007/3-540-60321-2_5. Online publication date: 4-Jun-2005
  • (2004) Probabilistic program analysis for parallelizing compilers. Proceedings of the 6th international conference on High Performance Computing for Computational Science, pp. 610-622. DOI: 10.1007/11403937_46. Online publication date: 28-Jun-2004
  • (2001) Techniques for Reducing the Overhead of Run-Time Parallelization. Compiler Construction, pp. 232-248. DOI: 10.1007/3-540-46423-9_16. Online publication date: 1-Jun-2001
  • (2000) Principles of Speculative Run-Time Parallelization. Languages and Compilers for Parallel Computing, pp. 323-337. DOI: 10.1007/3-540-48319-5_21. Online publication date: 12-May-2000
  • (1999) Eliminating synchronization overhead in automatically parallelized programs using dynamic feedback. ACM Transactions on Computer Systems 17:2, pp. 89-132. DOI: 10.1145/312203.312210. Online publication date: 1-May-1999