Compiler optimizations for massively parallel machines: Transformations on iterative spatial loops

Chen, M.; Hu, Y.

doi:10.1007/3-540-57502-2_50

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 757)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

This paper presents a set of compiler optimizations and their application strategies for a common class of data parallel loop nests. The arrays updated in the body of the loop nests are assumed to be partitioned into blocks (rectangular, rows, or columns) where each block is assigned to a processor.
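The block partitioning assumed above can be illustrated with a small sketch. It is not taken from the paper: the function name `block_owner` and the `shape` and `grid` parameters are hypothetical, and the mapping shown is the usual contiguous block distribution, assumed here only for illustration. A processor grid of (p, 1) gives row blocks, (1, p) gives column blocks, and any other grid gives rectangular blocks.

```python
from math import ceil

def block_owner(i, j, shape, grid):
    """Return grid coordinates of the processor that owns element (i, j).

    shape = (n_rows, n_cols) of the array (hypothetical parameter).
    grid  = (p_rows, p_cols) of the processor grid (hypothetical parameter).
    Assumes a contiguous block distribution with ceiling-sized blocks.
    """
    n_rows, n_cols = shape
    p_rows, p_cols = grid
    block_h = ceil(n_rows / p_rows)  # rows per block
    block_w = ceil(n_cols / p_cols)  # columns per block
    return (i // block_h, j // block_w)

# An 8x8 array on four processors, under three layouts.
shape = (8, 8)
rect = block_owner(5, 2, shape, (2, 2))  # rectangular blocks -> (1, 0)
rows = block_owner(5, 2, shape, (4, 1))  # row blocks         -> (2, 0)
cols = block_owner(5, 2, shape, (1, 4))  # column blocks      -> (0, 1)
```

Under this assumed mapping, element (5, 2) lands on a different processor in each layout, which is why the choice of partition affects which loop iterations need communication.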

These optimizations are demonstrated in the context of a FORTRAN-90 compiler, with very encouraging preliminary results. For solving tridiagonal systems by Gaussian elimination, the performance of the optimized native code is two orders of magnitude better than that of the code produced by the CM-FORTRAN compiler and approaches that of the hand-written Connection Machine Scientific Software Library (CMSSL) routine.
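For readers unfamiliar with the benchmark, the following is a minimal sequential sketch of Gaussian elimination on a tridiagonal system (the textbook form without pivoting). It is not the paper's parallel or optimized code; the function name `solve_tridiagonal` and its argument layout are assumptions made for illustration. It does show the loop-carried dependence from row i-1 to row i that makes this kernel an interesting target for loop transformations.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by Gaussian elimination without pivoting.

    lower[i]: sub-diagonal entry of row i (lower[0] is unused)
    diag[i]:  diagonal entry of row i
    upper[i]: super-diagonal entry of row i (upper[n-1] is unused)
    rhs[i]:   right-hand side of row i
    Assumes no zero pivots arise (e.g. a diagonally dominant matrix).
    """
    n = len(diag)
    d = list(diag)  # work on copies so the inputs are left unchanged
    b = list(rhs)

    # Forward elimination: each row depends on the row before it.
    for i in range(1, n):
        factor = lower[i] / d[i - 1]
        d[i] -= factor * upper[i - 1]
        b[i] -= factor * b[i - 1]

    # Back substitution: each unknown depends on the one after it.
    x = [0.0] * n
    x[n - 1] = b[n - 1] / d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
    return x

# Matrix [[2, 1, 0], [1, 2, 1], [0, 1, 2]] with right-hand side [3, 4, 3];
# the exact solution is [1, 1, 1].
solution = solve_tridiagonal([0, 1, 1], [2, 2, 2], [1, 1, 0], [3, 4, 3])
```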

Author information

Authors and Affiliations

Yale University, New Haven, Connecticut
M. Chen & Y. Hu

Editor information

Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

Cite this paper

Chen, M., Hu, Y. (1993). Compiler optimizations for massively parallel machines: Transformations on iterative spatial loops. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1992. Lecture Notes in Computer Science, vol 757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57502-2_50

DOI: https://doi.org/10.1007/3-540-57502-2_50
Published: 03 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57502-3
Online ISBN: 978-3-540-48201-7
eBook Packages: Springer Book Archive
