Abstract
This paper presents a set of compiler optimizations, and strategies for applying them, for a common class of data-parallel loop nests. The arrays updated in the body of the loop nests are assumed to be partitioned into blocks (rectangular blocks, rows, or columns), with each block assigned to a processor.
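The block partitioning assumed above can be made concrete with a small sketch. It is an illustration only, not the paper's compiler: the function names `block_bounds` and `rect_block` are hypothetical, and so is the choice to give any leftover elements to the lowest-numbered processors. Row and column partitions are treated as special cases of a rectangular processor grid.

```python
def block_bounds(n, p, k):
    """Half-open index range [lo, hi) owned by processor k when n
    indices are split into p contiguous blocks (assumed policy: the
    first n % p processors each take one extra index)."""
    base, extra = divmod(n, p)
    lo = k * base + min(k, extra)
    hi = lo + base + (1 if k < extra else 0)
    return lo, hi


def rect_block(shape, grid, coord):
    """Row range and column range owned by the processor at grid
    position coord, for a 2-D array of the given shape."""
    (n_rows, n_cols), (p_rows, p_cols), (r, c) = shape, grid, coord
    return block_bounds(n_rows, p_rows, r), block_bounds(n_cols, p_cols, c)


if __name__ == "__main__":
    # Rectangular blocks: a 2 x 2 processor grid over an 8 x 8 array.
    print(rect_block((8, 8), (2, 2), (1, 0)))
    # Row blocks: a 4 x 1 grid; column blocks would use a 1 x 4 grid.
    print(rect_block((8, 8), (4, 1), (2, 0)))
```

A grid of `(p, 1)` gives row blocks and `(1, p)` gives column blocks, so one routine covers all three layouts named in the abstract.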
These optimizations are demonstrated in the context of a FORTRAN-90 compiler, with very encouraging preliminary results. For solving tridiagonal systems by Gaussian elimination, the optimized native code performs two orders of magnitude better than the code produced by the CM-FORTRAN compiler and approaches the performance of the hand-written Connection Machine Scientific Library (CMSSL) routine.
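For readers unfamiliar with the benchmark, the following is a minimal sequential sketch of Gaussian elimination on a tridiagonal system (often called the Thomas algorithm). It is not the paper's optimized code, the CM-FORTRAN version, or the CMSSL routine. The function name `solve_tridiagonal` and the array layout are assumptions made for illustration. No pivoting is performed, so the sketch assumes a well-conditioned (for example, diagonally dominant) matrix.

```python
def solve_tridiagonal(a, b, c, d):
    """Solve a tridiagonal system by Gaussian elimination without pivoting.

    a: sub-diagonal   (a[0] is unused)
    b: main diagonal
    c: super-diagonal (c[-1] is unused)
    d: right-hand side
    All four sequences have the same length n.
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side

    # Forward elimination: iteration i depends on iteration i - 1.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution: iteration i depends on iteration i + 1.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # 3 x 3 system with diagonal 2 and off-diagonals 1.
    print(solve_tridiagonal([0.0, 1.0, 1.0], [2.0, 2.0, 2.0],
                            [1.0, 1.0, 0.0], [4.0, 8.0, 8.0]))
```

Both sweeps carry a dependence between consecutive iterations, so when the arrays are partitioned across processors, data must cross block boundaries as the computation proceeds.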
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, M., Hu, Y. (1993). Compiler optimizations for massively parallel machines: Transformations on iterative spatial loops. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1992. Lecture Notes in Computer Science, vol 757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57502-2_50
DOI: https://doi.org/10.1007/3-540-57502-2_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57502-3
Online ISBN: 978-3-540-48201-7
eBook Packages: Springer Book Archive