Abstract
The polyhedral model is a powerful framework for reasoning about high-level loop transformations. Recent developments in optimizing compilers have overturned some generally accepted ideas about the limitations of this model. First, thanks to advances in dependence analysis for irregular access patterns, its applicability, once thought to be limited to very simple loop nests, has been extended to wide code regions. Second, new algorithms have made it possible to generate target code for hundreds of statements, although this code generation step was expected not to scale. Such theoretical advances and new software tools have allowed researchers in both academia and industry to study more complex and realistic cases. Unfortunately, even when a given transformation has strong optimization potential, e.g., for parallelism or data locality, code generation may still be challenging or may result in high control overhead. This paper presents scalable code generation methods that make it possible to apply increasingly complex program transformations. By studying the transformations themselves, we show how their properties can be exploited to dramatically improve both the quality of the generated code and the space/time complexity of code generation, compared with the best state-of-the-art code generation tool. In addition, we build on these improvements to present a new algorithm that improves generated code performance for strided domains and reindexed schedules.
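To make the notion of control overhead on strided domains concrete, here is a minimal sketch. It is not the algorithm of the paper; the function names and the one-dimensional domain {i | lo <= i <= hi, i ≡ offset (mod stride)} are assumptions chosen purely for illustration. The first version scans every integer and filters with a modulo guard, while the second computes the first valid point and steps by the stride, so no test remains inside the loop.

```python
def scan_with_guard(lo, hi, stride, offset):
    """Visit every i in [lo, hi] and keep those congruent to offset modulo stride.

    One modulo test is executed per iteration: this is the control overhead.
    """
    points = []
    for i in range(lo, hi + 1):
        if (i - offset) % stride == 0:
            points.append(i)
    return points


def scan_with_stride(lo, hi, stride, offset):
    """Jump directly from one valid point to the next (hypothetical helper).

    Assumes stride > 0. Python's % returns a non-negative result for a
    positive modulus, so `first` is the smallest valid point >= lo.
    """
    first = lo + (offset - lo) % stride
    return list(range(first, hi + 1, stride))


if __name__ == "__main__":
    # Both scans enumerate the same points in the same order.
    print(scan_with_guard(0, 10, 3, 1))
    print(scan_with_stride(0, 10, 3, 1))
```

Both functions enumerate the same points in the same order; the strided version simply moves the congruence test out of the loop body and into the loop bounds.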
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vasilache, N., Bastoul, C., Cohen, A. (2006). Polyhedral Code Generation in the Real World. In: Mycroft, A., Zeller, A. (eds) Compiler Construction. CC 2006. Lecture Notes in Computer Science, vol 3923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11688839_16
DOI: https://doi.org/10.1007/11688839_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33050-9
Online ISBN: 978-3-540-33051-6
eBook Packages: Computer Science, Computer Science (R0)