DOI: 10.1145/1375581.1375594
research-article

Iterative Optimization in the Polyhedral Model: Part II, Multidimensional Time

Published: 07 June 2008

Abstract

High-level loop optimizations are necessary to achieve good performance over a wide variety of processors. Their performance impact can be significant because they involve in-depth program transformations that aim to sustain a balanced workload over the computational, storage, and communication resources of the target architecture. Therefore, it is mandatory that the compiler accurately models the target architecture as well as the effects of complex code restructuring.
However, because optimizing compilers (1) use simplistic performance models that abstract away many of the complexities of modern architectures, (2) rely on inaccurate dependence analysis, and (3) lack frameworks to express complex interactions of transformation sequences, they typically uncover only a fraction of the peak performance available on many applications. We propose a complete iterative framework to address these issues. We rely on the polyhedral model to construct and traverse a large and expressive search space. This space encompasses only legal, distinct versions resulting from the restructuring of any static-control loop nest. We first propose a feedback-driven iterative heuristic tailored to the search space properties of the polyhedral model. Although it converges quickly to good solutions for small kernels, larger benchmarks with higher-dimensional spaces are more challenging, and the heuristic misses opportunities for significant performance improvement. We therefore introduce a genetic algorithm with specialized operators that leverage the polyhedral representation of program dependences. We provide experimental evidence that the genetic algorithm effectively traverses huge optimization spaces, achieving good performance improvements on large loop nests.
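
The idea of a genetic search restricted to legal schedules can be illustrated with a toy sketch. This is not the authors' implementation: the dependence vectors, the coefficient bounds, the 2x2 schedule matrices, the placeholder `cost` function, and all function names below are assumptions chosen for illustration. A real system would generate code for each candidate and measure its run time instead of calling `cost`, and would not enumerate the whole space to seed the population.

```python
import itertools
import random

# Assumed dependence distance vectors for a 2-deep loop nest (illustrative only).
DEPENDENCES = [(1, 0), (0, 1)]
COEFFS = (-1, 0, 1)  # assumed bound on schedule coefficients


def lex_positive(v):
    """True if the first nonzero entry of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False


def is_legal(T):
    """T is a 2x2 schedule matrix, given as a tuple of two rows."""
    (a, b), (c, d) = T
    if a * d - b * c == 0:  # require an invertible (one-to-one) schedule
        return False
    # Every dependence must be carried forward in lexicographic time.
    return all(
        lex_positive((a * di + b * dj, c * di + d * dj))
        for di, dj in DEPENDENCES
    )


def legal_space():
    """Enumerate all legal schedules; only feasible for this tiny toy space."""
    rows = list(itertools.product(COEFFS, repeat=2))
    return [T for T in itertools.product(rows, repeat=2) if is_legal(T)]


def cost(T):
    """Placeholder score (hypothetical); stands in for a measured run time."""
    return sum(abs(x) for row in T for x in row)


def mutate(T, rng):
    """Change one coefficient; keep the change only if the result is legal."""
    rows = [list(r) for r in T]
    rows[rng.randrange(2)][rng.randrange(2)] = rng.choice(COEFFS)
    child = tuple(tuple(r) for r in rows)
    return child if is_legal(child) else T


def crossover(p, q, rng):
    """Take each schedule row from one parent; fall back to p if illegal."""
    child = tuple(rng.choice((p[k], q[k])) for k in range(2))
    return child if is_legal(child) else p


def evolve(generations=20, size=8, seed=0):
    """Keep the better half each generation and refill with legal offspring."""
    rng = random.Random(seed)
    space = legal_space()
    pop = [rng.choice(space) for _ in range(size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(size - len(parents))
        ]
        pop = parents + children
    return min(pop, key=cost)
```

Calling `evolve()` returns one legal schedule matrix. Because both operators reject illegal offspring, every individual in every generation respects the assumed dependences, which is the property the abstract attributes to its search space.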

    Published In

    PLDI '08: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2008
    396 pages
    ISBN:9781595938602
    DOI:10.1145/1375581
    • General Chair: Rajiv Gupta
    • Program Chair: Saman Amarasinghe
    • ACM SIGPLAN Notices, Volume 43, Issue 6
      PLDI '08
      June 2008
      382 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/1379022
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. affine scheduling
    2. genetic algorithm
    3. iterative compilation
    4. loop transformation

    Qualifiers

    • Research-article

    Conference

    PLDI '08

    Acceptance Rates

    Overall Acceptance Rate 406 of 2,067 submissions, 20%

    Cited By

    • (2021) Simplifying dependent reductions in the polyhedral model. Proceedings of the ACM on Programming Languages 5:POPL (1-33). DOI: 10.1145/3434301. Online publication date: 4-Jan-2021
    • (2021) Learning based compilation of embedded applications targeting minimal energy consumption. Journal of Systems Architecture: the EUROMICRO Journal 116:C. DOI: 10.1016/j.sysarc.2021.102116. Online publication date: 1-Jun-2021
    • (2020) Fast linear programming through transprecision computing on small and sparse data. Proceedings of the ACM on Programming Languages 4:OOPSLA (1-28). DOI: 10.1145/3428263. Online publication date: 13-Nov-2020
    • (2020) Autotuning Search Space for Loop Transformations. 2020 IEEE/ACM 6th Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC) and Workshop on Hierarchical Parallelism for Exascale Computing (HiPar) (12-22). DOI: 10.1109/LLVMHPCHiPar51896.2020.00007. Online publication date: Nov-2020
    • (2020) Smart selection of optimizations in dynamic compilers. Concurrency and Computation: Practice and Experience 33:18. DOI: 10.1002/cpe.6089. Online publication date: 26-Nov-2020
    • (2018) Speeding up Iterative Polyhedral Schedule Optimization with Surrogate Performance Models. ACM Transactions on Architecture and Code Optimization 15:4 (1-27). DOI: 10.1145/3291773. Online publication date: 19-Dec-2018
    • (2018) An empirical study of the effect of source-level loop transformations on compiler stability. Proceedings of the ACM on Programming Languages 2:OOPSLA (1-29). DOI: 10.1145/3276496. Online publication date: 24-Oct-2018
    • (2018) Polyhedral Search Space Exploration in the ExaStencils Code Generator. ACM Transactions on Architecture and Code Optimization 15:4 (1-25). DOI: 10.1145/3274653. Online publication date: 10-Oct-2018
    • (2018) A Survey on Compiler Autotuning using Machine Learning. ACM Computing Surveys 51:5 (1-42). DOI: 10.1145/3197978. Online publication date: 18-Sep-2018
    • (2018) Machine Learning in Compiler Optimization. Proceedings of the IEEE 106:11 (1879-1901). DOI: 10.1109/JPROC.2018.2817118. Online publication date: Nov-2018
