research-article
Open access

Iterative Schedule Optimization for Parallelization in the Polyhedron Model

Published: 22 August 2017 Publication History
  • Abstract

    The polyhedron model is a powerful framework for systematically identifying and applying loop transformations that improve data locality (e.g., via tiling) and enable parallelization. In the polyhedron model, a loop transformation is, in essence, represented as an affine function. Well-established algorithms for discovering promising transformations are based on performance models. Their drawback is that they cannot easily be adapted to the characteristics of a specific program or target hardware. An iterative search for promising loop transformations is more easily adaptable and can help to learn better models. We present an iterative optimization method in the polyhedron model that targets tiling and parallelization. The method supports either random sampling of the search space of legal loop transformations or a more directed search via a genetic algorithm. For the latter, we propose a set of novel, tailored reproduction operators. We evaluate our approach against existing iterative and model-driven optimization strategies, and we compare the convergence rate of our genetic algorithm to that of random exploration. Our iterative approach outperforms existing optimization techniques in that it finds loop transformations that yield significantly higher performance. If well configured, random exploration turns out to be very effective and reduces the need for a genetic algorithm.
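
To make the idea of "a loop transformation as an affine function" and "random sampling of legal transformations" concrete, here is a minimal sketch in Python. It is not the paper's implementation: it assumes a two-dimensional loop nest with uniform dependence distance vectors, restricts schedules to small integer matrices without constant offsets, and uses rejection sampling. All names (`apply_schedule`, `is_legal`, `inner_loops_parallel`, `sample_legal_schedule`) are hypothetical and chosen for illustration only.

```python
import random

def apply_schedule(matrix, point):
    """Map an iteration point (or distance vector) to its time vector."""
    return tuple(sum(c * x for c, x in zip(row, point)) for row in matrix)

def is_legal(matrix, dependences):
    """Assumed legality test for uniform dependences: every dependence
    distance must map to a lexicographically positive time vector, so the
    source iteration still runs before the target iteration."""
    zero = (0,) * len(matrix)
    # Python compares tuples lexicographically, which is what we need here.
    return all(apply_schedule(matrix, d) > zero for d in dependences)

def inner_loops_parallel(matrix, dependences):
    """If the outermost schedule dimension strictly carries every dependence,
    no dependence remains between iterations of the inner loops."""
    return all(apply_schedule(matrix, d)[0] > 0 for d in dependences)

def sample_legal_schedule(dependences, dim, rng, lo=-1, hi=2, max_tries=1000):
    """Rejection sampling (an illustrative simplification): draw random
    coefficient matrices until one passes the legality test."""
    for _ in range(max_tries):
        candidate = [[rng.randint(lo, hi) for _ in range(dim)]
                     for _ in range(dim)]
        if is_legal(candidate, dependences):
            return candidate
    return None  # no legal schedule found within the try budget

# Dependences of a stencil such as A[i][j] = A[i-1][j] + A[i][j-1].
DEPS = [(1, 0), (0, 1)]

IDENTITY = [[1, 0], [0, 1]]   # original loop order
SKEW = [[1, 1], [0, 1]]       # wavefront: outer time dimension is i + j
REVERSAL = [[-1, 0], [0, 1]]  # reverses the outer loop

if __name__ == "__main__":
    print("skewed time of (2, 3):", apply_schedule(SKEW, (2, 3)))
    print("identity legal:", is_legal(IDENTITY, DEPS))
    print("reversal legal:", is_legal(REVERSAL, DEPS))
    print("skew exposes inner parallelism:", inner_loops_parallel(SKEW, DEPS))
    print("random legal schedule:",
          sample_legal_schedule(DEPS, 2, random.Random(0)))
```

In this toy setting the skewed schedule is legal and moves all dependences to the outer dimension, so the inner loop could be run in parallel, whereas the identity schedule is legal but leaves a dependence on the inner loop. A genetic algorithm, as proposed in the abstract, would replace the blind resampling with reproduction operators over such schedule matrices; that part is not sketched here.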

    Supplementary Material

    TACO1402-23 (taco1403-23.pdf)
    Slide deck associated with this paper



    Information & Contributors

    Information

    Published In

    ACM Transactions on Architecture and Code Optimization  Volume 14, Issue 3
    September 2017
    278 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/3132652
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 August 2017
    Accepted: 01 June 2017
    Revised: 01 May 2017
    Received: 01 December 2016
    Published in TACO Volume 14, Issue 3


    Author Tags

    1. Automatic loop optimization
    2. OpenMP
    3. genetic algorithm
    4. parallelization
    5. polyhedron model
    6. tiling

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • German Research Foundation


    Cited By

    • (2023) Cache Programming for Scientific Loops Using Leases. ACM Transactions on Architecture and Code Optimization 20:3 (1-25). DOI: 10.1145/3600090. Online publication date: 19-Jul-2023
    • (2023) Super-quadric CFD-DEM study of spout deflection behaviour of non-spherical particles in a spout fluidized bed. Powder Technology (119240). DOI: 10.1016/j.powtec.2023.119240. Online publication date: Dec-2023
    • (2022) Investigating magic numbers: improving the inlining heuristic in the Glasgow Haskell Compiler. Proceedings of the 15th ACM SIGPLAN International Haskell Symposium (81-94). DOI: 10.1145/3546189.3549918. Online publication date: 6-Sep-2022
    • (2022) TEA-SEA. Expert Systems with Applications: An International Journal 191:C. DOI: 10.1016/j.eswa.2021.116152. Online publication date: 1-Apr-2022
    • (2021) PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning. 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (17-29). DOI: 10.1109/PACT52795.2021.00009. Online publication date: Sep-2021
    • (2019) An Autotuning Framework for Scalable Execution of Tiled Code via Iterative Polyhedral Compilation. ACM Transactions on Architecture and Code Optimization 15:4 (1-23). DOI: 10.1145/3293449. Online publication date: 8-Jan-2019
    • (2018) Speeding up Iterative Polyhedral Schedule Optimization with Surrogate Performance Models. ACM Transactions on Architecture and Code Optimization 15:4 (1-27). DOI: 10.1145/3291773. Online publication date: 19-Dec-2018
    • (2018) Polyhedral Search Space Exploration in the ExaStencils Code Generator. ACM Transactions on Architecture and Code Optimization 15:4 (1-25). DOI: 10.1145/3274653. Online publication date: 10-Oct-2018
