A graph-based iterative compiler pass selection and phase ordering approach

Authors:

Luiz G. A. Martins,

João M. P. Cardoso

LCTES 2016: Proceedings of the 17th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, Tools, and Theory for Embedded Systems

Pages 21 - 30

https://doi.org/10.1145/2907950.2907959

Published: 13 June 2016

Abstract

Modern compilers include tens or hundreds of optimization passes, which makes it difficult to find optimization sequences that produce better code than typical compiler options such as -O2 and -O3. The problem involves both selecting which compiler passes to use and ordering them in the compilation pipeline. Custom phase orders for each function can yield significant improvements, which matters when meeting strict requirements such as those of high-performance embedded computing systems. In this paper we present a new, fast iterative approach to phase selection and ordering that produces compiled code with higher performance than the standard optimization levels of the LLVM compiler. The performance improvements are comparable with those achieved by other iterative approaches while requiring considerably less time and fewer resources. Our approach is based on sampling over a graph representing transitions between compiler passes. We performed experiments targeting the LEON3 microarchitecture using the Clang/LLVM 3.7 compiler, considering 140 LLVM passes and a set of 42 representative signal and image processing C functions. An exhaustive cross-validation shows that our new exploration method achieves a geometric mean performance speedup of 1.28x over the best individually selected -OX flag when considering 100,000 iterations, versus geometric mean speedups from 1.16x to 1.25x obtained with state-of-the-art iterative methods that do not use the graph. Of the exploration methods tested, ours is the only one that consistently finds compiler sequences resulting in performance improvements when considering 100 or fewer exploration iterations. Specifically, it achieved geometric mean speedups of 1.08x and 1.16x for 10 and 100 iterations, respectively.
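
The abstract does not spell out how the graph is built or sampled, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that edge weights count how often one pass directly follows another in a set of previously found good sequences, and that new candidates are drawn by a weighted random walk. The names `build_transition_graph`, `sample_sequence`, `explore`, and `toy_cost` are hypothetical, and `toy_cost` is a stand-in for compiling a function and measuring it on a simulator, not a real measurement.

```python
import random
from collections import defaultdict

START, END = "<start>", "<end>"  # hypothetical sentinel nodes

def build_transition_graph(sequences):
    """Count how often pass b directly follows pass a (assumed weighting)."""
    graph = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        path = [START, *seq, END]
        for a, b in zip(path, path[1:]):
            graph[a][b] += 1
    return graph

def sample_sequence(graph, rng, max_len=16):
    """Random walk from START; successors are picked in proportion to edge weight."""
    seq, node = [], START
    while len(seq) < max_len:
        successors = list(graph[node])
        weights = [graph[node][s] for s in successors]
        node = rng.choices(successors, weights=weights, k=1)[0]
        if node == END:
            break
        seq.append(node)
    return seq

def explore(graph, evaluate, iterations, seed=0):
    """Sample `iterations` sequences and keep the one with the lowest cost."""
    rng = random.Random(seed)
    best_seq, best_cost = None, float("inf")
    for _ in range(iterations):
        seq = sample_sequence(graph, rng)
        cost = evaluate(seq)
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost

def toy_cost(seq):
    """Hypothetical cost model standing in for a measured cycle count."""
    cost = 100.0 + len(seq)  # pretend each extra pass has a small price
    if "-mem2reg" in seq:
        cost -= 20.0
        if "-licm" in seq and seq.index("-mem2reg") < seq.index("-licm"):
            cost -= 10.0  # pretend -licm pays off only after -mem2reg
    return cost

# Illustrative input sequences (made up for this sketch, not from the paper).
known_good = [
    ["-mem2reg", "-instcombine", "-licm", "-gvn"],
    ["-mem2reg", "-simplifycfg", "-instcombine"],
    ["-simplifycfg", "-mem2reg", "-licm", "-loop-unroll"],
]
graph = build_transition_graph(known_good)
best_seq, best_cost = explore(graph, toy_cost, iterations=100)
print(best_seq, best_cost)
```

Because every sampled sequence follows edges already present in the graph, the search is confined to pass transitions seen before. That is one plausible reason a graph-guided sampler could find improvements within few iterations, although the paper's actual construction may differ.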

Index Terms

A graph-based iterative compiler pass selection and phase ordering approach
1. Software and its engineering
  1. Software notations and tools
    1. Compilers

Published In

LCTES 2016: Proceedings of the 17th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, Tools, and Theory for Embedded Systems

June 2016

122 pages

ISBN:9781450343169

DOI:10.1145/2907950

General Chair:
Tei-Wei Kuo,
Program Chair:
David B. Whalley

ACM SIGPLAN Notices Volume 51, Issue 5
LCTES '16
May 2016
122 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2980930
Editor:
Andy Gill
University of Kansas, Lawrence, KS

Copyright © 2016 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Fundação para a Ciência e a Tecnologia
NORTE 2020

Conference

LCTES'16: SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems 2016

June 13 - 14, 2016

Santa Barbara, CA, USA

Acceptance Rates

Overall Acceptance Rate 116 of 438 submissions, 26%
