DOI: 10.1145/3446804.3446849
research-article

Exploring the space of optimization sequences for code-size reduction: insights and tools

Published: 27 February 2021

Abstract

The optimization space of a compiler is the set of all sequences of optimizations that the compiler can apply. For decades, exploration of the optimization space of mainstream compilers has been hampered by a lack of benchmarks. However, recent efforts from different research groups have made available a large quantity of compilable code that can now be used to overcome this problem. In this paper, we use 15,000 programs from a public collection to explore the optimization space of LLVM, focusing on code-size reduction. This exploration reveals that the probability of beating the default optimization levels of LLVM with random sequences ranges from 10% (considering opt -Oz) to 19% (considering clang -Os). Yet, the distribution of probabilities is uneven across programs: the default levels work well for most programs, and poorly for a few. Based on these observations, we introduce the notion of an Optimization Cache, a table that maps programs to optimization sequences and that can be used to support predictive compilation. We then use an optimization cache to build what we call a Default Covering Set: a small ensemble of optimization sequences that, once combined, tend to be good for any program. Optimization caches and default covering sets are used independently. The former, when applied to MiBench, yield programs that are 11.9% smaller than programs produced by opt -Os, on average. The latter produce programs 12.5% smaller.
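
The two notions named in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the table `sizes`, the program names, and the sequence names `seqA` and `seqB` are hypothetical, and the greedy selection is an assumption chosen for brevity. The cache maps each program to the sequence that gave it the smallest code; the covering set is a short list of sequences picked so that, when every program takes the best sequence from the list, the total size is as low as the greedy choice can make it.

```python
from typing import Dict, List

# Hypothetical measurements: code size in bytes of each program
# under each optimization sequence. All names here are illustrative.
Sizes = Dict[str, Dict[str, int]]

sizes: Sizes = {
    "a.c": {"-Oz": 100, "seqA": 90, "seqB": 95},
    "b.c": {"-Oz": 200, "seqA": 210, "seqB": 180},
    "c.c": {"-Oz": 50, "seqA": 45, "seqB": 60},
}


def build_cache(sizes: Sizes) -> Dict[str, str]:
    """Optimization cache: map each program to its smallest-size sequence."""
    return {prog: min(by_seq, key=by_seq.get) for prog, by_seq in sizes.items()}


def total_size(sizes: Sizes, chosen: List[str]) -> int:
    """Total size when each program uses its best sequence among `chosen`."""
    return sum(min(by_seq[s] for s in chosen) for by_seq in sizes.values())


def covering_set(sizes: Sizes, k: int) -> List[str]:
    """Greedily pick up to k cached sequences that minimise the total size.

    Candidates are the sequences that are best for at least one program,
    i.e. the values stored in the cache (an assumption of this sketch).
    """
    candidates = sorted(set(build_cache(sizes).values()))
    chosen: List[str] = []
    while candidates and len(chosen) < k:
        best = min(candidates, key=lambda s: total_size(sizes, chosen + [s]))
        chosen.append(best)
        candidates.remove(best)
    return chosen


if __name__ == "__main__":
    print(build_cache(sizes))
    print(covering_set(sizes, 2))
    print(total_size(sizes, ["-Oz"]), total_size(sizes, covering_set(sizes, 2)))
```

On this toy table the greedy choice for k = 2 is `seqB` followed by `seqA`, for a total of 315 bytes against 350 bytes for `-Oz` alone. A real covering set would be derived from measured sizes over many programs, and a greedy pick is not guaranteed to be optimal.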

Supplementary Material

Auxiliary Archive (cc21main-p49-p-archive.zip)
This is the data used to produce the charts and tables in the paper.

    Published In

    CC 2021: Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction
    March 2021
    164 pages
    ISBN:9781450383257
    DOI:10.1145/3446804
    • General Chair: Aaron Smith
    • Program Chairs: Delphine Demange, Rajiv Gupta
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Benchmark
    2. Compiler
    3. Exploration
    4. Optimization

    Qualifiers

    • Research-article

    Conference

    CC '21

    Cited By

    • (2024) The Droplet Search Algorithm for Kernel Scheduling. ACM Transactions on Architecture and Code Optimization 21:2, pp. 1-28. DOI: 10.1145/3650109. Online publication date: 21-May-2024
    • (2024) Binary Folding Compression for Efficient Software Distribution. Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, pp. 169-176. DOI: 10.1145/3605098.3636006. Online publication date: 8-Apr-2024
    • (2024) Machine Learning-Driven GCC Loop Unrolling Optimization: Compiler Performance Enhancement Strategy Based on XGBoost. Journal of Circuits, Systems and Computers. DOI: 10.1142/S0218126625500355. Online publication date: 23-Sep-2024
    • (2024) Revealing Compiler Heuristics through Automated Discovery and Optimization. Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization, pp. 55-66. DOI: 10.1109/CGO57630.2024.10444847. Online publication date: 2-Mar-2024
    • (2023) Shared Dictionary Compression for Efficient Mobile Software Distribution. 2023 IEEE 29th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), pp. 85-94. DOI: 10.1109/RTCSA58653.2023.00019. Online publication date: 30-Aug-2023
    • (2022) Geração Automática de Benchmarks para Compilação Preditiva. Proceedings of the XXVI Brazilian Symposium on Programming Languages, pp. 59-67. DOI: 10.1145/3561320.3561323. Online publication date: 6-Oct-2022
    • (2022) RollBin: reducing code-size via loop rerolling at binary level. Proceedings of the 23rd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems, pp. 99-110. DOI: 10.1145/3519941.3535072. Online publication date: 14-Jun-2022
    • (2022) POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning. 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 121-131. DOI: 10.1109/ISPASS55109.2022.00012. Online publication date: May-2022
