DOI: 10.1145/1878921.1878951
research-article

Practical aggregation of semantical program properties for machine learning based optimization

Published: 24 October 2010

Abstract

Iterative search combined with machine learning is a promising approach to designing optimizing compilers that harness the complexity of modern computing systems. While traversing a program optimization space, we collect characteristic feature vectors of the program and use them to discover correlations across programs, target architectures, data sets, and performance. Predictive models can be derived from such correlations, effectively hiding the time-consuming feedback-directed optimization process from the application programmer.
One key task of this approach, naturally assigned to compiler experts, is to design relevant features and implement scalable feature extractors, including statistical models that filter the most relevant information from millions of lines of code. This new task turns out to be very challenging and tedious from a compiler construction perspective. So far, only a limited set of ad hoc, largely syntactical features has been devised. Yet machine learning can only discover correlations in the information it is fed: for this approach to succeed, it is critical to select topical program features for a given optimization problem.
We propose a general method for systematically generating numerical features from a program. This method puts no restrictions on how to logically and algebraically aggregate semantical properties into numerical features. We illustrate our method on the difficult problem of selecting the best possible combination of 88 available optimizations in GCC. We achieve 74% of the potential speedup obtained through iterative compilation on a wide range of benchmarks and four different general-purpose and embedded architectures. Our work is particularly relevant to embedded system designers willing to quickly adapt the optimization heuristics of a mainstream compiler to their custom ISA, microarchitecture, benchmark suite and workload. Our method has been integrated with the publicly released MILEPOST GCC [14].
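As a rough illustration of what aggregating semantical properties into numerical features can mean, the sketch below stores a few program properties as relations (sets of tuples) and derives numbers from them by counting, joining, and averaging. It is a minimal sketch under stated assumptions, not the extractor shipped with MILEPOST GCC: the relation names (`edges`, `stmts`), the example facts, and the chosen features are all hypothetical.

```python
from collections import Counter

# Hypothetical facts a compiler pass might export for one function.
# Relation names and contents are invented for illustration only.
edges = {("b0", "b1"), ("b1", "b2"), ("b1", "b3"), ("b2", "b1"), ("b3", "b4")}
stmts = {
    ("b0", "assign"), ("b1", "cond"), ("b2", "assign"),
    ("b2", "call"), ("b3", "assign"), ("b4", "return"),
}


def all_blocks(edges, stmts):
    """Every basic block mentioned in either relation."""
    return {b for edge in edges for b in edge} | {b for b, _ in stmts}


def extract_features(edges, stmts):
    """Aggregate relational program properties into numeric features."""
    blocks = all_blocks(edges, stmts)
    # Derived relation: number of outgoing CFG edges per block.
    succ = Counter(src for src, _ in edges)
    # Logical aggregation: count the blocks that satisfy a predicate.
    single_succ = sum(1 for b in blocks if succ[b] == 1)
    two_succ = sum(1 for b in blocks if succ[b] == 2)
    # Join of two relations: blocks containing a call that also have a successor.
    call_blocks = {b for b, kind in stmts if kind == "call"}
    calls_with_succ = sum(1 for b in call_blocks if succ[b] >= 1)
    # Algebraic aggregation: average number of statements per block.
    avg_stmts = len(stmts) / len(blocks)
    return {
        "n_blocks": len(blocks),
        "n_single_succ": single_succ,
        "n_two_succ": two_succ,
        "n_call_blocks_with_succ": calls_with_succ,
        "avg_stmts_per_block": avg_stmts,
    }


features = extract_features(edges, stmts)
print(features)
```

The resulting dictionary plays the role of one program's feature vector; a learner would be trained on many such vectors paired with the optimization settings that performed best for each program.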

References

[1]
ACOVEA: Using Natural Selection to Investigate Software Complexities. http://www.coyotegulch.com/products/acovea.
[2]
F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M.F.P. O'Boyle, J. Thomson, M. Toussaint, and C.K.I. Williams. Using machine learning to focus iterative optimization. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), 2006.
[3]
A.V. Aho, M.S. Lam, R. Sethi, and J.D. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley, 2nd edition, 2007.
[4]
F. Bodin, T. Kisuki, P.M.W. Knijnenburg, M.F.P. O'Boyle, and E. Rohou. Iterative compilation in a non-linear optimisation space. In Proceedings of the Workshop on Profile and Feedback Directed Compilation, 1998.
[5]
Edwin V. Bonilla, Christopher K. I. Williams, Felix V. Agakov, John Cavazos, John Thomson, and Michael F. P. O'Boyle. Predictive search distributions. In William W. Cohen and Andrew Moore, editors, Proceedings of the 23rd International Conference on Machine learning, pages 121--128, New York, NY, USA, 2006. ACM.
[6]
J. Cavazos, G. Fursin, F. Agakov, E. Bonilla, M. O'Boyle, and O. Temam. Rapidly selecting good compiler optimizations using performance counters. In Proceedings of the 5th Annual International Symposium on Code Generation and Optimization (CGO), March 2007.
[7]
K. Cooper, A. Grosul, T. Harvey, S. Reeves, D. Subramanian, L. Torczon, and T. Waterman. ACME: adaptive compilation made efficient. In Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2005.
[8]
K.D. Cooper, P.J. Schielke, and D. Subramanian. Optimizing for reduced code space using genetic algorithms. In Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), pages 1--9, 1999.
[9]
Collective Tuning Infrastructure: automating and accelerating development and optimization of computing systems. http://cTuning.org.
[10]
ESTO: Expert System for Tuning Optimizations. http://www.haifa.ibm.com/projects/systems/cot/esto.
[11]
Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson, Phil Barnard, Elton Ashton, Eric Courtois, Francois Bodin, Edwin Bonilla, John Thomson, Hugh Leather, Chris Williams, and Michael O'Boyle. Milepost gcc: machine learning based research compiler. In Proceedings of the GCC Developers' Summit, June 2008.
[12]
Grigori Fursin and Olivier Temam. Collective optimization. In Proceedings of the International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2009), January 2009.
[13]
GCC: GNU Compiler Collection. http://gcc.gnu.org.
[14]
MILEPOST GCC: Collaborative development website. http://cTuning.org/milepost-gcc.
[15]
Matthew R. Guthaus, Jeffrey S. Ringenberg, Dan Ernst, Todd M. Austin, Trevor Mudge, and Richard B. Brown. Mibench: A free, commercially representative embedded benchmark suite. In Proceedings of the IEEE 4th Annual Workshop on Workload Characterization, Austin, TX, December 2001.
[16]
K. Heydemann and F. Bodin. Iterative compilation for two antagonistic criteria: Application to code size and performance. In Proceedings of the 4th Workshop on Optimizations for DSP and Embedded Systems, colocated with CGO, 2006.
[17]
K. Hoste and L. Eeckhout. Cole: Compiler optimization level exploration. In Proceedings of International Symposium on Code Generation and Optimization (CGO), 2008.
[18]
Shih-Hao Hung, Chia-Heng Tu, Huang-Sen Lin, and Chi-Meng Chen. An automatic compiler optimizations selection framework for embedded applications. In Intl. Conf. on Embedded Software and Systems (ICESS'09), pages 381--387, 2009.
[19]
Raj Jain. The Art of Computer Systems Performance Analysis. John Wiley and Sons, 1991.
[20]
P. Kulkarni, W. Zhao, H. Moon, K. Cho, D. Whalley, J. Davidson, M. Bailey, Y. Paek, and K. Gallivan. Finding effective optimization phase sequences. In Proc. Languages, Compilers, and Tools for Embedded Systems (LCTES), pages 12--23, 2003.
[21]
L. Dehaspe and H. Toivonen. Discovery of frequent datalog patterns. In Data Mining and Knowledge Discovery, pages 7--36, 1999.
[22]
H. Leather, E. Yom-Tov, M. Namolaru, and A. Freund. Automatic feature generation for setting compilers heuristics. In 2nd Workshop on Statistical and Machine Learning Approaches Applied to Architectures and Compilation (SMART'08), colocated with HiPEAC'08 conference, 2008.
[23]
Hugh Leather, Edwin Bonilla, and Michael O'Boyle. Automatic feature generation for machine learning based optimizing compilation. In Proceedings of the International Symposium on Code Generation and Optimization, pages 81--91, Washington, DC, USA, 2009. IEEE Computer Society.
[24]
S. MacLane. Categories for the Working Mathematician, volume 5 of Graduate Texts in Mathematics. Springer Verlag, Berlin, 1971.
[25]
M. Frigo and S. Johnson. FFTW: An adaptive software architecture for the FFT. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 3, pages 1381--1384, Seattle, WA, May 1998.
[26]
A. Monsifrot, F. Bodin, and R. Quiniou. A machine learning approach to automatic production of compiler heuristics. In Proceedings of the International Conference on Artificial Intelligence: Methodology, Systems, Applications, LNCS 2443, pages 41--50, 2002.
[27]
S.S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997.
[28]
Z. Pan and R. Eigenmann. Fast and effective orchestration of compiler optimizations for automatic performance tuning. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 319--332, 2006.
[29]
David Parello, Olivier Temam, Albert Cohen, and Jean-Marie Verdun. Towards a systematic, pragmatic and architecture-aware program optimization process for complex processors. In ACM/IEEE Conf. on Supercomputing (SC'04), page 15, Washington, DC, 2004.
[30]
B. Singer and M. Veloso. Learning to predict performance from formula modeling and training data. In Proceedings of the Conference on Machine Learning, 2000.
[31]
M. Stephenson and S. Amarasinghe. Predicting unroll factors using supervised classification. In Proceedings of International Symposium on Code Generation and Optimization (CGO), pages 123--134, 2005.
[32]
S. Triantafyllis, M. Vachharajani, N. Vachharajani, and D.I. August. Compiler optimization-space exploration. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 204--215, 2003.
[33]
J. D. Ullman. Principles of Database and Knowledge Systems, volume 1. Computer Science Press, 1988.
[34]
J. Whaley and M.S. Lam. Cloning based context sensitive pointer alias analysis using binary decision diagrams. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2004.
[35]
R. Whaley and J. Dongarra. Automatically tuned linear algebra software. In Proceedings of the Conference on High Performance Networking
[36]
D. Whitfield and M. L. Soffa. An approach to ordering optimizing transformations. In ACM Symp. on Principles & practice of parallel programming (PPoPP'90), pages 137--146, Seattle, Washington, United States, 1990.

    Published In

    CASES '10: Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
    October 2010
    276 pages
    ISBN: 9781605589039
    DOI: 10.1145/1878921

    Sponsors

    In-Cooperation

    • CEDA
    • IEEE CAS
    • IEEE CS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. compilation
    2. machine learning

    Qualifiers

    • Research-article

    Conference

    ESWeek '10: Sixth Embedded Systems Week
    October 24 - 29, 2010
    Scottsdale, Arizona, USA

    Acceptance Rates

    Overall Acceptance Rate 52 of 230 submissions, 23%

    Cited By

    • (2023) A Game-Based Framework to Compare Program Classifiers and Evaders. Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization, pages 108-121. DOI: 10.1145/3579990.3580012. Online publication date: 17-Feb-2023
    • (2022) Impacto de Ofuscadores e Otimizadores de Código na Acurácia de Classificadores de Programas. Proceedings of the XXVI Brazilian Symposium on Programming Languages, pages 68-75. DOI: 10.1145/3561320.3561322. Online publication date: 6-Oct-2022
    • (2022) Fast selection of compiler optimizations using performance prediction with graph neural networks. Concurrency and Computation: Practice and Experience, 35:17. DOI: 10.1002/cpe.6869. Online publication date: 16-Mar-2022
    • (2021) VESPA: static profiling for binary optimization. Proceedings of the ACM on Programming Languages, 5:OOPSLA, pages 1-28. DOI: 10.1145/3485521. Online publication date: 15-Oct-2021
    • (2021) Exploring the space of optimization sequences for code-size reduction: insights and tools. Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction, pages 47-58. DOI: 10.1145/3446804.3446849. Online publication date: 2-Mar-2021
    • (2021) AnghaBench. Proceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization, pages 378-390. DOI: 10.1109/CGO51591.2021.9370322. Online publication date: 27-Feb-2021
    • (2021) Memory Utilization and Machine Learning Techniques for Compiler Optimization. ITM Web of Conferences, 37 (01021). DOI: 10.1051/itmconf/20213701021. Online publication date: 17-Mar-2021
    • (2020) YACOS. Proceedings of the 24th Brazilian Symposium on Context-Oriented Programming and Advanced Modularity, pages 56-63. DOI: 10.1145/3427081.3427089. Online publication date: 19-Oct-2020
    • (2020) Type Inference for C. ACM Transactions on Programming Languages and Systems, 42:3, pages 1-71. DOI: 10.1145/3421472. Online publication date: 13-Nov-2020
    • (2020) Deep Program Structure Modeling Through Multi-Relational Graph-based Learning. Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, pages 111-123. DOI: 10.1145/3410463.3414670. Online publication date: 30-Sep-2020
