Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article
Open access

COBAYN: Compiler Autotuning Framework Using Bayesian Networks

Published: 14 June 2016 Publication History
  • Get Citation Alerts
  • Abstract

    The variety of today’s architectures forces programmers to spend a great deal of time porting and tuning application codes across different platforms. Compilers themselves need additional tuning, which has considerable complexity as the standard optimization levels, usually designed for the average case and the specific target architecture, often fail to bring the best results.
    This article proposes COBAYN: Compiler autotuning framework using BAYesian Networks, an approach for a compiler autotuning methodology using machine learning to speed up application performance and to reduce the cost of the compiler optimization phases. The proposed framework is based on the application characterization done dynamically by using independent microarchitecture features and Bayesian networks. The article also presents an evaluation based on using static analysis and hybrid feature collection approaches. In addition, the article compares Bayesian networks with respect to several state-of-the-art machine-learning models.
    Experiments were carried out on an ARM embedded platform and GCC compiler by considering two benchmark suites with 39 applications. The set of compiler configurations, selected by the model (less than 7% of the search space), demonstrated an application performance speedup of up to 4.6 × on Polybench (1.85 × on average) and 3.1 × on cBench (1.54 × on average) with respect to standard optimization levels. Moreover, the comparison of the proposed technique with (i) random iterative compilation, (ii) machine learning--based iterative compilation, and (iii) noniterative predictive modeling techniques shows, on average, 1.2 ×, 1.37 ×, and 1.48 × speedup, respectively. Finally, the proposed method demonstrates 4 × and 3 × speedup, respectively, on cBench and Polybench in terms of exploration efficiency given the same quality of the solutions generated by the random iterative compilation model.

    References

    [1]
    Felix Agakov, Edwin Bonilla, John Cavazos, Björn Franke, Grigori Fursin, Michael FP O’Boyle, John Thomson, Marc Toussaint, and Christopher KI Williams. 2006. Using machine learning to focus iterative optimization. In Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, 295--305.
    [2]
    Jason Ansel, Cy Chan, Yee Lok Wong, Marek Olszewski, Qin Zhao, Alan Edelman, and Saman Amarasinghe. 2009. PetaBricks: A language and compiler for algorithmic choice. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’09). ACM, New York, NY, 38--49.
    [3]
    Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O’Reilly, and Saman Amarasinghe. 2014. Opentuner: An extensible framework for program autotuning. In Proceedings of the 23rd International Conference on Parallel Architectures and Compilation. ACM, 303--316.
    [4]
    Amir Hossein Ashouri, Andrea Bignoli, Gianluca Palermo, and Cristina Silvano. 2016. Predictive modeling methodology for compiler phase-ordering. In Proceedings of 7th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and 5th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms. ACM.
    [5]
    Amir Hossein Ashouri, Giovanni Mariani, Gianluca Palermo, and Cristina Silvano. 2014. A Bayesian network approach for compiler auto-tuning for embedded processors. In IEEE 12th Symposium on Embedded Systems for Real-time Multimedia (ESTIMedia’14). IEEE, 90--97.
    [6]
    Amir Hossein Ashouri, Vittorio Zaccaria, Sotirios Xydis, Gianluca Palermo, and Cristina Silvano. 2013. A framework for compiler level statistical analysis over customized VLIW architecture. In 2013 IFIP/IEEE 21st International Conference on Very Large Scale Integration (VLSI-SoC’13). IEEE, 124--129.
    [7]
    Rajendra Bhatia. 2009. Positive Definite Matrices. Princeton University Press, Princeton, NJ.
    [8]
    François Bodin, Toru Kisuki, Peter Knijnenburg, Mike O’Boyle, and Erven Rohou. 1998. Iterative compilation in a non-linear optimisation space. In Workshop on Profile and Feedback-Directed Compilation.
    [9]
    Karsten M. Borgwardt and Hans-Peter Kriegel. 2005. In Data Mining, 5th IEEE International Conference on Shortest-path Kernels on Graphs. IEEE, 8--pp.
    [10]
    John Cavazos, Grigori Fursin, Felix Agakov, Edwin Bonilla, Michael F. P. O’Boyle, and Olivier Temam. 2007. Rapidly selecting good compiler optimizations using performance counters. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’07). IEEE Computer Society, Washington, DC, 185--197.
    [11]
    Yang Chen, Shuangde Fang, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Olivier Temam, and Chengyong Wu. 2012. Deconstructing iterative optimization. ACM Transactions on Architecture and Code Optimization 9, 3, 21.
    [12]
    Keith D. Cooper, Philip J. Schielke, and Devika Subramanian. 1999. Optimizing for reduced code space using genetic algorithms. In ACM SIGPLAN Notices, Vol. 34. ACM, 1--9.
    [13]
    Yufei Ding, Jason Ansel, Kalyan Veeramachaneni, Xipeng Shen, Una-May OReilly, and Saman Amarasinghe. 2015. Autotuning algorithmic choice for input sensitivity. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 379--390.
    [14]
    Shuangde Fang, Wenwen Xu, Yang Chen, Lieven Eeckhout, Olivier Temam, Yunji Chen, Chengyong Wu, and Xiaobing Feng. 2015. Practical iterative optimization for the data center. ACM Transactions on Architecture and Code Optimization 12, 2, 15.
    [15]
    Grigori Fursin. 2010. Collective benchmark (cbench), a collection of open-source programs with multiple datasets assembled by the community to enable realistic benchmarking and research on program and architecture optimization.
    [16]
    Grigori Fursin, Yuriy Kashnikov, Abdul Wahid Memon, Zbigniew Chamski, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Bilha Mendelson, Ayal Zaks, Eric Courtois, and others. 2011. Milepost gcc: Machine learning enabled self-tuning compiler. International Journal of Parallel Programming 39, 3, 296--327.
    [17]
    Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson, Edwin Bonilla, John Thomson, Hugh Leather, and others. 2008. MILEPOST GCC: Machine learning based research compiler. In GCC Summit.
    [18]
    Richard L. Gorsuch. 1988. Exploratory factor analysis. In Handbook of Multivariate Experimental Psychology, John R. Nesselroade and Raymond B. Cattell (Eds.). Springer, 231--258.
    [19]
    Scott Grauer-Gray, Lifan Xu, Robert Searles, Sudhee Ayalasomayajula, and John Cavazos. 2012. Auto-tuning a high-level language targeted to GPU codes. In Innovative Parallel Computing (InPar’12). IEEE, 1--10.
    [20]
    Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. 2009. The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter 11, 1, 10--18.
    [21]
    David Heckerman and David M. Chickering. 1995. Learning Bayesian networks: The combination of knowledge and statistical data. In Machine Learning. 20--197.
    [22]
    Torsten Hoefler and Roberto Belli. 2015. Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 73.
    [23]
    Kenneth Hoste and Lieven Eeckhout. 2007. Microarchitecture-independent workload characterization. IEEE Micro 27, 3, 63--72.
    [24]
    Kenneth Hoste and Lieven Eeckhout. 2008. Cole: Compiler optimization level exploration. In Proceedings of the 6th Annual IEEE/ACM International Symposium on Code Generation and Optimization. ACM, 165--174.
    [25]
    Zhanpeng Jin and A. C. Cheng. 2008. Improve simulation efficiency using statistical benchmark subsetting - An implantbench case study. In 45th ACM/IEEE Design Automation Conference (DAC 2008). 970--973.
    [26]
    Henry F. Kaiser. 1958. The varimax criterion for analytic rotation in factor analysis. Psychometrika 23, 3, 187--200.
    [27]
    Christos Kartsaklis, Oscar Hernandez, Chung-Hsing Hsu, Thomas Ilsche, Wayne Joubert, and Richard L. Graham. 2012. HERCULES: A pattern driven code transformation system. In 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW’’12). IEEE, 574--583.
    [28]
    Toru Kisuki, Peter M. W. Knijnenburg, Mike F. P. O’Boyle, François Bodin, and Harry A. G. Wijshoff. 1999. A feasibility study in iterative compilation. In High Performance Computing, Constantine Polychronopoulos and Kazuki Joe Akira Fukuda (Eds.). Springer, 121--132.
    [29]
    Toru Kisuki, Peter M. W. Knijnenburg, and Michael F. P. O’Boyle. 2000. Combined selection of tile sizes and unroll factors using iterative compilation. In Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (PACT’00), Philadelphia, PA, October 15--19, 2000. 237--248.
    [30]
    Prasad A. Kulkarni, David B. Whalley, Gary S. Tyson, and Jack W. Davidson. 2009. Practical exhaustive optimization phase order exploration and evaluation. ACM Transactions on Architecture and Code Optimization 6, 1, Article 1, 36 pages.
    [31]
    Sameer Kulkarni and John Cavazos. 2012. Mitigating the compiler optimization phase-ordering problem using machine learning. ACM SIGPLAN Notices 47, 10, 147--162.
    [32]
    Hugh Leather, Edwin Bonilla, and Michael O’Boyle. 2009. Automatic feature generation for machine learning based optimizing compilation. In International Symposium on Code Generation and Optimization (CGO’09). IEEE, 81--91.
    [33]
    Junghsi Lee and V. John Mathews. 1994. A stability condition for certain bilinear systems. IEEE Transactions on Signal Processing 42, 7, 1871--1873.
    [34]
    Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. 2005. Pin: Building customized program analysis tools with dynamic instrumentation. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’05). ACM, New York, NY, 190--200.
    [35]
    Luiz G. A. Martins, Ricardo Nobre, João M. P. Cardoso, Alexandre C. B. Delbem, and Eduardo Marques. 2016. Clustering-based selection for the exploration of compiler optimization sequences. ACM Transactions on Architecture and Code Optimization 13, 1, 8.
    [36]
    Sanyam Mehta and Pen-Chung Yew. 2015. Improving compiler scalability: Optimizing large programs at small price. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 143--152.
    [37]
    Kevin P. Murphy. 2001. The Bayes net toolbox for MATLAB. Computing Science and Statistics 33, 2001.
    [38]
    Gianluca Palermo, Cristina Silvano, and Vittorio Zaccaria. 2005. Multi-objective design space exploration of embedded systems. Journal of Embedded Computing 1, 3, 305--316.
    [39]
    EunJung Park, John Cavazos, and Marco A. Alvarez. 2012. Using graph-based program characterization for predictive modeling. In Proceedings of the 10th International Symposium on Code Generation and Optimization. ACM, 196--206.
    [40]
    Eunjung Park, John Cavazos, Louis-Noël Pouchet, Cédric Bastoul, Albert Cohen, and P. Sadayappan. 2013. Predictive modeling in a polyhedral optimization space. International Journal of Parallel Programming 41, 5, 704--750.
    [41]
    EunJung Park, Christos Kartsaklis, and John Cavazos. 2014. HERCULES: Strong patterns towards more intelligent predictive modeling. In 43rd International Conference on Parallel Processing (ICPP’14). IEEE, 172--181.
    [42]
    Louis-Noël Pouchet. 2012. Polybench: The polyhedral benchmark suite. http://www.cs.ucla.edu/∼pouchet/software/polybench/.
    [43]
    Suresh Purini and Lakshya Jain. 2013. Finding good optimization sequences covering program space. ACM Transactions on Architecture and Code Optimization (TACO) 9, 4 (2013), 56.
    [44]
    R Foundation. 2012. The R Project for Statistical Computing. (2012). http://www.r-project.org/.
    [45]
    Roberto Santana, Concha Bielza, Pedro Larranaga, Jose A. Lozano, Carlos Echegoyen, Alexander Mendiburu, Rubén Armananzas, and Siddartha Shakya. 2010. Mateda-2.0: Estimation of distribution algorithms in MATLAB. Journal of Statistical Software 35, 7, 1--30.
    [46]
    Eric Schkufza, Rahul Sharma, and Alex Aiken. 2014. Stochastic optimization of floating-point programs with tunable precision. ACM SIGPLAN Notices 49, 6, 53--64.
    [47]
    Mark Stephenson, Saman Amarasinghe, Martin Martin, and Una-May O’Reilly. 2003. Meta optimization: Improving compiler heuristics with machine learning. In ACM SIGPLAN Notices, Vol. 38. ACM, 77--90.
    [48]
    Mirai Tanaka and Kazuhide Nakata. 2014. Positive definite matrix approximation with condition number constraint. Optimization Letters 8, 3, 939--947.
    [49]
    Texas Instruments. 2012. PandaBoard. OMAP4430 SoC Dev. Board, Revision A 2.
    [50]
    Wai Teng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huyng, Xibai Li, and Rick Siow Mong Goh. 2015. Optimizing and auto-tuning scale-free sparse matrix-vector multiplication on Intel Xeon Phi. In IEEE/ACM International Symposium on Code Generation and Optimization (CGO’15). IEEE, 136--145.
    [51]
    Bruce Thompson. 2002. Statistical, practical, and clinical: How many kinds of significance do counselors need to consider? Journal of Counseling & Development 80, 1, 64--71.

    Cited By

    View all
    • (2024)The Droplet Search Algorithm for Kernel SchedulingACM Transactions on Architecture and Code Optimization10.1145/365010921:2(1-28)Online publication date: 21-May-2024
    • (2024)Exponentially Expanding the Phase-Ordering Search Space via Dormant InformationProceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction10.1145/3640537.3641582(250-261)Online publication date: 17-Feb-2024
    • (2024)Compiler Autotuning through Multiple-phase LearningACM Transactions on Software Engineering and Methodology10.1145/364033033:4(1-38)Online publication date: 11-Jan-2024
    • Show More Cited By

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Architecture and Code Optimization
    ACM Transactions on Architecture and Code Optimization  Volume 13, Issue 2
    June 2016
    200 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/2952301
    Issue’s Table of Contents
    © 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 June 2016
    Accepted: 01 April 2016
    Revised: 01 March 2016
    Received: 01 November 2015
    Published in TACO Volume 13, Issue 2

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Bayesian networks
    2. design space exploration
    3. statistical inference

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)878
    • Downloads (Last 6 weeks)28
    Reflects downloads up to 26 Jul 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)The Droplet Search Algorithm for Kernel SchedulingACM Transactions on Architecture and Code Optimization10.1145/365010921:2(1-28)Online publication date: 21-May-2024
    • (2024)Exponentially Expanding the Phase-Ordering Search Space via Dormant InformationProceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction10.1145/3640537.3641582(250-261)Online publication date: 17-Feb-2024
    • (2024)Compiler Autotuning through Multiple-phase LearningACM Transactions on Software Engineering and Methodology10.1145/364033033:4(1-38)Online publication date: 11-Jan-2024
    • (2024)JavaScript Performance Tuning as a Crowdsourced ServiceIEEE Transactions on Mobile Computing10.1109/TMC.2023.3316167(1-16)Online publication date: 2024
    • (2024)A Two-Stage LLVM Option Sequence Optimization Method to Minimize Energy ConsumptionSwarm and Evolutionary Computation10.1016/j.swevo.2024.10159188(101591)Online publication date: Jul-2024
    • (2024)Exploring compiler optimization space for control flow obfuscationComputers & Security10.1016/j.cose.2024.103704139(103704)Online publication date: May-2024
    • (2024)Tsoa: a two-stage optimization approach for GCC compilation options to minimize execution timeAutomated Software Engineering10.1007/s10515-024-00437-w31:2Online publication date: 1-Nov-2024
    • (2023)Constructing an AI Compiler for ARM Cortex-M DevicesComputer Systems Science and Engineering10.32604/csse.2023.03467246:1(999-1019)Online publication date: 2023
    • (2023)Compiler Test-Program Generation via Memoized Configuration SearchProceedings of the 45th International Conference on Software Engineering10.1109/ICSE48619.2023.00172(2035-2047)Online publication date: 14-May-2023
    • (2022)Compiler Optimization Parameter Selection Method Based on Ensemble LearningElectronics10.3390/electronics1115245211:15(2452)Online publication date: 6-Aug-2022
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Get Access

    Login options

    Full Access

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media