
Autotuning algorithmic choice for input sensitivity

Published: 03 June 2015

Abstract

A daunting challenge faced by program performance autotuning is input sensitivity: the best autotuned configuration may vary across input sets. This paper presents a novel two-level input learning algorithm to tackle the challenge for an important class of autotuning problems: algorithmic autotuning. The new approach uses a two-level input clustering method to automatically refine input grouping, feature selection, and classifier construction. Its design solves a series of open issues that are particularly important to algorithmic autotuning, including the enormous optimization space, the complex influence of deep input features, the high cost of feature extraction, and the variable accuracy of algorithmic choices. Experimental results show that the new solution yields up to a 3x speedup over using a single configuration for all inputs, and a 34x speedup over a traditional one-level method for addressing input sensitivity in program optimizations.
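
The abstract only names the ingredients of the approach (input grouping refined at two levels, then classifier construction), so the following is a minimal Python sketch of one plausible reading, not the paper's algorithm. Everything in it is an assumption made for illustration: the two input features (size and sortedness), the two candidate configurations, the synthetic cost model that stands in for real timed runs, the exhaustive "autotuner", and the 1-nearest-neighbour classifier. Feature selection and feature-extraction cost are left out entirely.

```python
import math
from typing import List, Sequence, Tuple

Features = Tuple[float, ...]
CONFIGS = ["insertion_sort", "merge_sort"]  # hypothetical algorithmic choices


def cost(config: str, f: Features) -> float:
    """Synthetic cost model standing in for a real timed run (assumption)."""
    n, sortedness = f
    if config == "insertion_sort":
        return n * n * (1.0 - sortedness) + n
    return 2.0 * n * math.log2(n + 1) + 50.0


def sq_dist(a: Features, b: Features) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p: Features, candidates: Sequence[Features]) -> int:
    """Index of the candidate closest to p in feature space."""
    return min(range(len(candidates)), key=lambda i: sq_dist(p, candidates[i]))


def kmeans(points: Sequence[Features], k: int, iters: int = 10) -> List[Features]:
    """Lloyd's k-means, seeded with the first k points for determinism."""
    centroids = list(points[:k])
    for _ in range(iters):
        groups: List[List[Features]] = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def autotune(f: Features) -> str:
    """Stand-in for an autotuner: exhaustive search over CONFIGS."""
    return min(CONFIGS, key=lambda c: cost(c, f))


def train(inputs: Sequence[Features], k: int):
    # Level 1: cluster by input features, tune one representative per cluster.
    centroids = kmeans(inputs, k)
    landmarks = [autotune(inputs[nearest(c, inputs)]) for c in centroids]
    # Level 2: regroup every input by the landmark that runs best on it.
    labels = [min(landmarks, key=lambda c: cost(c, f)) for f in inputs]
    return centroids, landmarks, labels


def predict(f: Features, inputs: Sequence[Features], labels: Sequence[str]) -> str:
    """1-nearest-neighbour classifier over the level-2 labels."""
    return labels[nearest(f, inputs)]


def one_level_predict(f: Features, centroids, landmarks) -> str:
    """Baseline: reuse the configuration of the nearest feature cluster."""
    return landmarks[nearest(f, centroids)]


# Made-up training inputs: (size, sortedness in [0, 1]).
TRAIN = [(10.0, 0.1), (1000.0, 0.1), (12.0, 0.5), (8.0, 0.9),
         (1000.0, 0.99), (1200.0, 0.995), (1100.0, 0.2)]
centroids, landmarks, labels = train(TRAIN, k=2)

query = (1000.0, 0.995)  # large but almost sorted
print(one_level_predict(query, centroids, landmarks))  # merge_sort
print(predict(query, TRAIN, labels))                   # insertion_sort
```

In this toy setup, feature-space clustering puts every large input in one group, so the one-level baseline gives the large, almost sorted query the configuration tuned for large random inputs. Relabelling the training inputs by measured cost separates the two cases, and under the synthetic cost model the two-level choice is the cheaper one for this query.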



Published In

ACM SIGPLAN Notices, Volume 50, Issue 6
PLDI '15
June 2015
630 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2813885
Editor: Andy Gill

PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2015
630 pages
ISBN: 9781450334686
DOI: 10.1145/2737924

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 June 2015
Published in SIGPLAN Volume 50, Issue 6

Author Tags

  1. Algorithmic Selection
  2. Autotuning
  3. Input Sensitivity
  4. Variable Accuracy

Qualifiers

  • Research-article

Article Metrics

  • Downloads (last 12 months): 248
  • Downloads (last 6 weeks): 34
Reflects downloads up to 02 Feb 2025

Cited By

  • (2024) Enabling Large Dynamic Neural Network Training with Learning-based Memory Management. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pages 788-802. DOI: 10.1109/HPCA57654.2024.00066. Online publication date: 2-Mar-2024
  • (2023) The politics of digital technologies: Reimagining social participation in the digital age. Proceedings of the 16th International Conference on Theory and Practice of Electronic Governance, pages 221-227. DOI: 10.1145/3614321.3614352. Online publication date: 26-Sep-2023
  • (2023) Analyzing the Impact of Workloads on Modeling the Performance of Configurable Software Systems. Proceedings of the 45th International Conference on Software Engineering, pages 2085-2097. DOI: 10.1109/ICSE48619.2023.00176. Online publication date: 14-May-2023
  • (2022) Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mapping. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41:6, pages 1636-1648. DOI: 10.1109/TCAD.2021.3114617. Online publication date: Jun-2022
  • (2021) HPAC. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-14. DOI: 10.1145/3458817.3476216. Online publication date: 14-Nov-2021
  • (2021) Deep Software Variability: Towards Handling Cross-Layer Configuration. Proceedings of the 15th International Working Conference on Variability Modelling of Software-Intensive Systems, pages 1-8. DOI: 10.1145/3442391.3442402. Online publication date: 9-Feb-2021
  • (2021) On the Anatomy of Predictive Models for Accelerating GPU Convolution Kernels and Beyond. ACM Transactions on Architecture and Code Optimization, 18:1, pages 1-24. DOI: 10.1145/3434402. Online publication date: 7-Jan-2021
  • (2021) FOGA: Flag Optimization with Genetic Algorithm. 2021 International Conference on INnovations in Intelligent SysTems and Applications (INISTA), pages 1-6. DOI: 10.1109/INISTA52262.2021.9548573. Online publication date: 25-Aug-2021
  • (2020) Identifying and (automatically) remedying performance problems in CPU/GPU applications. Proceedings of the 34th ACM International Conference on Supercomputing, pages 1-13. DOI: 10.1145/3392717.3392759. Online publication date: 29-Jun-2020
  • (2020) ApproxDet. Proceedings of the 18th Conference on Embedded Networked Sensor Systems, pages 449-462. DOI: 10.1145/3384419.3431159. Online publication date: 16-Nov-2020