A Survey on Compiler Autotuning using Machine Learning

Authors: Amir H. Ashouri, William Killian, Gianluca Palermo, Cristina Silvano

ACM Computing Surveys (CSUR), Volume 51, Issue 5

Article No.: 96, Pages 1 - 42

https://doi.org/10.1145/3197978

Published: 18 September 2018

Abstract

Since the mid-1990s, researchers have been trying to use machine-learning-based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order in which to apply them). The compiler optimization space continues to grow due to the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for the compiler optimization field, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, the fine-grained classification among different approaches, and, finally, the influential papers of the field.

References

[1]

Bas Aarts, Michel Barreteau, François Bodin, Peter Brinkhaus, Zbigniew Chamski, Henri-Pierre Charles, Christine Eisenbeis, John Gurd, Jan Hoogerbrugge, Ping Hu et al. 1997. OCEANS: Optimizing compilers for embedded applications. In Proceedings of the European Conference on Parallel Processing (Euro-Par’97). 1351--1356.

Digital Library

[2]

Ali-Reza Adl-Tabatabai, Michał Cierniak, Guei-Yuan Lueh, Vishesh M. Parikh, and James M. Stichnoth. 1998. Fast, effective code generation in a just-in-time java compiler. In ACM SIGPlAN Notices, Vol. 33. ACM, 280--290.

Digital Library

[3]

Felix Agakov, Edwin Bonilla, John Cavazos, Björn Franke, Grigori Fursin, Michael F. P. O’Boyle, John Thomson, Marc Toussaint, and Christopher K. I. Williams. 2006. Using machine learning to focus iterative optimization. In Proceedings of the International Symposium on Code Generation and Optimization. IEEE, 295--305.

Digital Library

[4]

Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. 1986. Compilers, Principles, Techniques. Addison Wesley.

Digital Library

[5]

Frances E. Allen. 1970. Control flow analysis. In ACM Sigplan Notices, Vol. 5. ACM, 1--19.

Digital Library

[6]

L. Almagor and K. D. Cooper. 2004. Finding effective compilation sequences. ACM SIGPLAN Notices 39, 7 (2004), 231--239. Retrieved from http://www.anc.ed.ac.uk/machine-learning/colo/repository/LCTES04.pdf.

Digital Library

[7]

George Almasi and David A. Padua. 2000. MaJIC: A MATLAB just-in-time compiler. In Proceedings of the International Workshop on Languages and Compilers for Parallel Computing. Springer, 68--81.

Digital Library

[8]

Ethem Alpaydin. 2014. Introduction to Machine Learning. MIT Press.

Digital Library

[9]

Martin Alt, Uwe Aßmann, and Hans Van Someren. 1994. Cosy compiler phase embedding with the cosy compiler model. In Proceedings of the International Conference on Compiler Construction. Springer, 278--293.

Digital Library

[10]

Jason Ansel, Cy Chan, Yee Lok Wong, Marek Olszewski, Qin Zhao, Alan Edelman, and Saman Amarasinghe. 2009. PetaBricks: A language and compiler for algorithmic choice. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’09). ACM, New York, NY, 38--49.

Digital Library

[11]

J. Ansel and S. Kamil. In Proceedings of the 23rd International Conference on Parallel Architectures and Compilation. 303--316.

Digital Library

[12]

Karl-Erik Årzén and Anton Cervin. 2005. Control and embedded computing: Survey of research directions. IFAC Proc. Vol. 38, 1 (2005), 191--202.

[13]

G. Ascia, V. Catania, M. Palesi, and D. Patti. 2005. A system-level framework for evaluating area/performance/power trade-offs of VLIW-based embedded systems. In Proceedings of the Design Automation Conference (ASP-DAC’05), Vol. 2. 940--943.

Digital Library

[14]

Yosi Ben Asher, Gadi Haber, and Esti Stein. 2017. A study of conflicting pairs of compiler optimizations. In Proceedings of the IEEE 11th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC’17). IEEE, 52--58.

[15]

A. H. Ashouri, G. Mariani, G. Palermo, and C. Silvano. 2014. A Bayesian network approach for compiler auto-tuning for embedded processors. In Proceedings of the IEEE Embedded Systems for Real-Time Multimedia (ESTIMedia). 90--97.

[16]

Amir Hossein Ashouri. 2012. Design space exploration methodology for compiler parameters in VLIW processors. Master’s thesis. M. Sc. Dissertation. Politecnico Di Milano, Italy. Retrieved from http://hdl.handle.net/10589/72083.

[17]

Amir Hossein Ashouri. 2016. Compiler Autotuning Using Machine Learning Techniques. Ph.D. Dissertation. Politecnico di Milano, Italy. Retrieved from http://hdl.handle.net/10589/129561.

[18]

Amir Hossein Ashouri, Andrea Bignoli, Gianluca Palermo, and Cristina Silvano. 2016. Predictive modeling methodology for compiler phase-ordering. In Proceedings of the 7th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and the 5th Workshop on Design Tools and Architectures For Multicore Embedded Computing Platforms (PARMA-DITAM’16). ACM, New York, NY, 7--12.

Digital Library

[19]

Amir H. Ashouri, Andrea Bignoli, Gianluca Palermo, Cristina Silvano, Sameer Kulkarni, and John Cavazos. 2017. MiCOMP: Mitigating the compiler phase-ordering problem using optimization sub-sequences and machine learning. ACM Trans. Archit. Code Optim. 14, 3, Article 29 (Sept. 2017).

Digital Library

[20]

Amir H. Ashouri, William Killian, John Cavazos, Gianluca Palermo, and Cristina Silvano. 2018. A survey on compiler autotuning using machine learning. arXiv preprint arXiv:1801.04405 (2018).

Digital Library

[21]

Amir Hossein Ashouri, Giovanni Mariani, Gianluca Palermo, Eunjung Park, John Cavazos, and Cristina Silvano. 2016. COBAYN: Compiler autotuning framework using bayesian networks. ACM Trans. Archit. Code Optim. 13, 2, Article 21 (June 2016).

Digital Library

[22]

Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. 2018. Automatic Tuning of Compilers Using Machine Learning. Springer International Publishing.

Digital Library

[23]

Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. 2018. Background. Springer International Publishing, Cham, 1--22.

[24]

Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. 2018. Design Space Exploration of Compiler Passes: A Co-Exploration Approach for the Embedded Domain. Springer International Publishing, Cham, 23--39.

[25]

Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. 2018. The Phase-Ordering Problem: A Complete Sequence Prediction Approach. Springer International Publishing, Cham, 85--113.

[26]

Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. 2018. The Phase-Ordering Problem: An Intermediate Speedup Prediction Approach. Springer International Publishing, Cham, 71--83.

[27]

Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. 2018. Selecting the Best Compiler Optimizations: A Bayesian Network Approach. Springer International Publishing, Cham, 41--70.

[28]

Amir Hossein Ashouri, Gianluca Palermo, and Cristina Silvano. An evaluation of autotuning techniques for the compiler optimization problems. In Proceedings of the Workshop on Resource Awareness and Application Autotuning in Adaptive and Heterogeneous Computing (RES4ANT’16), colocated with the Design Automation and Test in Europe Conference and Expo (DATE’16). 23--27. http://ceur-ws.org/Vol-1643/#paper-05

[29]

Amir Hossein Ashouri, Vittorio Zaccaria, Sotirios Xydis, Gianluca Palermo, and Cristina Silvano. 2013. A framework for Compiler Level statistical analysis over customized VLIW architecture. In Proceedings of the International Conference on Very Large Scale Integration (VLSI-SoC’13). 124--129.

[30]

Jose L. Ayala, Marisa López-Vallejo, David Atienza, Praveen Raghavan, Francky Catthoor, and Diederik Verkest. 2007. Energy-aware compilation and hardware design for VLIW embedded systems. Int. J. Embed. Syst. 3, 1--2 (2007), 73--82.

[31]

John Aycock. 2003. A brief history of just-in-time. ACM Comput. Surveys 35, 2 (2003), 97--113.

Digital Library

[32]

R. Babuka, P. J. Van der Veen, and U. Kaymak. 2002. Improved covariance estimation for Gustafson-Kessel clustering. In Proceedings of the 2002 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’02), Vol. 2. IEEE, 1081--1085.

[33]

David F. Bacon, Susan L. Graham, and Oliver J. Sharp. 1994. Compiler transformations for high-performance computing. Comput. Surveys 26, 4 (Dec. 1994), 345--420.

Digital Library

[34]

Victor R. Basil and Albert J. Turner. 1975. Iterative enhancement: A practical technique for software development. IEEE Trans. Softw. Eng. 4 (1975), 390--396.

Digital Library

[35]

Protonu Basu, Mary Hall, Malik Khan, Suchit Maindola, Saurav Muralidharan, Shreyas Ramalingam, Axel Rivera, Manu Shantharam, and Anand Venkat. 2013. Towards making autotuning mainstream. Int. J. High Perform. Comput. Appl. 27, 4 (2013), 379--393.

Digital Library

[36]

Protonu Basu, Samuel Williams, Brian Van Straalen, Leonid Oliker, Phillip Colella, and Mary Hall. 2017. Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers. Parallel Comput. 64 (2017), 50--64.

Digital Library

[37]

Mohamed-Walid Benabderrahmane, Louis-Noël Pouchet, Albert Cohen, and Cédric Bastoul. 2010. The polyhedral model is more widely applicable than you think. In Compiler Construction. Springer, 283--303.

Digital Library

[38]

Craig Blackmore, Oliver Ray, and Kerstin Eder. 2015. A logic programming approach to predict effective compiler settings for embedded software. Theory Pract. Logic Program. 15, 4--5 (2015), 481--494.

[39]

Craig Blackmore, Oliver Ray, and Kerstin Eder. 2017. Automatically tuning the GCC compiler to optimize the performance of applications running on the ARM cortex-M3. arXiv preprint arXiv:1703.08228 (2017).

[40]

Craig Blackmore, Oliver Ray, and Kerstin Eder. 2017. Automatically tuning the GCC compiler to optimize the performance of applications running on the ARM cortex-M3. CoRR abs/1703.08228 (2017). arxiv:1703.08228, retrieved from http://arxiv.org/abs/1703.08228.

[41]

Bruno Bodin, Luigi Nardi, M. Zeeshan Zia, Harry Wagstaff, Govind Sreekar Shenoy, Murali Emani, John Mawer, Christos Kotselidis, Andy Nisbet, Mikel Lujan et al. 2016. Integrating algorithmic parameters into benchmarking and design space exploration in 3D scene understanding. In Proceedings of the 2016 International Conference on Parallel Architectures and Compilation. ACM, 57--69.

Digital Library

[42]

François Bodin, Toru Kisuki, Peter Knijnenburg, Mike O’Boyle, and Erven Rohou. 1998. Iterative compilation in a non-linear optimisation space. In Proceedings of the Workshop on Profile and Feedback-Directed Compilation.

[43]

U. Bondhugula and M. Baskaran. 2008. Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In Proceedings of the International Conference on Compiler Construction. 132--146. Retrieved from http://link.springer.com/chapter/10.1007/978-3-540-78791-4.

Digital Library

[44]

U. Bondhugula and A. Hartono. 2008. A practical automatic polyhedral parallelizer and locality optimizer. (2008). Retrieved from http://dl.acm.org/citation.cfm?id=1375595.

Digital Library

[45]

Uday Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan. 2008. PLuTo: A practical and fully automatic polyhedral program optimization system. In Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation (PLDI’08). Citeseer.

[46]

Karsten M. Borgwardt and Hans-Peter Kriegel. 2005. Shortest-path kernels on graphs. In Proceedings of the 5th IEEE International Conference on Data Mining (ICDM’05). IEEE, 8--pp.

Digital Library

[47]

Rajkumar Buyya, Chee Shin Yeo, and Srikumar Venugopal. 2008. Market-oriented cloud computing: Vision, hype, and reality for delivering it services as computing utilities. In Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications (HPCC’08). IEEE, 5--13.

Digital Library

[48]

Gustavo Camps-Valls, Tatyana V. Bandos Marsheva, and Dengyong Zhou. 2007. Semi-supervised graph-based hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 45, 10 (2007), 3044--3054.

[49]

João Manuel Paiva Cardoso, José Gabriel de Figueiredo Coutinho, and Pedro C. Diniz. 2017. Embedded Computing for High Performance: Efficient Mapping of Computations Using Customization, Code Transformations and Compilation. Morgan Kaufmann.

[50]

J. Cavazos, C. Dubach, and F. Agakov. 2006. Automatic performance model construction for the fast software exploration of new hardware designs. In Proceedings of the 2006 International Conference on Compilers, Architecture and Synthesis for Embedded Systems. 24--34. Retrieved from http://dl.acm.org/citation.cfm?id=1176765.

Digital Library

[51]

J. Cavazos, G. Fursin, and F. Agakov. 2007. Rapidly selecting good compiler optimizations using performance counters. Proceedings of the International Symposium on Code Generation and Optimization (CGO’07). Retrieved from http://ieeexplore.ieee.org/xpls/abs.

Digital Library

[52]

J. Cavazos and J. E. B. Moss. 2004. Inducing heuristics to decide whether to schedule. ACM SIGPLAN Notices (2004). Retrieved from http://dl.acm.org/citation.cfm?id=996864.

Digital Library

[53]

J. Cavazos, J. E. B. Moss, and M. F. P. O’Boyle. 2006. Hybrid optimizations: Which optimization algorithm to use?Compiler Construction (2006). Retrieved from http://link.springer.com/chapter/10.1007/11688839.

Digital Library

[54]

J. Cavazos and M. F. P. O’Boyle. 2005. Automatic tuning of inlining heuristics. Proceedings of the ACM/IEEE SC 2005 Conference on Supercomputing. 14--14. Retrieved from http://ieeexplore.ieee.org/xpls/abs.

Digital Library

[55]

J. Cavazos and M. F. P. O’boyle. 2006. Method-specific dynamic compilation using logistic regression. ACM SIGPLAN Notices (2006). Retrieved from http://dl.acm.org/citation.cfm?id=1167492.

Digital Library

[56]

Gregory J. Chaitin, Marc A. Auslander, Ashok K. Chandra, John Cocke, Martin E. Hopkins, and Peter W. Markstein. 1981. Register allocation via coloring. Comput. Lang. 6, 1 (1981), 47--57.

Digital Library

[57]

Olivier Chapelle, Bernhard Scholkopf, and Alexander Zien. 2009. Semi-supervised learning (O. Chapelle et al., eds.). IEEE Trans. Neural Netw. 20, 3 (2009), 542--542.

Digital Library

[58]

C. Chen, J. Chame, and M. Hall. 2005. Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy. Proceedings of the International Symposium on Code Generation and Optimization. 111--122. Retrieved from http://ieeexplore.ieee.org/xpls/abs.

Digital Library

[59]

Chun Chen, Jacqueline Chame, and Mary Hall. 2008. CHiLL: A Framework for Composing High-level Loop Transformations. Technical report. Citeseer.

[60]

Yang Chen, Shuangde Fang, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Olivier Temam, and Chengyong Wu. 2012. Deconstructing iterative optimization. ACM Trans. Architect. Code Optim. 9, 3 (2012), 21.

Digital Library

[61]

Yang Chen, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Liang Peng, Olivier Temam, and Chengyong Wu. 2010. Evaluating iterative optimization across 1000 datasets. In Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’10). ACM, New York, NY, 448--459.

Digital Library

[62]

B. R. Childers and M. L. Soffa. 2005. A model-based framework: An approach for profit-driven optimization. In Proceedings of the International Symposium on Code Generation and Optimization. IEEE, 317--327.

Digital Library

[63]

Alton Chiu, Joseph Garvey, and Tarek S. Abdelrahman. 2015. Genesis: A language for generating synthetic training programs for machine learning. In Proceedings of the 12th ACM International Conference on Computing Frontiers. ACM, 8.

Digital Library

[64]

Cliff Click and Keith D. Cooper. 1995. Combining analyses, combining optimizations. ACM Trans. Program. Lang. Syst. 17, 2 (1995), 181--196.

Digital Library

[65]

Katherine Compton and Scott Hauck. 2002. Reconfigurable computing: A survey of systems and software. ACM Comput. Surveys 34, 2 (2002), 171--210.

Digital Library

[66]

Katherine E. Coons, Behnam Robatmili, Matthew E. Taylor, Bertrand A. Maher, Doug Burger, and Kathryn S. McKinley. 2008. Feature selection and policy optimization for distributed instruction placement using reinforcement learning. In Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques. ACM, 32--42.

Digital Library

[67]

K. D. Cooper, A. Grosul, and T. J. Harvey. 2005. ACME: Adaptive compilation made efficient. In ACM SIGPLAN Notices 40, 7 (2005), 69--77. Retrieved from http://dl.acm.org/citation.cfm?id=1065921.

Digital Library

[68]

K. Cooper, Timothy J. Harvey, Devika Subramanian, and Linda Torczon. 2002. Compilation Order Matters. Technical report.

[69]

K. D. Cooper, P. J. Schielke, and D. Subramanian. 1999. Optimizing for reduced code space using genetic algorithms. ACM SIGPLAN Notices. Retrieved from http://dl.acm.org/citation.cfm?id=314414.

Digital Library

[70]

K. D. Cooper, D. Subramanian, and L. Torczon. 2002. Adaptive optimizing compilers for the 21st Century. J. Supercomput. Retrieved from http://link.springer.com/article/10.1023/A:1015729001611.

Digital Library

[71]

Biagio Cosenza, Juan J. Durillo, Stefano Ermon, and Ben Juurlink. 2017. Stencil autotuning with ordinal regression: Extended abstract. In Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES’17). ACM, New York, NY, 72--75.

Digital Library

[72]

Chris Cummins, Pavlos Petoumenos, Michel Steuwer, and Hugh Leather. 2015. Autotuning OpenCL workgroup size for stencil patterns. arXiv preprint arXiv:1511.02490.

[73]

C. Cummins, P. Petoumenos, Z. Wang, and H. Leather. 2017. End-to-end deep learning of optimization heuristics. In Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques (PACT’17). 219--232.

[74]

Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. 2017. Synthesizing benchmarks for predictive modeling. In Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO’17). IEEE, 86--99.

[75]

Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and T. A. M. T. Meyarivan. 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6, 2 (2002), 182--197.

Digital Library

[76]

Thomas G. Dietterich. 2000. Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems. Springer, 1--15.

Digital Library

[77]

Y. Ding, J. Ansel, and K. Veeramachaneni. 2015. Autotuning algorithmic choice for input sensitivity. ACM SIGPLAN Notices 50, 6 (2015), 379--390. Retrieved from http://dl.acm.org/citation.cfm?id=2737969.

Digital Library

[78]

C. Dubach, J. Cavazos, and B. Franke. 2007. Fast compiler optimisation evaluation using code-feature based performance prediction. In Proceedings of the 4th International Conference on Computing Frontiers. 131--142. Retrieved from http://dl.acm.org/citation.cfm?id=1242553.

Digital Library

[79]

C. Dubach, T. M. Jones, and E. V. Bonilla. 2009. Portable compiler optimisation across embedded programs and microarchitectures using machine learning. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture. ACM, 78--88. Retrieved from http://dl.acm.org/citation.cfm?id=1669124.

Digital Library

[80]

Chris Eagle. 2011. The IDA Pro Book: The Unofficial Guide to the World’s Most Popular Disassembler. No Starch Press.

Digital Library

[81]

Hadi Esmaeilzadeh, Emily Blem, Renee St. Amant, Karthikeyan Sankaralingam, and Doug Burger. 2011. Dark silicon and the end of multicore scaling. In Proceedings of the 38th Annual International Symposium on Computer Architecture (ISCA’11). IEEE, 365--376.

Digital Library

[82]

Thomas L. Falch and Anne C. Elster. 2015. Machine learning based auto-tuning for enhanced opencl performance portability. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW’15). IEEE, 1231--1240.

Digital Library

[83]

S. Fang, W. Xu, Y. Chen, and L. Eeckhout. 2015. Practical iterative optimization for the data center. ACM Trans. Archit. Code Optim. 12, 2 (2015), 15. Retrieved from http://dl.acm.org/citation.cfm?id=2739048.

Digital Library

[84]

Paolo Faraboschi, Geoffrey Brown, Joseph A. Fisher, Giuseppe Desoli, and Fred Homewood. 2000. Lx: A technology platform for customizable VLIW embedded processing. In ACM SIGARCH Computer Architecture News, Vol. 28. ACM, 203--213.

Digital Library

[85]

Paul Feautrier. 1988. Parametric integer programming. RAIRO Rech. Opération. 22, 3 (1988), 243--268.

[86]

Jeanne Ferrante, Karl J. Ottenstein, and Joe D. Warren. 1987. The program dependence graph and its use in optimization. ACM Trans. Program. Lang. Syst. 9, 3 (July 1987), 319--349.

Digital Library

[87]

J. A. Fisher, P. Faraboschi, and C. Young. 2009. VLIW processors: Once blue sky, now commonplace. IEEE Solid-State Circ. Mag. 1, 2 (2009), 10--17.

[88]

Joseph A. Fisher. 1981. Microcode compaction. IEEE Trans. Comput. 30, 7 (1981).

Digital Library

[89]

Joseph A. Fisher, Paolo Faraboschi, and Cliff Young. 2004. Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools. Morgan Kaufmann.

Digital Library

[90]

Joseph A. Fisher, Paolo Faraboschi, and Cliff Young. 2005. Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools. Elsevier.

Digital Library

[91]

B. Franke, M. O’Boyle, J. Thomson, and G. Fursin. 2005. Probabilistic source-level optimisation of embedded programs. ACM SIGPLAN Notices (2005). Retrieved from http://dl.acm.org/citation.cfm?id=1065922.

Digital Library

[92]

Christopher W. Fraser. 1999. Automatic inference of models for statistical code compression. ACM SIGPLAN Notices 34, 5 (May 1999), 242--246.

Digital Library

[93]

Stefan M. Freudenberger and John C. Ruttenberg. 1992. Phase ordering of register allocation and instruction scheduling. In Code GenerationâĂŤ Concepts, Tools, Techniques. Springer, 146--170.

[94]

Jerome Friedman, Trevor Hastie, and Robert Tibshirani. 2001. The Elements of Statistical Learning. Vol. 1. Springer, Berlin.

[95]

Nir Friedman, Dan Geiger, and Moises Goldszmidt. 1997. Bayesian network classifiers. Mach. Learn. 29, 2--3 (1997), 131--163.

Digital Library

[96]

G. G. Fursin. 2004. Iterative compilation and performance prediction for numerical applications. Retrieved from https://www.era.lib.ed.ac.uk/handle/1842/565.

[97]

Grigori Fursin. 2010. Collective benchmark (cbench), a collection of open-source programs with multiple datasets assembled by the community to enable realistic benchmarking and research on program and architecture optimization. Retrieved from http://ctuning.org/wiki/index.php/CTools:CBench.

[98]

G. Fursin, J. Cavazos, M. O’Boyle, and O. Temam. 2007. Midatasets: Creating the conditions for a more realistic evaluation of iterative optimization. Proceedings of the International Conference on High-Performance Embedded Architectures and Compilers. 245--260. Retrieved from http://link.springer.com/chapter/10.1007/978-3-540-69338-3.

Digital Library

[99]

G. Fursin and A. Cohen. 2007. Building a practical iterative interactive compiler. Workshop Proceedings. Retrieved from https://www.researchgate.net/profile/Chuck.

[100]

G. Fursin, A. Cohen, M. O’Boyle, and O. Temam. 2005. A practical method for quickly evaluating program optimizations. Proceedings of the International Conference on High-Performance Embedded Architectures and Compilers. 29--46. Retrieved from http://link.springer.com/chapter/10.1007/11587514.

Digital Library

[101]

G. Fursin, Y. Kashnikov, and A. W. Memon. 2011. Milepost GCC: Machine learning enabled self-tuning compiler. Int. J. Parallel Program. 39, 3 (2011), 296--327. Retrieved from http://link.springer.com/article/10.1007/s10766-010-0161-2.

[102]

Grigori Fursin, Anton Lokhmotov, and Ed Plowman. 2016. Collective knowledge: Towards R&D sustainability. In Proceedings of the Design, Automation 8 Test in Europe Conference 8 Exhibition (DATE’16). IEEE, 864--869.

Digital Library

[103]

Grigori Fursin, Anton Lokhmotov, Dmitry Savenko, and Eben Upton. 2018. A collective knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques. arXiv preprint arXiv:1801.08024 (2018).

[104]

Grigori Fursin, Abdul Memon, Christophe Guillon, and Anton Lokhmotov. 2015. Collective mind, part II: Towards performance-and cost-aware software engineering as a natural science. arXiv preprint arXiv:1506.06256 (2015).

[105]

G. Fursin, C. Miranda, and O. Temam. 2008. MILEPOST GCC: Machine learning based research compiler. Proceedings of the GCC Summit. Retrieved from https://hal.inria.fr/inria-00294704/.

[106]

G. G. Fursin, M. F. P. O’Boyle, and P. M. W. Knijnenburg. 2002. Evaluating iterative compilation. Proceedings of the International Workshop on Languages and Compilers for Parallel Computing. 362--376. Retrieved from http://link.springer.com/chapter/10.1007/11596110.

Digital Library

[107]

Grigori Fursin and Olivier Temam. 2009. Collective optimization. In Proceedings of the International Conference on High-Performance Embedded Architectures and Compilers. Springer, 34--49.

Digital Library

[108]

G. Fursin and O. Temam. 2010. Collective optimization: A practical collaborative approach. ACM Trans. Architect. Code Optim. 7, 4 (2010), 20. Retrieved from http://dl.acm.org/citation.cfm?id=1880047.

Digital Library

[109]

Davide Gadioli, Ricardo Nobre, Pedro Pinto, Emanuele Vitali, Amir H. Ashouri, Gianluca Palermo, Cristina Silvano, and Joao Cardoso. 2018. SOCRATES—A seamless online compiler and system runtime autotuning framework for energy-aware applications. In Proceedings of the Design, Automation and Test in Europe Conference 8 Exhibition (DATE’18). 1143--1146.

[110]

Unai Garciarena and Roberto Santana. 2016. Evolutionary optimization of compiler flag selection by learning and exploiting flags interactions. In Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion (GECCO’16). ACM, New York, NY, 1159--1166.

Digital Library

[111]

Kyriakos Georgiou, Craig Blackmore, Samuel Xavier-de Souza, and Kerstin Eder. 2018. Less is more: Exploiting the standard compiler optimization levels for better performance and energy consumption. arXiv preprint arXiv:1802.09845 (2018).

Digital Library

[112]

Zhangxiaowen Gong, Zhi Chen, Justin Josef Szaday, David C. Wong, Zehra Sura, Neftali Watkinson, Saeed Maleki, David Padua, Alexandru Nicolau, Alexander V. Veidenbaum et al. 2018. An empirical study of the effect of source-level transformations on compiler stability. In Proceedings of the Workshop on Compilers for Parallel Computing (CPC’18).

[113]

Richard L. Gorsuch. 1988. Exploratory factor analysis. In Handbook of Multivariate Experimental Psychology. Springer, 231--258.

[114]

Scott Grauer-Gray, Lifan Xu, Robert Searles, Sudhee Ayalasomayajula, and John Cavazos. 2012. Auto-tuning a high-level language targeted to GPU codes. In Proceedings of the Conference on Innovative Parallel Computing (InPar’12). IEEE, 1--10.

[115]

M. Hall, D. Padua, and K. Pingali. 2009. Compiler research: The next 50 years. Commun. ACM (2009). Retrieved from http://dl.acm.org/citation.cfm?id=1461946.

Digital Library

[116]

M. Haneda. 2005. Optimizing general purpose compiler optimization. Proceedings of the 2nd Conference on Computing frontiers (2005), 180--188. Retrieved from http://dl.acm.org/citation.cfm?id=1062293.

Digital Library

[117]

Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2009. Unsupervised learning. In The Elements of Statistical Learning. Springer, 485--585.

[118]

Jeffrey Hightower and Gaetano Borriello. 2001. A survey and taxonomy of location systems for ubiquitous computing. IEEE Comput. 34, 8 (2001), 57--66.

Digital Library

[119]

Geoffrey Hinton, Li Deng, Dong Yu, George E. Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N. Sainath et al. 2012. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Process. Mag. 29, 6 (2012), 82--97.

[120]

Torsten Hoefler and Roberto Belli. 2015. Scientific benchmarking of parallel computing systems: Twelve ways to tell the masses when reporting performance results. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 73.

Digital Library

[121]

Kenneth Hoste and Lieven Eeckhout. 2007. Microarchitecture-independent workload characterization. IEEE Micro 27, 3 (2007), 63--72.

Digital Library

[122]

K. Hoste and L. Eeckhout. 2008. Cole: Compiler optimization level exploration. In Proceedings of the 6th Annual IEEE/ACM International Symposium on Code Generation and Optimization. 165--174. Retrieved from http://dl.acm.org/citation.cfm?id=1356080.

Digital Library

[123]

K. Hoste, A. Georges, and L. Eeckhout. 2010. Automated just-in-time compiler tuning. Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization (2010), 62--72. Retrieved from http://dl.acm.org/citation.cfm?id=1772965.

Digital Library

[124]

Ronald A. Howard. 1966. “Dynamic programming.” Management Science 12, 5 (1966), 317--348.

Digital Library

[125]

Texas Instruments. 2012. Pandaboard. OMAP4430 SoC Dev. Board 2 (2012).

[126]

Kazuaki Ishizaki, Akihiro Hayashi, Gita Koblents, and Vivek Sarkar. 2015. Compiling and optimizing java 8 programs for gpu execution. In Proceedings of the International Conference on Parallel Architecture and Compilation (PACT’15). IEEE, 419--431.

Digital Library

[127]

Michael R. Jantz and Prasad A. Kulkarni. 2013. Exploiting phase inter-dependencies for faster iterative compiler optimization phase order searches. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES’13). IEEE, 1--10.

Digital Library

[128]

Sverre Jarp. 2002. A Methodology for Using the Itanium 2 Performance Counters for Bottleneck Analysis. Technical report, HP Labs.

[129]

Brian Jeff. 2012. Big. LITTLE system architecture from ARM: Saving power through heterogeneous multiprocessing and task context migration. In Proceedings of the 49th Annual Design Automation Conference. ACM, 1143--1146.

[130]

Richard Arnold Johnson and Dean W. Wichern. 2002. Applied Multivariate Statistical Analysis. Vol. 5. Prentice Hall, Upper Saddle River, NJ.

[131]

Bill Joy, Guy Steele, James Gosling, and Gilad Bracha. 2000. Java (TM) Language Specification. Addisson-Wesley.

[132]

Agnieszka Kamiadska and Wacodzimierz Bielecki. 2016. Statistical models to accelerate software development by means of iterative compilation. Comput. Sci. 17, 3 (2016), 407. Retrieved from https://journals.agh.edu.pl/csci/article/view/1800.

[133]

Christos Kartsaklis, Oscar Hernandez, Chung-Hsing Hsu, Thomas Ilsche, Wayne Joubert, and Richard L. Graham. 2012. HERCULES: A pattern driven code transformation system. In Proceedings of the IEEE 26th International Parallel and Distributed Processing Symposium Workshops 8 PhD Forum (IPDPSW’12). IEEE, 574--583.

[134]

Christos Kartsaklis, Eunjung Park, and John Cavazos. 2014. HSLOT: The HERCULES scriptable loop transformations engine. In Proceedings of the 4th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing. IEEE Press, 31--41.

[135]

Vasilios Kelefouras. 2017. A methodology pruning the search space of six compiler transformations by addressing them together as one problem and by exploiting the hardware architecture details. Computing (2017), 1--24.

[136]

William Killian, Renato Miceli, Eunjung Park, Marco Alvarez, and John Cavazos. 2014. Performance improvement in kernels by guiding compiler auto-vectorization heuristics. PRACE-RI.EU. Retrieved from http://www.prace-ri.eu/IMG/pdf/WP183.pdf.

[137]

Kyoung-jae Kim. 2003. Financial time series forecasting using support vector machines. Neurocomputing 55, 1 (2003), 307--319.

[138]

Anton Kindestam. 2017. Graph-based features for machine learning driven code optimization. Master of science Dissertation, KTH, urn:nbn:se:kth:diva-211444.

[139]

Toru Kisuki, P. Knijnenburg, M. O’Boyle, and H. Wijshoff. 2000. Iterative compilation in program optimization. In Proceedings of the Conference on Compilers for Parallel Computers (CPC’10). Citeseer, 35--44.

[140]

Toru Kisuki, Peter M. W. Knijnenburg, Mike F. P. O’Boyle, François Bodin, and Harry A. G. Wijshoff. 1999. A feasibility study in iterative compilation. In High Performance Computing. Springer, 121--132.

[141]

Toru Kisuki, Peter M. W. Knijnenburg, and Michael F. P. O’Boyle. 2000. Combined selection of tile sizes and unroll factors using iterative compilation. In Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (PACT’00). 237--248.

[142]

P. M. W. Knijnenburg, T. Kisuki, and M. F. P. O’Boyle. 2003. Combined selection of tile sizes and unroll factors using iterative compilation. J. Supercomput. 24 (2003), 43--67.

[143]

A. Koseki. 1997. A method for estimating optimal unrolling times for nested loops. In Proceedings of the 3rd International Symposium on Parallel Architectures, Algorithms, and Networks (I-SPAN’97). 376--382. Retrieved from http://ieeexplore.ieee.org/xpls/abs.

[144]

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems. Curran Associates, Inc., 1097--1105.

[145]

David C. Ku and Giovanni De Micheli. 1992. Design space exploration. In High Level Synthesis of ASICs under Timing and Synchronization Constraints. Springer, 83--111.

[146]

P. Kulkarni, S. Hines, and J. Hiser. 2004. Fast searches for effective optimization phase sequences. ACM SIGPLAN Notices 39, 6 (2004), 171--182. Retrieved from http://dl.acm.org/citation.cfm?id=996863.

[147]

Prasad A. Kulkarni, Michael R. Jantz, and David B. Whalley. 2010. Improving both the performance benefits and speed of optimization phase sequence searches. In ACM Sigplan Notices, Vol. 45. ACM, 95--104.

[148]

Prasad A. Kulkarni, David B. Whalley, and Gary S. Tyson. 2007. Evaluating heuristic optimization phase order search algorithms. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’07). IEEE, 157--169.

[149]

Prasad A. Kulkarni, David B. Whalley, Gary S. Tyson, and Jack W. Davidson. 2006. Exhaustive optimization phase order space exploration. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’06). IEEE, 13.

[150]

Prasad A. Kulkarni, David B. Whalley, Gary S. Tyson, and Jack W. Davidson. 2009. Practical exhaustive optimization phase order exploration and evaluation. ACM Trans. Archit. Code Optim. 6, 1, Article 1 (Apr. 2009).

[151]

S. Kulkarni and J. Cavazos. 2012. Mitigating the compiler optimization phase-ordering problem using machine learning. ACM SIGPLAN Notices (2012). Retrieved from http://dl.acm.org/citation.cfm?id=2384628.

[152]

S. Kulkarni and J. Cavazos. 2013. Automatic construction of inlining heuristics using machine learning. Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO’13). 1--12. Retrieved from http://ieeexplore.ieee.org/xpls/abs.

[153]

T. Satish Kumar, S. Sakthivel, S. Sushil Kumar, and N. Arun. 2014. Compiler phase ordering and optimizing MPI runtime parameters using heuristic algorithms on SMPs. Int. J. Appl. Eng. Res. 9, 24 (2014), 30831--30851.

[154]

Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’04). IEEE, 75--86.

[155]

H. Leather, E. Bonilla, and M. O’Boyle. 2009. Automatic feature generation for machine learning based optimizing compilation. Proceedings of the International Symposium on Code Generation and Optimization (CGO’09). 81--91. Retrieved from http://ieeexplore.ieee.org/xpls/abs.

[156]

Bruce W. Leverett, Roderic Geoffrey Galton Cattell, Steven O. Hobbs, Joseph M. Newcomer, Andrew H. Reiner, Bruce R. Schatz, and William A. Wulf. 1979. An Overview of the Production Quality Compiler-Compiler Project. Carnegie Mellon University, Department of Computer Science.

[157]

Fengqian Li, Feilong Tang, and Yao Shen. 2014. Feature mining for machine learning based compilation optimization. Proceedings of the 8th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS’14). 207--214.

[158]

Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, and Norman P. Jouppi. 2009. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’09). IEEE, 469--480.

[159]

Y. Li, J. Dongarra, and S. Tomov. 2009. A note on auto-tuning GEMM for GPUs. Proceedings of the Conference on Computational Science (ICCS’09). Retrieved from http://link.springer.com/chapter/10.1007/978-3-642-01970-8.

[160]

Hui Liu, Rongcai Zhao, Qi Wang, and Yingying Li. 2018. ALIC: A low overhead compiler optimization prediction model. Wireless Personal Commun. Springer.

[161]

Vincent Loechner. 1999. PolyLib: A library for manipulating parameterized polyhedra. Retrieved from http://icps.u-strasbg.fr/PolyLib/.

[162]

P. Lokuciejewski and F. Gedikli. 2009. Automatic WCET reduction by machine learning based heuristics for function inlining. Proceedings of the 3rd Workshop on Statistical and Machine Learning Approaches to Architectures and Compilation (SMART’09). 1--15. Retrieved from https://www.researchgate.net/profile/Peter.

[163]

Paul Lokuciejewski, Sascha Plazar, Heiko Falk, Peter Marwedel, and Lothar Thiele. 2010. Multi-objective exploration of compiler optimizations for real-time systems. Proceedings of the 13th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing (ISORC’10), vol. 1, 115--122.

[164]

David B. Loveman. 1977. Program improvement by source-to-source transformation. J. ACM 24, 1 (1977), 121--145.

[165]

Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. 2005. Pin: Building customized program analysis tools with dynamic instrumentation. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’05). ACM, New York, NY, 190--200.

[166]

L. Luo, Y. Chen, C. Wu, S. Long, and G. Fursin. 2014. Finding representative sets of optimizations for adaptive multiversioning applications. arXiv preprint arXiv:1407.4075 (2014). Retrieved from http://arxiv.org/abs/1407.4075.

[167]

Scott A. Mahlke, David C. Lin, William Y. Chen, Richard E. Hank, and Roger A. Bringmann. 1992. Effective compiler support for predicated execution using the hyperblock. In ACM SIGMICRO Newsletter, Vol. 23. IEEE Computer Society Press, 45--54.

[168]

Giovanni Mariani, Aleksandar Brankovic, Gianluca Palermo, Jovana Jovic, Vittorio Zaccaria, and Cristina Silvano. 2010. A correlation-based design space exploration methodology for multi-processor systems-on-chip. In Proceedings of the 47th ACM/IEEE Design Automation Conference (DAC’10). IEEE, 120--125.

[169]

J. Mars and R. Hundt. 2009. Scenario based optimization: A framework for statically enabling online optimizations. Proceedings of the 7th Annual IEEE/ACM International Symposium on Code Generation and Optimization. Retrieved from http://dl.acm.org/citation.cfm?id=1545068.

[170]

L. G. A. Martins and R. Nobre. 2016. Clustering-based selection for the exploration of compiler optimization sequences. ACM Trans. Architect. Code Optim. 13, 1 (2016), 28. Retrieved from http://dl.acm.org/citation.cfm?id=2883614.

[171]

Luiz G. A. Martins, Ricardo Nobre, Alexandre C. B. Delbem, Eduardo Marques, and João M. P. Cardoso. 2014. Exploration of compiler optimization sequences using clustering-based selection. In ACM SIGPLAN Notices, Vol. 49. ACM, 63--72.

[172]

Amy McGovern, Eliot Moss, and Andrew G. Barto. 1999. Scheduling straight-line code using reinforcement learning and rollouts. Technical report no. 99-23 (1999).

[173]

Amy McGovern, Eliot Moss, and Andrew G. Barto. 2002. Building a basic block instruction scheduler with reinforcement learning and rollouts. Mach. Learn. 49, 2--3 (2002), 141--160.

[174]

Abdul Wahid Memon and Grigori Fursin. 2013. Crowdtuning: Systematizing auto-tuning using predictive modeling and crowdsourcing. In Proceedings of the PARCO Mini-Symposium on Application Autotuning for HPC (Architectures).

[175]

R. Miceli, G. Civario, A. Sikora, and E. César. 2012. Autotune: A plugin-driven approach to the automatic tuning of parallel applications. Proceedings of the International Workshop on Applied Parallel Computing. 328--342. Retrieved from http://link.springer.com/chapter/10.1007/978-3-642-36803-5.

[176]

MinIR 2011. MINimal IR space. Retrieved from http://www.assembla.com/wiki/show/minir-dev.

[177]

Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. 2012. Foundations of Machine Learning. MIT Press.

[178]

A. Monsifrot, F. Bodin, and R. Quiniou. 2002. A machine learning approach to automatic production of compiler heuristics. Proceedings of the International Conference on Artificial Intelligence: Methodology, Systems, and Applications (2002), 41--50. Retrieved from http://link.springer.com/chapter/10.1007/3-540-46148-5.

[179]

Thierry Moreau, Anton Lokhmotov, and Grigori Fursin. 2018. Introducing ReQuEST: An open platform for reproducible and quality-efficient systems-ML tournaments. CoRR abs/1801.06378. arxiv:1801.06378. Retrieved from http://arxiv.org/abs/1801.06378.

[180]

Eliot Moss, Paul Utgoff, John Cavazos, Doina Precup, D Stefanovic, Carla Brodley, and David Scheeff. 1998. Learning to schedule straight-line code. Adv. Neural Info. Process. Syst. 10 (1998), 929--935. Retrieved from http://books.nips.cc/papers/files/nips10/0929.pdf.

[181]

Paschalis Mpeis, Pavlos Petoumenos, and Hugh Leather. 2015. Iterative compilation on mobile devices. CoRR abs/1511.02603. Retrieved from http://arxiv.org/abs/1511.02603.

[182]

Philip J. Mucci, Shirley Browne, Christine Deane, and George Ho. 1999. PAPI: A portable interface to hardware performance counters. In Proceedings of the Department of Defense HPCMP Users Group Conference. 7--10.

[183]

M. Namolaru, A. Cohen, and G. Fursin. 2010. Practical aggregation of semantical program properties for machine learning based optimization. In Proceedings of the 2010 International Conference on Compilers, Architectures and Synthesis for Embedded Systems. Retrieved from http://dl.acm.org/citation.cfm?id=1878951.

[184]

Ricardo Nobre, Luís Reis, and João M. P. Cardoso. 2016. Compiler phase ordering as an orthogonal approach for reducing energy consumption. In Proceedings of the 19th Workshop on Compilers for Parallel Computing (CPC’16).

[185]

Ricardo Nobre, Luiz G. A. Martins, and Joao M. P. Cardoso. 2015. Use of previously acquired positioning of optimizations for phase ordering exploration. In Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems. ACM, 58--67.

[186]

Ricardo Nobre, Luiz G. A. Martins, and João M. P. Cardoso. 2016. A graph-based iterative compiler pass selection and phase ordering approach. ACM SIGPLAN Notices. 21--30.

[187]

Ricardo Nobre, Luís Reis, and João M. P. Cardoso. 2018. Impact of compiler phase ordering when targeting GPUs. In Proceedings of the Parallel Processing Workshops (Euro-Par’17), Dora B. Heras and Luc Bougé (Eds.). Springer International Publishing, Cham, 427--438.

[188]

Hirotaka Ogawa, Kouya Shimura, Satoshi Matsuoka, Fuyuhiko Maruyama, Yukihiko Sohda, and Yasunori Kimura. 2000. OpenJIT: An open-ended, reflective JIT compiler framework for Java. In Proceedings of the European Conference on Object-Oriented Programming. Springer, 362--387.

[189]

William F. Ogilvie, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. 2017. Minimizing the cost of iterative compilation with active learning. In Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO’17). IEEE, 245--256.

[190]

Karl Joseph Ottenstein. 1978. Data-flow graphs as an intermediate program form. Ph.D. Dissertation. Purdue University.

[191]

David A. Padua and Michael J. Wolfe. 1986. Advanced compiler optimizations for supercomputers. Commun. ACM 29, 12 (1986), 1184--1201.

[192]

G. Palermo, C. Silvano, S. Valsecchi, and V. Zaccaria. 2003. A system-level methodology for fast multi-objective design space exploration. In Proceedings of the 13th ACM Great Lakes Symposium on VLSI. ACM, 92--95.

[193]

Gianluca Palermo, Cristina Silvano, and Vittorio Zaccaria. 2005. Multi-objective design space exploration of embedded systems. J. Embed. Comput. 1, 3 (2005), 305--316.

[194]

Gianluca Palermo, Cristina Silvano, and Vittorio Zaccaria. 2005. Multi-objective design space exploration of embedded systems. J. Embed. Comput. 1, 3 (2005), 305--316.

[195]

James Pallister, Simon J. Hollis, and Jeremy Bennett. 2013. Identifying compiler options to minimize energy consumption for embedded platforms. Comput. J. 58, 1 (2013), 95--109.

[196]

Z. Pan and R. Eigenmann. 2004. Rating compiler optimizations for automatic performance tuning. Proceedings of the 2004 ACM/IEEE Conference on Supercomputing. 14. Retrieved from http://dl.acm.org/citation.cfm?id=1049958.

[197]

Zhelong Pan and Rudolf Eigenmann. 2006. Fast and effective orchestration of compiler optimizations for automatic performance tuning. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’06). IEEE, 12.

[198]

E. Park, J. Cavazos, and M. A. Alvarez. 2012. Using graph-based program characterization for predictive modeling. Proceedings of the International Symposium on Code Generation and Optimization. 295--305. Retrieved from http://dl.acm.org/citation.cfm?id=2259042.

[199]

E. Park, J. Cavazos, and L. N. Pouchet. 2013. Predictive modeling in a polyhedral optimization space. International J. Parallel Program. 41, 5 (2013), 704--750. Retrieved from http://link.springer.com/article/10.1007/s10766-013-0241-1.

[200]

Eunjung Park, Christos Kartsaklis, and John Cavazos. 2014. HERCULES: Strong patterns towards more intelligent predictive modeling. Proceedings of the 43rd International Conference on Parallel Processing. 172--181.

[201]

E. Park, S. Kulkarni, and J. Cavazos. 2011. An evaluation of different modeling techniques for iterative compilation. In Proceedings of the 14th International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES'11). ACM, 65--74. Retrieved from http://dl.acm.org/citation.cfm?id=2038711.

[202]

Eun Jung Park. 2015. Automatic selection of compiler optimizations using program characterization and machine learning. Ph.D. Dissertation, University of Delaware, USA.

[203]

David A. Patterson and John L. Hennessy. 2013. Computer Organization and Design: The Hardware/Software Interface. Newnes.

[204]

Judea Pearl. 1985. Bayesian Networks: A Model of Self-activated Memory for Evidential Reasoning (UCLA Technical report no. CSD-850017). Proceedings of the 7th Conference of the Cognitive Science Society, vol. 3, 329--334.

[205]

Leslie Pérez Cáceres, Federico Pagnozzi, Alberto Franzin, and Thomas Stützle. 2018. Automatic configuration of GCC using irace. In Artificial Evolution, Evelyne Lutton, Pierrick Legrand, Pierre Parrend, Nicolas Monmarché, and Marc Schoenauer (Eds.). Springer International Publishing, Cham, 202--216.

[206]

R. P. J. Pinkers, P. M. W. Knijnenburg, M. Haneda, and H. A. G. Wijshoff. 2004. Statistical selection of compiler options. Proceedings of the IEEE Computer Society’s Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems (MASCOTS’04). 494--501.

[207]

L. L. Pollock and M. L. Soffa. 1990. Incremental global optimization for faster recompilations. In Proceedings of the International Conference on Computer Languages. 281--290.

[208]

L. N. Pouchet and C. Bastoul. 2007. Iterative optimization in the polyhedral model. Part I: One-dimensional time. Proceedings of the International Symposium on Code Generation and Optimization (CGO’07). 144--156. Retrieved from http://ieeexplore.ieee.org/xpls/abs.

[209]

L. N. Pouchet, C. Bastoul, A. Cohen, and J. Cavazos. 2008. Iterative optimization in the polyhedral model. Part II: Multidimensional time. ACM SIGPLAN Notices. Retrieved from http://dl.acm.org/citation.cfm?id=1375594.

[210]

L. N. Pouchet and U. Bondhugula. 2010. Combined iterative and model-driven optimization in an automatic parallelization framework. Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis. 1--11. Retrieved from http://dl.acm.org/citation.cfm?id=1884672.

[211]

Louis-Noël Pouchet. 2012. Polybench: The polyhedral benchmark suite. Retrieved from http://www.cs.ucla.edu/~pouchet/software/polybench/.

[212]

S. Purini and L. Jain. 2013. Finding good optimization sequences covering program space. ACM Trans. Architect. Code Optim. 9, 4 (2013), 56. Retrieved from http://dl.acm.org/citation.cfm?id=2400715.

[213]

Matthieu Stéphane Benoit Queva. 2007. Phase-ordering in optimizing compilers. MS thesis. Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark.

[214]

Francesco Ricci, Lior Rokach, and Bracha Shapira. 2011. Introduction to recommender systems handbook. In Recommender Systems Handbook. Springer, 1--35.

[215]

Ranjit K. Roy. 2001. Design of Experiments Using the Taguchi Approach: 16 Steps to Product and Process Improvement. Wiley-Interscience.

[216]

T. Rusira, M. Hall, and P. Basu. 2017. Automating compiler-directed autotuning for phased performance behavior. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW’17). 1362--1371.

[217]

Ricardo Nabinger Sanchez, Jose Nelson Amaral, Duane Szafron, Marius Pirvu, and Mark Stoodley. 2011. Using machines to learn method-specific compilation strategies. In Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization. 257--266. Retrieved from http://dl.acm.org/citation.cfm?id=2190072.

[218]

Vivek Sarkar. 1997. Automatic selection of high-order transformations in the IBM XL FORTRAN compilers. IBM J. Res. Dev. 41, 3 (1997), 233--264.

[219]

V. Sarkar. 2000. Optimized unrolling of nested loops. Proceedings of the 14th International Conference on Supercomputing. 153--166. Retrieved from http://dl.acm.org/citation.cfm?id=335246.

[220]

Robert R. Schaller. 1997. Moore’s law: Past, present, and future. IEEE Spectrum 34, 6 (1997), 52--59.

[221]

E. Schkufza, R. Sharma, and A. Aiken. 2014. Stochastic optimization of floating-point programs with tunable precision. ACM SIGPLAN Notices. Retrieved from http://dl.acm.org/citation.cfm?id=2594302.

[222]

Jürgen Schmidhuber. 2015. Deep learning in neural networks: An overview. Neural Netw. 61 (2015), 85--117.

[223]

Paul B. Schneck. 1973. A survey of compiler optimization techniques. In Proceedings of the ACM Annual Conference. ACM, 106--113.

[224]

Bernhard Schölkopf. 2001. The kernel trick for distances. In Advances in Neural Information Processing Systems. Curran Associates, Inc., 301--307.

[225]

Cristina Silvano, Giovanni Agosta, Andrea Bartolini, Andrea Beccari, Luca Benini, Joao M. P. Cardoso, Carlo Cavazzoni, Radim Cmar, Jan Martinovic, Gianluca Palermo et al. 2015. ANTAREX--AutoTuning and adaptivity approach for energy efficient eXascale HPC systems. In Proceedings of the IEEE 18th International Conference on Computational Science and Engineering (CSE’15). IEEE, 343--346.

[226]

Cristina Silvano, Giovanni Agosta, Andrea Bartolini, Andrea R. Beccari, Luca Benini, João Bispo, Radim Cmar, João M. P. Cardoso, Carlo Cavazzoni, Jan Martinovič et al. 2016. AutoTuning and adaptivity appRoach for energy efficient eXascale HPC systems: The ANTAREX approach. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE’16). IEEE, 708--713.

[227]

Cristina Silvano, Giovanni Agosta, Stefano Cherubin, Davide Gadioli, Gianluca Palermo, Andrea Bartolini, Luca Benini, Jan Martinovič, Martin Palkovič, Kateřina Slaninová et al. 2016. The ANTAREX approach to autotuning and adaptivity for energy efficient HPC systems. In Proceedings of the ACM International Conference on Computing Frontiers. ACM, 288--293.

[228]

Cristina Silvano, Andrea Bartolini, Andrea Beccari, Candida Manelfi, Carlo Cavazzoni, Davide Gadioli, Erven Rohou, Gianluca Palermo, Giovanni Agosta, Jan Martinovič et al. 2017. The ANTAREX tool flow for monitoring and autotuning energy efficient HPC systems. In Proceedings of the International Conference on Embedded Computer Systems: Architecture, Modeling, and Simulation (SAMOS’17).

[229]

Cristina Silvano, William Fornaciari, Gianluca Palermo, Vittorio Zaccaria, Fabrizio Castro, Marcos Martinez, Sara Bocchio, Roberto Zafalon, Prabhat Avasare, Geert Vanmeerbeeck et al. 2011. Multicube: Multi-objective design space exploration of multi-core architectures. In Proceedings of the VLSI 2010 Annual Symposium. Springer, 47--63.

[230]

Cristina Silvano, Gianluca Palermo, Giovanni Agosta, Amir H. Ashouri, Davide Gadioli, Stefano Cherubin, Emanuele Vitali, Luca Benini, Andrea Bartolini, Daniele Cesarini, Joao Cardoso, Joao Bispo, Pedro Pinto, Riccardo Nobre, Erven Rohou, Loïc Besnard, Imane Lasri, Nico Sanna, Carlo Cavazzoni, Radim Cmar, Jan Martinovič, Kateřina Slaninová, Martin Golasowski, Andrea R. Beccari, and Candida Manelfi. 2018. Autotuning and adaptivity in energy efficient HPC systems: The ANTAREX toolbox. In Proceedings of the Computing Frontiers Conference. ACM.

[231]

Bernard W. Silverman. 1986. Density Estimation for Statistics and Data Analysis. Vol. 26. CRC Press.

[232]

Richard Stallman. 2001. Using and porting the GNU compiler collection. In MIT Artificial Intelligence Laboratory. Citeseer.

[233]

Richard M. Stallman et al. 2003. Using GCC: the GNU compiler collection reference manual. GNU Press.

[234]

Kenneth O. Stanley. 2002. Efficient reinforcement learning through evolving neural network topologies. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO’02). Citeseer.

[235]

M. W. Stephenson. 2006. Automating the construction of compiler heuristics using machine learning. Retrieved from http://groups.csail.mit.edu/commit/papers/2006/stephenson_phdthesis.pdf.

[236]

M. Stephenson and S. Amarasinghe. 2003. Meta optimization: Improving compiler heuristics with machine learning. ACM SIGPLAN Notices 38, 5 (2003), 77--90. Retrieved from http://dl.acm.org/citation.cfm?id=781141.

[237]

M. Stephenson and S. Amarasinghe. 2005. Predicting unroll factors using supervised classification. In Proceedings of the International Symposium on Code Generation and Optimization.

[238]

M. Stephenson and U. M. O’Reilly. 2003. Genetic programming applied to compiler heuristic optimization. Proceedings of the European Conference on Genetic Programming. 238--253. Retrieved from http://link.springer.com/chapter/10.1007/3-540-36599-0.

[239]

Ralph E. Steuer. 1986. Multiple Criteria Optimization: Theory, Computation, and Applications. Wiley.

[240]

K. Stock, L. N. Pouchet, and P. Sadayappan. 2012. Using machine learning to improve automatic vectorization. ACM Trans. Architect. Code Optim. 8, 4 (2012), 50. Retrieved from http://dl.acm.org/citation.cfm?id=2086729.

[241]

Toshio Suganuma, Takeshi Ogasawara, Mikio Takeuchi, Toshiaki Yasue, Motohiro Kawahito, Kazuaki Ishizaki, Hideaki Komatsu, and Toshio Nakatani. 2000. Overview of the IBM Java just-in-time compiler. IBM Syst. J. 39, 1 (2000), 175--193.

[242]

Cristian Ţăpuş, I-Hsin Chung, Jeffrey K. Hollingsworth et al. 2002. Active harmony: Towards automated performance tuning. In Proceedings of the 2002 ACM/IEEE Conference on Supercomputing. IEEE Computer Society Press, 1--11.

[243]

Michele Tartara and Stefano Crespi Reghizzi. 2012. Parallel iterative compilation: Using MapReduce to speedup machine learning in compilers. In Proceedings of the 3rd International Workshop on MapReduce and Its Applications. ACM, 33--40.

[244]

Michele Tartara and Stefano Crespi Reghizzi. 2013. Continuous learning of compiler heuristics. ACM Trans. Architect. Code Optim. 9, 4 (2013), 46.

[245]

Michele Tartara and Stefano Crespi Reghizzi. 2013. Continuous learning of compiler heuristics. ACM Trans. Archit. Code Optim. 9, 4, Article 46 (Jan. 2013).

[246]

Gerald Tesauro and Gregory R. Galperin. 1996. On-line policy improvement using monte-carlo search. In Proceedings of the Conference on Neural Information Processing Systems (NIPS’96), Vol. 96. 1068--1074.

[247]

Bruce Thompson. 2002. Statistical, practical, and clinical: How many kinds of significance do counselors need to consider? J. Counsel. Dev. 80, 1 (2002), 64--71.

[248]

J. Thomson, M. O’Boyle, G. Fursin, and B. Franke. 2009. Reducing training time in a one-shot machine learning-based compiler. Proceedings of the International Workshop on Languages and Compilers for Parallel Computing. 399--407. Retrieved from http://link.springer.com/10.1007.

[249]

A. Tiwari, C. Chen, and J. Chame. 2009. A scalable auto-tuning framework for compiler optimization. In Proceedings of the IEEE International Symposium on Parallel & Distributed Processing (IPDPS’09). 1--12. Retrieved from http://ieeexplore.ieee.org/xpls/abs.

[250]

Georgios Tournavitis, Zheng Wang, Björn Franke, and Michael F. P. O’Boyle. 2009. Towards a holistic approach to auto-parallelization: Integrating profile-driven parallelism detection and machine-learning based mapping. ACM Sigplan Not. 44, 6 (2009), 177--187.

[251]

S. Triantafyllis, M. Vachharajani, N. Vachharajani, and D. I. August. 2003. Compiler optimization-space exploration. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’03). IEEE, 204--215.

[252]

Eben Upton and Gareth Halfacree. 2014. Raspberry Pi User Guide. John Wiley & Sons.

[253]

K. Vaswani. 2007. Microarchitecture sensitive empirical models for compiler optimizations. International Symposium on Code Generation and Optimization (CGO’07) (2007), 131--143. Retrieved from http://ieeexplore.ieee.org/xpls/abs.

[254]

Steven R. Vegdahl. 1982. Phase coupling and constant generation in an optimizing microcode compiler. ACM SIGMICRO Newslett. 13, 4 (1982), 125--133.

[255]

Richard Vuduc, James W. Demmel, and Jeff A. Bilmes. 2004. Statistical models for empirical search-based performance tuning. Int. J. High Perform. Comput. Appl. 18, 1 (2004), 65--94.

[256]

Richard W. Vuduc. 2011. Autotuning. Springer, Boston, MA, 102--105.

[257]

David W. Wall. 1991. Limits of Instruction-level Parallelism. Vol. 19. ACM.

[258]

Wei Wang, John Cavazos, and Allan Porterfield. 2014. Energy auto-tuning using the polyhedral approach. In Proceedings of the Workshop on Polyhedral Compilation Techniques.

[259]

Z. Wang and M. F. P. O’Boyle. 2009. Mapping parallelism to multi-cores: A machine learning based approach. ACM Sigplan Not. 44, 4 (2009), 75--84. Retrieved from http://dl.acm.org/citation.cfm?id=1504189.

[260]

Todd Waterman. 2006. Adaptive Compilation and Inlining. Ph.D. Dissertation, Rice University.

[261]

Deborah Whitfield and Mary Lou Soffa. 1991. Automatic generation of global optimizers. In ACM SIGPLAN Notices, Vol. 26. ACM, 120--129.

[262]

D. Whitfield and M. L. Soffa. 1990. An approach to ordering optimizing transformations. In Proceedings of the 2nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP’90), Vol. 25. ACM Press, New York, New York, 137--146.

[263]

Deborah L. Whitfield and Mary Lou Soffa. 1997. An approach for exploring code improving transformations. ACM Trans. Program. Lang. Syst. 19, 6 (Nov. 1997), 1053--1084.

[264]

Doran K. Wilde. 1993. A Library for Doing Polyhedral Operations. Technical report no. 785. IRISA.

[265]

Samuel Williams, Andrew Waterman, and David Patterson. 2009. Roofline: An insightful visual performance model for multicore architectures. Commun. ACM 52, 4 (2009), 65--76.

[266]

Robert P. Wilson, Robert S. French, Christopher S. Wilson, Saman P. Amarasinghe, Jennifer M. Anderson, Steve W. K. Tjiang, Shih-Wei Liao, Chau-Wen Tseng, Mary W. Hall, Monica S. Lam et al. 1994. SUIF: An infrastructure for research on parallelizing and optimizing compilers. ACM Sigplan Not. 29, 12 (1994), 31--37.

[267]

M. I. Wolczko and D. M. Ungar. 2000. Method and apparatus for improving compiler performance during subsequent compilations of a source program. U.S. Patent No. 6,078,744. Retrieved from https://www.google.com/patents/US6078744.

[268]

Stephan Wong, Thijs Van As, and Geoffrey Brown. 2008. ρ-VEX: A reconfigurable and extensible softcore VLIW processor. In Proceedings of the International Conference on ICECE Technology (FPT’08). IEEE, 369--372.

[269]

William Allan Wulf, Richard K. Johnsson, Charles B. Weinstock, Steven O. Hobbs, and Charles M. Geschke. 1975. The Design of an Optimizing Compiler. Elsevier Science Inc.

[270]

T. Yuki, V. Basupalli, G. Gupta, G. Iooss, and D. Kim. 2012. Alphaz: A system for analysis, transformation, and code generation in the polyhedral equational model. Retrieved from http://www.cs.colostate.edu/TechReports/Reports/2012/tr12-101.pdf.

[271]

T. Yuki, G. Gupta, D. G. Kim, T. Pathan, and S. Rajopadhye. 2012. AlphaZ: A system for design space exploration in the polyhedral model. In Proceedings of the International Workshop on Languages and Compilers for Parallel Computing. 17--31. Retrieved from http://people.rennes.inria.fr/Tomofumi.Yuki/papers/yuki-lcpc2012.pdf.

[272]

Vittorio Zaccaria, Gianluca Palermo, Fabrizio Castro, Cristina Silvano, and Giovanni Mariani. 2010. Multicube explorer: An open source framework for design space exploration of chip multi-processors. In Proceedings of the 23rd International Conference on Architecture of Computing Systems (ARCS’10). VDE, 1--7.

[273]

Min Zhao, Bruce Childers, and Mary Lou Soffa. 2003. Predicting the impact of optimizations for embedded systems. In Proceedings of the 2003 ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded Systems (LCTES’03), Vol. 38. ACM Press, New York, New York, 1.

[274]

Min Zhao, Bruce R. Childers, and Mary Lou Soffa. 2005. A model-based framework: An approach for profit-driven optimization. In Proceedings of the International Symposium on Code Generation and Optimization. IEEE, 317--327.

Index Terms

A Survey on Compiler Autotuning using Machine Learning
1. Computing methodologies
  1. Machine learning
2. Software and its engineering
  1. Software notations and tools
    1. Compilers

Information & Contributors

Information

Published In

ACM Computing Surveys Volume 51, Issue 5

September 2019

791 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/3271482

Editor:
Sartaj Sahni
Department of Computer and Information Science and Engineering

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 September 2018

Accepted: 01 March 2018

Revised: 01 February 2018

Received: 01 November 2016

Published in CSUR Volume 51, Issue 5

Qualifiers

Survey
Research
Refereed

Funding Sources

EU Commission H2020-FET-HPC program

Cited By

Hussein M, Ghallab M, Badr N, Attia F, Abdelwahab M. 2024. Aerobic exercise versus acupuncture on the quality of life in women suffering from irritable bowel syndrome. Fizjoterapia Polska 24, 2, 259--265. Online publication date: 20-Jun-2024.
https://doi.org/10.56984/8ZG56086EL
Quetschlich N, Burgholzer L, Wille R. 2024. MQT Predictor: Automatic Device Selection with Device-Specific Circuit Compilation for Quantum Computing. ACM Transactions on Quantum Computing. Online publication date: 17-Jun-2024.
https://dl.acm.org/doi/10.1145/3673241
Gao W, Zhang X, Huang S, Guo S, Sun P, Wen Y, Zhang T. 2024. AutoSched: An Adaptive Self-configured Framework for Scheduling Deep Learning Training Workloads. Proceedings of the 38th ACM International Conference on Supercomputing, 473--484. Online publication date: 30-May-2024.
https://dl.acm.org/doi/10.1145/3650200.3656598
Canesche M, Rosário V, Borin E, Quintão Pereira F. 2024. The Droplet Search Algorithm for Kernel Scheduling. ACM Transactions on Architecture and Code Optimization 21, 2, 1--28. Online publication date: 21-May-2024.
https://dl.acm.org/doi/10.1145/3650109
Zhu M, Hao D, Chen J. 2024. Compiler Autotuning through Multiple-phase Learning. ACM Transactions on Software Engineering and Methodology 33, 4, 1--38. Online publication date: 11-Jan-2024.
https://dl.acm.org/doi/10.1145/3640330
Poppe O, Arora P, Sharma S, Chen J, Pandit S, Sawhney R, Jhalani V, Lang W, Guo Q, Inumella A, Sridhar S, Gala D, Rathi N, Oslake M, Chirica A, Iyer S, Goel P, Kalhan A, Barcelo P, Sanchez-Pi N, Meliou A, Sudarshan S. 2024. Proactive Resume and Pause of Resources for Microsoft Azure SQL Database Serverless. Companion of the 2024 International Conference on Management of Data, 227--240. Online publication date: 9-Jun-2024.
https://dl.acm.org/doi/10.1145/3626246.3653371
Sajjadinasab R, Rastaghi H, Shahzad H, Arora S, Drepper U, Herbordt M. 2024. Further Optimizations and Analysis of Smith-Waterman with Vector Extensions. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 561--570. Online publication date: 27-May-2024.
https://doi.org/10.1109/IPDPSW63119.2024.00113
Nikith B, Reeha S, Reddy G, Belwal M. 2024. An Experimental Analysis of RL based Compiler Optimization Techniques using Compiler GYM. 2024 IEEE 9th International Conference for Convergence in Technology (I2CT), 1--6. Online publication date: 5-Apr-2024.
https://doi.org/10.1109/I2CT61223.2024.10543642
Castro R, Andrade D, Fraguela B. 2024. STuning-DL: Model-Driven Autotuning of Sparse GPU Kernels for Deep Learning. IEEE Access 12, 70581--70599. Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3402326
Ni Y, Du X, Song L, Xiao R, Ye P, Wang J. 2024. A Two-Stage LLVM Option Sequence Optimization Method to Minimize Energy Consumption. Swarm and Evolutionary Computation 88, 101591. Online publication date: Jul-2024.
https://doi.org/10.1016/j.swevo.2024.101591
