Minotaur: A SIMD-Oriented Synthesizing Superoptimizer

Published: 08 October 2024

Abstract

A superoptimizing compiler, one that performs a meaningful search of the program space as part of the optimization process, can find optimization opportunities that are missed by even the best existing optimizing compilers. We created Minotaur: a superoptimizer for LLVM that uses program synthesis to improve its code generation, focusing on integer and floating-point SIMD code. On an Intel Cascade Lake processor, Minotaur achieves an average speedup of 7.3% on the GNU Multiple Precision library (GMP)’s benchmark suite, with a maximum speedup of 13%. On SPEC CPU 2017, our superoptimizer produces an average speedup of 1.5%, with a maximum speedup of 4.5% for 638.imagick. Every optimization produced by Minotaur has been formally verified, and several optimizations that it has discovered have been implemented in LLVM as a result of our work.

Published In

Proceedings of the ACM on Programming Languages  Volume 8, Issue OOPSLA2
October 2024
2691 pages
EISSN:2475-1421
DOI:10.1145/3554319
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. SIMD
  2. peephole optimization
  3. program synthesis
  4. superoptimization

Qualifiers

  • Research-article

Funding Sources

  • NSF
