DOI: 10.1145/3617232.3624873
research-article
Open access

Automatic Generation of Vectorizing Compilers for Customizable Digital Signal Processors

Published: 17 April 2024

Abstract

Embedded applications extract the best power-performance trade-off from digital signal processors (DSPs) by making extensive use of vectorized execution. Rather than handwriting the many customized kernels these applications use, DSP engineers rely on auto-vectorizing compilers to quickly produce effective code. Building these compilers is a large and error-prone investment, and each new DSP architecture or application-specific ISA customization must repeat this effort to derive a new high-performance compiler.
We present Isaria, a framework for automatically generating vectorizing compilers for DSP architectures. Isaria uses equality saturation to search for vectorized DSP code using a system of rewrite rules. Rather than hand-crafting these rules, Isaria automatically synthesizes sound rewrite rules from an ISA specification, discovers phase structure within these rules that improves compilation performance, and schedules their application at compile time while pruning intermediate states of the search. We use Isaria to generate a compiler for an industrial DSP architecture, and show that the resulting kernels outperform existing DSP libraries by up to 6.9× and are competitive with those generated by expert-built compilers. We also demonstrate how Isaria can speed up exploration of new ISA customizations by automatically generating a high-quality vectorizing compiler.
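To make the idea of searching for vectorized code with rewrite rules concrete, the following is a minimal Python sketch. It is not Isaria's implementation: a real equality-saturation engine stores equivalent terms compactly in an e-graph, whereas this toy simply enumerates the set of terms reachable through the rules and then extracts the cheapest one. The two rules, the operator names (`Vec`, `VecAdd`), and the cost table are all hypothetical and chosen only for illustration.

```python
# Terms are nested tuples (op, child, ...); leaves are plain strings.
# Hypothetical per-operator costs: a vector add is cheaper than two scalar adds.
COST = {"+": 2, "Vec": 1, "VecAdd": 1}


def rewrite_root(term):
    """Yield terms obtained by applying one toy rule at the root of `term`."""
    if not isinstance(term, tuple):
        return
    op, *args = term
    # Rule 1 (commutativity): (+ a b) => (+ b a)
    if op == "+" and len(args) == 2:
        yield ("+", args[1], args[0])
    # Rule 2 (lane packing): (Vec (+ a b) (+ c d)) => (VecAdd (Vec a c) (Vec b d))
    if op == "Vec" and len(args) == 2 and all(
        isinstance(a, tuple) and a[0] == "+" and len(a) == 3 for a in args
    ):
        (_, a, b), (_, c, d) = args
        yield ("VecAdd", ("Vec", a, c), ("Vec", b, d))


def rewrite_anywhere(term):
    """Yield every term reachable by one rule application at any position."""
    yield from rewrite_root(term)
    if isinstance(term, tuple):
        op, *args = term
        for i, child in enumerate(args):
            for new_child in rewrite_anywhere(child):
                yield (op, *args[:i], new_child, *args[i + 1:])


def saturate(start, max_terms=1000):
    """Collect equivalent terms until no rule adds a new one (or a size cap hits)."""
    seen = {start}
    frontier = [start]
    while frontier and len(seen) < max_terms:
        term = frontier.pop()
        for new in rewrite_anywhere(term):
            if new not in seen:
                seen.add(new)
                frontier.append(new)
    return seen


def cost(term):
    """Sum the hypothetical operator costs over the whole term; leaves are free."""
    if not isinstance(term, tuple):
        return 0
    return COST[term[0]] + sum(cost(c) for c in term[1:])


def best(start):
    """Extract a cheapest term from the saturated set."""
    return min(saturate(start), key=cost)


if __name__ == "__main__":
    scalar = ("Vec", ("+", "a0", "b0"), ("+", "a1", "b1"))
    print(cost(scalar), "->", cost(best(scalar)), best(scalar))
```

The sketch separates rule application (`saturate`) from choosing a result (`best`), which mirrors the usual split between saturation and extraction. The explicit enumeration grows quickly with more rules, which is one reason practical systems rely on e-graphs, rule scheduling, and pruning.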

Cited By

  • MCFOF: Multi-Compilation Fusion Optimization Framework using ROSE Compiler Infrastructure. 2024 8th International Workshop on Control Engineering and Advanced Algorithms (IWCEAA), pages 95-102, 2024. DOI: 10.1109/IWCEAA63616.2024.10823767. Online publication date: 1 November 2024.

Published In

ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1
April 2024
494 pages
ISBN: 9798400703720
DOI: 10.1145/3617232
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. vectorization
  2. DSPs
  3. equality saturation
  4. rewrite rules
  5. program synthesis

Qualifiers

  • Research-article

Funding Sources

  • NSF

Conference

ASPLOS '24

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Article Metrics

  • Downloads (last 12 months): 1,218
  • Downloads (last 6 weeks): 162
Reflects downloads up to 13 Jan 2025.
