research-article

Open access

SpEQ: Translation of Sparse Codes using Equivalences

Authors:

Nikolaj Bjørner,

Maryam Mehri Dehnavi

Proceedings of the ACM on Programming Languages, Volume 8, Issue PLDI

Article No.: 215, Pages 1680 - 1703

https://doi.org/10.1145/3656445

Published: 20 June 2024

Abstract

We present SpEQ, a quick and correct strategy for detecting semantics in sparse codes and enabling automatic translation to high-performance library calls or domain-specific languages (DSLs). Sparse linear algebra codes often contain implicit preconditions about how data is stored, and these preconditions hamper direct translation. SpEQ identifies the high-level computation along with the storage details and related preconditions. A run-time check guards the translation and ensures that the required preconditions are met. We implement SpEQ using the LLVM framework, the Z3 solver, and the egglog library, and correctly translate sparse linear algebra codes into calls to two high-performance libraries, NVIDIA cuSPARSE and Intel MKL, and into OpenMP (OMP). We evaluate SpEQ on ten diverse benchmarks against two state-of-the-art translation tools. SpEQ achieves geometric mean speedups of 3.25×, 5.09×, and 8.04× on the OpenMP, MKL, and cuSPARSE backends, respectively. SpEQ is the only tool that can guarantee the correct translation of sparse computations.
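The idea of a run-time check guarding a translated call can be illustrated with a small sketch. This is not SpEQ's generated code or its actual set of preconditions: the function names (`spmv_loop`, `csr_preconditions_hold`, `library_spmv`, `guarded_spmv`) are hypothetical, the preconditions shown are simply plausible ones for a CSR matrix, and `library_spmv` is a pure-Python stand-in for a call into a library such as MKL or cuSPARSE.

```python
from typing import List, Sequence


def spmv_loop(n_rows: int, row_ptr: Sequence[int], col_idx: Sequence[int],
              vals: Sequence[float], x: Sequence[float]) -> List[float]:
    """Original-style sparse matrix-vector loop (the code to be translated)."""
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y


def csr_preconditions_hold(n_rows: int, n_cols: int, row_ptr: Sequence[int],
                           col_idx: Sequence[int], vals: Sequence[float]) -> bool:
    """Assumed run-time check: does the data follow a well-formed CSR layout?"""
    if len(row_ptr) != n_rows + 1 or row_ptr[0] != 0:
        return False
    # Row pointers must be non-decreasing.
    if any(row_ptr[i] > row_ptr[i + 1] for i in range(n_rows)):
        return False
    nnz = row_ptr[-1]
    if len(col_idx) < nnz or len(vals) < nnz:
        return False
    # Every stored column index must address a valid column.
    return all(0 <= col_idx[k] < n_cols for k in range(nnz))


def library_spmv(n_rows: int, row_ptr: Sequence[int], col_idx: Sequence[int],
                 vals: Sequence[float], x: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a high-performance library SpMV call."""
    return [sum(vals[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
            for i in range(n_rows)]


def guarded_spmv(n_rows: int, n_cols: int, row_ptr: Sequence[int],
                 col_idx: Sequence[int], vals: Sequence[float],
                 x: Sequence[float]) -> List[float]:
    """Take the translated path only when the preconditions are met."""
    if len(x) == n_cols and csr_preconditions_hold(n_rows, n_cols, row_ptr, col_idx, vals):
        return library_spmv(n_rows, row_ptr, col_idx, vals, x)
    # Otherwise fall back to the original loop, preserving its behavior.
    return spmv_loop(n_rows, row_ptr, col_idx, vals, x)
```

In this sketch the fallback path keeps the original loop's behavior on inputs that violate the assumed layout, so the guarded version never changes results; it only changes which implementation produces them.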

Index Terms

SpEQ: Translation of Sparse Codes using Equivalences
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
  2. Software notations and tools
    1. Compilers
      1. Translator writing systems and compiler generators

Information

Published In

Proceedings of the ACM on Programming Languages Volume 8, Issue PLDI

June 2024

2198 pages

EISSN:2475-1421

DOI:10.1145/3554317

Editor:
Michael Hicks
Amazon, USA

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

NSERC
