research-article
Open access

Static stages for heterogeneous programming

Published: 12 October 2017

Abstract

Heterogeneous hardware is central to modern advances in performance and efficiency. Mainstream programming models for heterogeneous architectures, however, sacrifice safety and expressiveness in favor of low-level control over performance details. The interfaces between hardware units consist of verbose, unsafe APIs; hardware-specific languages make it difficult to move code between units; and brittle preprocessor macros complicate the task of specializing general code for efficient accelerated execution. We propose a unified low-level programming model for heterogeneous systems that offers control over performance, safe communication constructs, cross-device code portability, and hygienic metaprogramming for specialization. The language extends constructs from multi-stage programming to separate code for different hardware units, to communicate between them, and to express compile-time code optimization. We introduce static staging, a different take on multi-stage programming that lets the compiler generate all code and communication constructs ahead of time.
To demonstrate our approach, we use static staging to implement BraidGL, a real-time graphics programming language for CPU-GPU systems. Current real-time graphics software in OpenGL uses stringly-typed APIs for communication and unsafe preprocessing to generate specialized GPU code variants. In BraidGL, programmers instead write hybrid CPU-GPU software in a unified language. The compiler statically generates target-specific code and guarantees safe communication between the CPU and the graphics pipeline stages. Example scenes demonstrate the language's productivity advantages: BraidGL eliminates the safety and expressiveness pitfalls of OpenGL and makes common specialization techniques easy to apply. The case study demonstrates how static staging can express core placement and specialization in general heterogeneous programming.

Supplementary Material

Auxiliary Archive (oopsla17-oopsla83-aux.zip)

Published In

Proceedings of the ACM on Programming Languages, Volume 1, Issue OOPSLA
October 2017
1786 pages
EISSN:2475-1421
DOI:10.1145/3152284
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 October 2017
Published in PACMPL Volume 1, Issue OOPSLA

Author Tags

  1. Multi-stage programming
  2. OpenGL
  3. graphics programming
  4. heterogeneous programming

Qualifiers

  • Research-article

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 85
  • Downloads (last 6 weeks): 17
Reflects downloads up to 15 Oct 2024

Cited By

  • (2024) RenderKernel: High-level programming for real-time rendering systems. Visual Informatics 8:3 (82-95). DOI: 10.1016/j.visinf.2024.09.004. Online publication date: Sep-2024
  • (2022) Supporting Unified Shader Specialization by Co-opting C++ Features. Proceedings of the ACM on Computer Graphics and Interactive Techniques 5:3 (1-17). DOI: 10.1145/3543866. Online publication date: 27-Jul-2022
  • (2022) Optimum Scheduling and Routing of Material Through Computational Techniques. Industry 4.0 and Advanced Manufacturing (49-59). DOI: 10.1007/978-981-19-0561-2_5. Online publication date: 24-Jul-2022
  • (2020) Multi-stage programming in the large with staged classes. Proceedings of the 19th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (35-49). DOI: 10.1145/3425898.3426961. Online publication date: 16-Nov-2020
  • (2020) Fluid quotes: metaprogramming across abstraction boundaries with dependent types. Proceedings of the 19th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (98-110). DOI: 10.1145/3425898.3426953. Online publication date: 16-Nov-2020
  • (2019) Staged metaprogramming for shader system development. ACM Transactions on Graphics 38:6 (1-15). DOI: 10.1145/3355089.3356554. Online publication date: 8-Nov-2019
  • (2019) Specialization Opportunities in Graphical Workloads. 2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT) (272-283). DOI: 10.1109/PACT.2019.00029. Online publication date: Sep-2019
