DOI: 10.1145/3445814.3446712

A compiler infrastructure for accelerator generators

Published: 17 April 2021

Abstract

We present Calyx, a new intermediate language (IL) for compiling high-level programs into hardware designs. Calyx combines a hardware-like structural language with a software-like control-flow representation that includes loops and conditionals. This split representation enables a new class of hardware-focused optimizations that require both structural and control-flow information; such optimizations are crucial for high-level programming models for hardware design. The Calyx compiler lowers control-flow constructs using finite-state machines and generates synthesizable hardware descriptions.
We have implemented Calyx in an optimizing compiler that translates high-level programs to hardware. We demonstrate Calyx using two DSL-to-RTL compilers: a systolic array generator and a compiler for a recent imperative accelerator language. We compare both to equivalent designs generated using high-level synthesis (HLS). The systolic arrays are 4.6× faster and 1.11× larger on average than HLS implementations, and the HLS-like imperative language compiler is within a few factors of a highly optimized commercial HLS toolchain. We also describe three optimizations implemented in the Calyx compiler.
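The idea of lowering control flow to a finite-state machine can be illustrated with a minimal Python sketch. This is not Calyx syntax and not the Calyx compiler's algorithm: the `Enable` and `Seq` classes, the `lower_seq` and `simulate` functions, the group names, and the cycle latencies are all hypothetical, chosen only to show how a sequential control program might become a linear FSM in which each state enables one group and advances when that group signals done.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Enable:
    """Run one named group until it signals done (hypothetical node)."""
    group: str


@dataclass
class Seq:
    """Run child statements one after another (hypothetical node)."""
    stmts: List["Control"]


Control = Union[Enable, Seq]


def lower_seq(ctrl: Control) -> List[str]:
    """Flatten nested Seq nodes into an ordered list of FSM states.

    State i enables the group states[i]; the FSM moves to state i + 1
    when that group's done signal is high. Reaching len(states) means
    the whole program has finished.
    """
    if isinstance(ctrl, Enable):
        return [ctrl.group]
    states: List[str] = []
    for stmt in ctrl.stmts:
        states.extend(lower_seq(stmt))
    return states


def simulate(states: List[str], latency: Dict[str, int]) -> List[str]:
    """Return the group enabled on each clock cycle.

    latency maps each group to an assumed number of cycles before its
    done signal goes high; these numbers are illustrative only.
    """
    trace: List[str] = []
    state, elapsed = 0, 0
    while state < len(states):
        group = states[state]
        trace.append(group)
        elapsed += 1
        if elapsed >= latency[group]:  # done is high: advance the FSM
            state, elapsed = state + 1, 0
    return trace


program = Seq([Enable("load"), Seq([Enable("mul"), Enable("add")]), Enable("store")])
fsm_states = lower_seq(program)
trace = simulate(fsm_states, {"load": 1, "mul": 3, "add": 1, "store": 1})
print(fsm_states)
print(trace)
```

In this sketch the structural side (what each group does) is left opaque and only the schedule is modeled. Loops and conditionals would add branching transitions to the FSM, which the sketch omits.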

    Published In

    ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
    April 2021
    1090 pages
    ISBN: 9781450383172
    DOI: 10.1145/3445814

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Accelerator Design
    2. Intermediate Language

    Qualifiers

    • Research-article

    Conference

    ASPLOS '21

    Acceptance Rates

    Overall Acceptance Rate 535 of 2,713 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 493
    • Downloads (last 6 weeks): 82
    Reflects downloads up to 09 Nov 2024

    Cited By

    • (2024) Unifying Static and Dynamic Intermediate Languages for Accelerator Generators. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (2242-2267). DOI: 10.1145/3689790. Online publication date: 8-Oct-2024
    • (2024) Wavefront Threading Enables Effective High-Level Synthesis. Proceedings of the ACM on Programming Languages 8:PLDI (1066-1090). DOI: 10.1145/3656420. Online publication date: 20-Jun-2024
    • (2024) Allo: A Programming Model for Composable Accelerator Design. Proceedings of the ACM on Programming Languages 8:PLDI (593-620). DOI: 10.1145/3656401. Online publication date: 20-Jun-2024
    • (2024) Application-level Validation of Accelerator Designs Using a Formal Software/Hardware Interface. ACM Transactions on Design Automation of Electronic Systems 29:2 (1-25). DOI: 10.1145/3639051. Online publication date: 14-Feb-2024
    • (2024) Cement: Streamlining FPGA Hardware Design with Cycle-Deterministic eHDL and Synthesis. Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (211-222). DOI: 10.1145/3626202.3637561. Online publication date: 1-Apr-2024
    • (2024) HIDA: A Hierarchical Dataflow Compiler for High-Level Synthesis. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1 (215-230). DOI: 10.1145/3617232.3624850. Online publication date: 27-Apr-2024
    • (2024) Leveraging MLIR for Efficient Irregular-Shaped CGRA Overlay Design: (PhD Forum Paper). 2024 IEEE 35th International Conference on Application-specific Systems, Architectures and Processors (ASAP) (204-205). DOI: 10.1109/ASAP61560.2024.00048. Online publication date: 24-Jul-2024
    • (2024) High-Level Synthesis. FPGA EDA (113-134). DOI: 10.1007/978-981-99-7755-0_8. Online publication date: 1-Feb-2024
    • (2023) Compiler Technologies in Deep Learning Co-Design: A Survey. Intelligent Computing 2. DOI: 10.34133/icomputing.0040. Online publication date: 19-Jun-2023
    • (2023) Modular Hardware Design with Timeline Types. Proceedings of the ACM on Programming Languages 7:PLDI (343-367). DOI: 10.1145/3591234. Online publication date: 6-Jun-2023
