Pushing the Level of Abstraction of Digital System Design: A Survey on How to Program FPGAs

Authors:

Emanuele Del Sozzo,

Davide Conficconi,

Donatella Sciuto,

Marco D. Santambrogio

ACM Computing Surveys, Volume 55, Issue 5

Article No.: 106, Pages 1 - 48

https://doi.org/10.1145/3532989

Published: 03 December 2022

Abstract

Field Programmable Gate Arrays (FPGAs) are spatial architectures with a heterogeneous reconfigurable fabric. They are the state of the art for prototyping, telecommunications, and embedded systems, and an emerging alternative for cloud-scale acceleration. However, FPGA adoption has been limited by how hard the devices are to program and by the hardware expertise they require. Researchers have therefore focused on FPGA abstractions and automation tools. Here, we survey three leading digital design abstractions: Hardware Description Languages (HDLs), High-Level Synthesis (HLS) tools, and Domain-Specific Languages (DSLs). We review these abstraction solutions, provide a timeline, and propose a taxonomy for each abstraction trend: programming models for HDLs; Intellectual Property (IP)-based or System-based toolchains for HLS; and application, architecture, and infrastructure domains for DSLs.

References

[1]

ACE. 2017. CoSy compiler development system. Retrieved from http://www.ace.nl/compiler/cosy.

[2]

Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. 1986. Compilers: Principles, Techniques, and Tools. Addison Wesley.

[3]

Alon Amid, David Biancolin, Abraham Gonzalez, Daniel Grubb, Sagar Karandikar, Harrison Liew, Albert Magyar, Howard Mao, Albert Ou, Nathan Pemberton, Paul Rigge, Colin Schmidt, John Wright, Jerry Zhao, Yakun Sophia Shao, Krste Asanović, and Borivoje Nikolić. 2020. Chipyard: Integrated design, simulation, and implementation framework for custom SoCs. IEEE Micro 40, 4 (2020), 10–21.

Digital Library

[4]

Alon Amid, David Biancolin, Abraham Gonzalez, Daniel Grubb, Sagar Karandikar, Harrison Liew, Albert Magyar, Howard Mao, Albert Ou, Nathan Pemberton, Paul Rigge, Colin Schmidt, John Wright, Jerry Zhao, Yakun Sophia Shao, Krste Asanović, and Borivoje Nikolić. 2020. Chipyard: Integrated design, simulation, and implementation framework for custom SoCs. IEEE Micro 40, 4 (2020), 10–21.

Digital Library

[5]

Krste Asanovic, Rimas Avizienis, Jonathan Bachrach, Scott Beamer, David Biancolin, Christopher Celio, Henry Cook, Daniel Dabbelt, John Hauser, Adam Izraelevitz, et al. 2016. The rocket chip generator. EECS Department, University of California, Berkeley, Technical Report UCB/EECS-2016-17.

[6]

Krste Asanović and David A. Patterson. 2014. Instruction sets should be free: The case for RISC-V. EECS Department, University of California, Berkeley, Technical Report UCB/EECS-2014-146.

[7]

Christiaan Baaij, Matthijs Kooijman, Jan Kuper, Arjan Boeijink, and Marco Gerards. 2010. C\(\lambda\)ash: Structural descriptions of synchronous hardware using haskell. In Proceedings of the 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools. IEEE, 714–721.

[8]

Jonathan Babb, Martin Rinard, Csaba Andras Moritz, Walter Lee, Matthew Frank, Rajeev Barua, and Saman Amarasinghe. 1999. Parallelizing applications into silicon. In Proceedings of the 7th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. IEEE, 70–80.

[9]

J. Bachrach, H. Vo, B. Richards, Y. Lee, A. Waterman, R. Avizienis, J. Wawrzynek, and K. Asanovic. 2012. Chisel: Constructing hardware in a Scala embedded language. In Proceedings of the DAC Design Automation Conference. 1212–1221.

Digital Library

[10]

John Backus. 1978. Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. Commun. ACM 21, 8 (1978), 613–641.

Digital Library

[11]

Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Abdurrahman Akkas, Yunming Zhang, Patricia Suriana, Shoaib Kamil, and Saman Amarasinghe. 2019. Tiramisu: A polyhedral compiler for expressing fast and portable code. In Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO’19). IEEE.

[12]

Jonathan Balkind, Michael McKeown, Yaosheng Fu, Tri Nguyen, Yanqi Zhou, Alexey Lavrov, Mohammad Shahrad, Adi Fuchs, Samuel Payne, Xiaohua Liang, et al. 2016. OpenPiton: An open source manycore research framework. ACM SIGPLAN Notices 51, 4 (2016), 217–232.

Digital Library

[13]

Shunning Jiang Christopher Torng Christopher Batten. 2018. An open-source python-based hardware generation, simulation, and verification framework. In Proceedings of the Workshop on Open-Source EDA Technology (WOSET’18). 1–5.

[14]

Peter Bellows and Brad Hutchings. 1998. JHDL—An HDL for reconfigurable systems. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines. IEEE.

Digital Library

[15]

Pavel Benácek, Viktor Pu, and Hana Kubátová. 2016. P4-to-VHDL: Automatic generation of 100 Gbps packet parsers. In Proceedings of the IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM’16). IEEE.

[16]

Endri Bezati, Marco Mattavelli, and Jorn W. Janneck. 2013. High-level synthesis of dataflow programs for signal processing systems. In Proceedings of the 8th International Symposium on Image and Signal Processing and Analysis (ISPA). IEEE.

[17]

Bjarne Stroustrup. 2007. Bjarne Stroustrup’s C++ Glossary. Retrieved from https://www.stroustrup.com/glossary.html#Gpolymorphism.

[18]

Bruno Bodin, Luigi Nardi, M. Zeeshan Zia, Harry Wagstaff, Govind Sreekar Shenoy, Murali Emani, John Mawer, Christos Kotselidis, Andy Nisbet, Mikel Lujan, Björn Franke, Paul H. J. Kelly, and Michael O’Boyle. 2016. Integrating algorithmic parameters into benchmarking and design space exploration in 3D scene understanding. In Proceedings of the International Conference on Parallel Architectures and Compilation (PACT’16). ACM, New York, NY, 57–69.

[19]

Thomas Bollaert. 2008. Catapult synthesis: A practical introduction to interactive C synthesis. In High-Level Synthesis. Springer, 29–52.

[20]

Pat Bosshart, Dan Daly, Glen Gibb, Martin Izzard, Nick McKeown, Jennifer Rexford, Cole Schlesinger, Dan Talayco, Amin Vahdat, George Varghese, et al. 2014. P4: Programming protocol-independent packet processors. ACM SIGCOMM Comput. Commun. Rev. 44, 3 (2014), 87–95.

Digital Library

[21]

Thomas Bourgeat, Clément Pit-Claudel, and Adam Chlipala. 2020. The essence of Bluespec: A core language for rule-based hardware design. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation. 243–257.

Digital Library

[22]

Andrew Boutros and Vaughn Betz. 2021. FPGA architecture: Principles and progression. IEEE Circ. Syst. Mag. 21, 2 (2021), 4–29.

[23]

Jakub Cabal, Pavel Benáček, Lukáš Kekely, Michal Kekely, Viktor Puš, and Jan Kořenek. 2018. Configurable FPGA packet parser for terabit networks with guaranteed wire-speed throughput. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 249–258.

Digital Library

[24]

Cadence. 2021. Stratus High-Level Synthesis. Retrieved from https://www.cadence.com/ko_KR/home/tools/digital-design-and-signoff/synthesis/stratus-high-level-synthesis.html.

[25]

Andrew Canis, Jongsok Choi, Mark Aldham, Victor Zhang, Ahmed Kammoona, Jason H. Anderson, Stephen Brown, and Tomasz Czajkowski. 2011. LegUp: High-level synthesis for FPGA-based processor/accelerator systems. In Proceedings of the 19th ACM/SIGDA International Symposium on Field Programmable Gate Arrays. ACM, Association for Computing Machinery, New York, NY, 33–36.

Digital Library

[26]

Luca Cardelli and Peter Wegner. 1985. On understanding types, data abstraction, and polymorphism. ACM Comput. Surveys 17, 4 (1985), 471–523.

Digital Library

[27]

Joao M. P. Cardoso, Pedro C. Diniz, and Markus Weinhardt. 2010. Compiling for reconfigurable computing: A survey. ACM Comput. Surveys 42, 4 (2010), 1–65.

Digital Library

[28]

Riccardo Cattaneo, Giuseppe Natale, Carlo Sicignano, Donatella Sciuto, and Marco Domenico Santambrogio. 2015. On how to accelerate iterative stencil loops: A scalable streaming-based approach. ACM Trans. Architect. Code Optimiz. 12, 4 (2015), 1–26.

Digital Library

[29]

Raghunandan Chaware, Kumar Nagarajan, and Suresh Ramalingam. 2012. Assembly and reliability challenges in 3D integration of 28nm FPGA die on a large high density 65nm passive interposer. In Proceedings of the IEEE 62nd Electronic Components and Technology Conference. IEEE, 279–283.

[30]

Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, et al. 2018. TVM: An automated end-to-end optimizing compiler for deep learning. In Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI’18). 578–594.

[31]

Stefano Cherubin and Giovanni Agosta. 2020. Tools for reduced precision computation: A survey. ACM Comput. Surveys 53, 2 (2020), 1–35.

Digital Library

[32]

Yuze Chi, Jason Cong, Peng Wei, and Peipei Zhou. 2018. SODA: Stencil with optimized dataflow architecture. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’18). IEEE, 1–8.

Digital Library

[33]

Nitin Chugh, Vinay Vasista, Suresh Purini, and Uday Bondhugula. 2016. A DSL compiler for accelerating image processing pipelines on FPGAs. In Proceedings of the International Conference on Parallel Architectures and Compilation. 327–338.

Digital Library

[34]

Michael D. Ciletti. 2003. Advanced Digital Design with the Verilog HDL. Vol. 1. Prentice Hall, Upper Saddle River, NJ.

[35]

John Clow, Georgios Tzimpragos, Deeksha Dangwal, Sammy Guo, Joseph McMahan, and Timothy Sherwood. 2017. A pythonic approach for rapid hardware prototyping and instrumentation. In Proceedings of the 27th International Conference on Field Programmable Logic and Applications (FPL’17). IEEE, 1–7.

[36]

Alessandro Comodi, Davide Conficconi, Alberto Scolari, and Marco D. Santambrogio. 2018. TiReX: Tiled regular expression matching architecture. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW’18). IEEE, 131–137.

[37]

Katherine Compton and Scott Hauck. 2002. Reconfigurable computing: A survey of systems and software. ACM Comput. Surveys 34, 2 (2002), 171–210.

Digital Library

[38]

Davide Conficconi, Eleonora D’Arnese, Emanuele Del Sozzo, Donatella Sciuto, and Marco D. Santambrogio. 2021. A framework for customizable FPGA-based image registration accelerators. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, New York, NY, 251–261.

Digital Library

[39]

Davide Conficconi, Emanuele Del Sozzo, Filippo Carloni, Alessandro Comodi, Alberto Scolari, and Marco Domenico Santambrogio. 2022. An energy-efficient domain-specific architecture for regular expressions. IEEE Trans. Emerg. Topics Comput. (2022), 1–5.

[40]

Jason Cong, Bin Liu, Stephen Neuendorffer, Juanjo Noguera, Kees Vissers, and Zhiru Zhang. 2011. High-level synthesis for FPGAs: From prototyping to deployment. IEEE Trans. Comput.-Aided Design Integr. Circ. Syst. 30, 4 (2011), 473–491.

Digital Library

[41]

Jason Cong, Vivek Sarkar, Glenn Reinman, and Alex Bui. 2010. Customizable domain-specific computing. IEEE Design Test Comput. 28, 2 (2010), 6–15.

Digital Library

[42]

Jason Cong and Jie Wang. 2018. PolySA: Polyhedral-based systolic array auto-compilation. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’18). IEEE, 1–8.

Digital Library

[43]

Achronix Semiconductor Corporation. 2020. Speedster7t Network on Chip User Guide (UG089). Retrieved from https://tinyurl.com/achronixnoc.

[44]

Philippe Coussy, Cyrille Chavet, Pierre Bomel, Dominique Heller, Eric Senn, and Eric Martin. 2008. GAUT: A high-level synthesis tool for DSP applications. In High-Level Synthesis. Springer, 147–169.

[45]

Philippe Coussy, Daniel D. Gajski, Michael Meredith, and Andres Takach. 2009. An introduction to high-level synthesis. IEEE Design Test Comput. 26, 4 (2009), 8–17.

Digital Library

[46]

Chris Cummins, Pavlos Petoumenos, Alastair Murray, and Hugh Leather. 2018. Compiler fuzzing through deep learning. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 95–105.

Digital Library

[47]

Tomasz S. Czajkowski, Utku Aydonat, Dmitry Denisenko, John Freeman, Michael Kinsner, David Neto, Jason Wong, Peter Yiannacouras, and Deshanand P. Singh. 2012. From OpenCL to high-performance hardware on FPGAs. In Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL’12). IEEE, 531–534.

[48]

Andrea Damiani, Emanuele Del Sozzo, and Marco D. Santambrogio. 2022. Large forests and where to “partially” fit them. In Proceedings of the 27th Asia and South Pacific Design Automation Conference (ASP-DAC’22). IEEE, 550–555.

Digital Library

[49]

Giovanni De Micheli. 1994. Synthesis and Optimization of Digital Circuits. Number BOOK. McGraw Hill.

Digital Library

[50]

Jan Decaluwe. 2004. MyHDL: A python-based hardware description language. Linux J.127 (2004), 84–87.

[51]

André DeHon. 2000. The density advantage of configurable computing. Computer 33, 4 (2000), 41–49.

Digital Library

[52]

Emanuele Del Sozzo, Riyadh Baghdadi, Saman Amarasinghe, and Marco D. Santambrogio. 2017. A common backend for hardware acceleration on FPGA. In Proceedings of the IEEE International Conference on Computer Design (ICCD’17). IEEE, 427–430.

[53]

Emanuele Del Sozzo, Riyadh Baghdadi, Saman Amarasinghe, and Marco D. Santambrogio. 2018. A unified backend for targeting FPGAs from DSLs. In Proceedings of the IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP’18).

[54]

Emanuele Del Sozzo, Marco Rabozzi, Lorenzo Di Tucci, Donatella Sciuto, and Marco D. Santambrogio. 2018. A scalable FPGA design for cloud n-body simulation. In Proceedings of the IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP’18).

[55]

Zachary DeVito, James Hegarty, Alex Aiken, Pat Hanrahan, and Jan Vitek. 2013. Terra: A multi-stage language for high-performance computing. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’13). ACM, New York, NY, 105–116.

Digital Library

[56]

Yang Ding and Weng Fai Wong. 2005. Bit-width analysis for general applications. http://hdl.handle.net/1721.1/7412.

[57]

Douglas do Couto Teixeira and Fernando Magno Quintao Pereira. 2011. The design and implementation of a non-iterative range analysis algorithm on a production compiler. In Proceedings of the 2011 Brazilian Symposium on Programming Languages (SBLP’11). 45–59.

[58]

ECE Department, University of Toronto. 2021. LegUp High-Level Synthesis. Retrieved from https://github.com/wincle626/HLS_Legup.

[59]

Martin Fowler. 2010. Domain-specific Languages. Pearson Education.

Digital Library

[60]

Franz Franchetti, Tze Meng Low, Doru Thom Popovici, Richard M. Veras, Daniele G. Spampinato, Jeremy R. Johnson, Markus Püschel, James C. Hoe, and José M. F. Moura. 2018. SPIRAL: Extreme performance portability. Proc. IEEE 106, 11 (2018), 1935–1968.

[61]

Yoshihiko Futamura. 1983. Partial computation of programs. In Proceedings of the RIMS Symposia on Software Science and Engineering. Springer, 1–35.

Digital Library

[62]

Brian Gaide, Dinesh Gaitonde, Chirag Ravishankar, and Trevor Bauer. 2019. Xilinx adaptive compute acceleration platform: VersalTM architecture. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 84–93.

Digital Library

[63]

Nithin George, HyoukJoong Lee, David Novo, Tiark Rompf, Kevin J. Brown, Arvind K. Sujeeth, Martin Odersky, Kunle Olukotun, and Paolo Ienne. 2014. Hardware system synthesis from domain-specific languages. In Proceedings of the 24th International Conference on Field Programmable Logic and Applications (FPL’14). IEEE, 1–8.

[64]

Maya Gokhale and Lesley Shannon. 2021. FPGA computing. IEEE Micro 41, 4 (2021), 6–7.

[65]

Google. 2021. XLS: Accelerated HW Synthesis. Retrieved from https://google.github.io/xls/.

[66]

Zhi Guo, Walid Najjar, Frank Vahid, and Kees Vissers. 2004. A quantitative analysis of the speedup factors of FPGAs over processors. In Proceedings of the ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays. 162–170.

Digital Library

[67]

Robert Harper, David MacQueen, and Robin Milner. 1986. Standard ML. Department of Computer Science, University of Edinburgh.

[68]

James Hegarty, John Brunhaver, Zachary DeVito, Jonathan Ragan-Kelley, Noy Cohen, Steven Bell, Artem Vasilyev, Mark Horowitz, and Pat Hanrahan. 2014. Darkroom: Compiling high-level image processing code into hardware pipelines. ACM Trans. Graph. 33, 4 (2014), 144–1.

Digital Library

[69]

James Hegarty, Ross Daly, Zachary DeVito, Jonathan Ragan-Kelley, Mark Horowitz, and Pat Hanrahan. 2016. Rigel: Flexible multi-rate image processing hardware. ACM Trans. Graph. 35, 4, Article 85 (July2016), 11 pages.

Digital Library

[70]

John L. Hennessy and David A. Patterson. 2019. A new golden age for computer architecture. Commun. ACM 62, 2 (2019), 48–60.

Digital Library

[71]

Steven F. Hoover. 2017. Timing-abstract circuit design in transaction-level Verilog. In Proceedings of the IEEE International Conference on Computer Design (ICCD’17). IEEE, 525–532.

[72]

Lan Huang, Da-Lin Li, Kang-Ping Wang, Teng Gao, and Adriano Tavares. 2020. A survey on performance optimization of high-level synthesis tools. J. Comput. Sci. Technol. 35 (2020), 697–720.

Digital Library

[73]

Maxeler Inc.2021. Multiscale Dataflow Programming. Retrieved from https://www.maxeler.com/products/software/maxcompiler/.

[74]

Intel. 2020. Intel Quartus Documentation. Retrieved from https://www.intel.com/content/www/us/en/programmable/products/design-software/fpga-design/quartus-prime/user-guides.html.

[75]

Intel. 2021. Intel FPGA SDK for OpenCL. Retrieved from https://www.intel.com/content/www/us/en/software/programmable/sdk-for-opencl/overview.html.

[76]

Intel. 2021. Intel HLS Compiler. Retrieved from https://www.intel.it/content/www/it/it/software/programmable/quartus-prime/hls-compiler.html.

[77]

Intel Inc.2021. Intel oneAPI. Retrieved from https://software.intel.com/content/www/us/en/develop/tools/oneapi.html.

[78]

Adam Izraelevitz, Jack Koenig, Patrick Li, Richard Lin, Angie Wang, Albert Magyar, Donggyu Kim, Colin Schmidt, Chick Markley, Jim Lawson, and Jonathan Bachrach. 2017. Reusability is FIRRTL ground: Hardware construction languages, compiler frameworks, and transformations. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’17). 209–216.

Digital Library

[79]

Ricardo Jasinski. 2016. Effective Coding with VHDL: Principles and Best Practice. MIT Press.

Digital Library

[80]

Shunning Jiang, Berkin Ilbeyi, and Christopher Batten. 2018. Mamba: Closing the performance gap in productive hardware development frameworks. In Proceedings of the 55th ACM/ESDA/IEEE Design Automation Conference (DAC’18). IEEE, 1–6.

Digital Library

[81]

Lana Josipović, Andrea Guerrieri, and Paolo Ienne. 2019. DYNAMATIC—Dynamically Scheduled High-Level Synthesis. Retrieved from https://github.com/lana555/dynamatic.

[82]

Lana Josipović, Andrea Guerrieri, and Paolo Ienne. 2019. DYNAMATIC—From C/C++ to Dynamically-Scheduled Circuits. Retrieved from https://dynamatic.epfl.ch.

[83]

Lana Josipović, Andrea Guerrieri, and Paolo Ienne. 2020. Invited tutorial: Dynamatic: From C/C++ to dynamically scheduled circuits. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 1–10.

Digital Library

[84]

Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, et al. 2017. In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture. 1–12.

Digital Library

[85]

Richard M. Karp, Raymond E. Miller, and Shmuel Winograd. 1967. The organization of computations for uniform recurrence equations. J. ACM 14, 3 (1967), 563–590.

Digital Library

[86]

David Koeplinger, Matthew Feldman, Raghu Prabhakar, Yaqi Zhang, Stefan Hadjis, Ruben Fiszel, Tian Zhao, Luigi Nardi, Ardavan Pedram, Christos Kozyrakis, et al. 2018. Spatial: A language and compiler for application accelerators. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. 296–311.

Digital Library

[87]

D. Koeplinger, R. Prabhakar, Y. Zhang, C. Delimitrou, C. Kozyrakis, and K. Olukotun. 2016. Automatic generation of efficient accelerators for reconfigurable hardware. In Proceedings of the ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA’16). 115–127.

Digital Library

[88]

Martin Kristien, Bruno Bodin, Michel Steuwer, and Christophe Dubach. 2019. High-level synthesis of functional patterns with lift. In Proceedings of the 6th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming. 35–45.

Digital Library

[89]

Hsiang Tsung Kung and Charles E. Leiserson. 1978. Systolic Arrays for (VLSI).Technical Report. Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA.

[90]

S. Kung. 1985. VLSI array processors. IEEE ASSP Mag. 2, 3 (1985), 4–22.

[91]

Ian Kuon and Jonathan Rose. 2007. Measuring the gap between FPGAs and ASICs. IEEE Trans. Comput.-Aided Design Integr. Circ. Syst. 26, 2 (2007), 203–215.

Digital Library

[92]

Ian Kuon, Russell Tessier, and Jonathan Rose. 2008. FPGA Architecture: Survey and Challenges. Now Publishers.

[93]

Andreas Kurth, Pirmin Vogel, Alessandro Capotondi, Andrea Marongiu, and Luca Benini. 2017. HERO: Heterogeneous embedded research platform for exploring RISC-V manycore accelerators on FPGA. Retrieved from https://arXiv:1712.06497.

[94]

Sakari Lahti, Panu Sjövall, Jarno Vanne, and Timo D. Hämäläinen. 2018. Are we there yet? A study on the state of high-level synthesis. IEEE Trans. Comput.-Aided Design Integr. Circ. Syst. 38, 5 (2018).

[95]

Yi-Hsiang Lai, Yuze Chi, Yuwei Hu, Jie Wang, Cody Hao Yu, Yuan Zhou, Jason Cong, and Zhiru Zhang. 2019. HeteroCL: A multi-paradigm programming infrastructure for software-defined reconfigurable computing. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 242–251.

Digital Library

[96]

Yi-Hsiang Lai, Hongbo Rong, Size Zheng, Weihao Zhang, Xiuping Cui, Yunshan Jia, Jie Wang, Brendan Sullivan, Zhiru Zhang, Yun Liang, et al. 2020. SuSy: A programming model for productive construction of high-performance systolic arrays on FPGAs. In Proceedings of the IEEE/ACM International Conference On Computer Aided Design (ICCAD’20). IEEE, 1–9.

Digital Library

[97]

William Landi. 1992. Undecidability of static analysis. ACM Lett. Program. Lang. Syst. 1, 4 (1992), 323–337.

Digital Library

[98]

Halide Language. 2013. Documentation. Retrieved from https://halide-lang.org/docs/class_halide_1_1_func.html.

[99]

Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the International Symposium on Code Generation and Optimization (CGO’04). IEEE, 75–86.

[100]

Dominique Lavenier, Patrice Quinton, and Sanjay Rajopadhye. 1999. Advanced systolic design. Dig. Signal Process. Multimedia Syst. (1999), 657–692.

[101]

C. Lavin and A. Kaviani. 2018. RapidWright: Enabling custom crafted implementations for FPGAs. In Proceedings of the IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM’18). 133–140.

[102]

Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature 521, 7553 (28 052015), 436–444.

[103]

HyoukJoong Lee, Kevin Brown, Arvind Sujeeth, Hassan Chafi, Tiark Rompf, Martin Odersky, and Kunle Olukotun. 2011. Implementing domain-specific languages for heterogeneous parallel computing. IEEE Micro 31, 5 (2011), 42–53.

Digital Library

[104]

LegUp Computing. 2019. High-Level Synthesis For Any FPGA. Retrieved from https://www.legupcomputing.com/.

[105]

Roland Leißa, Klaas Boesche, Sebastian Hack, Richard Membarth, and Philipp Slusallek. 2015. Shallow embedding of DSLs via online partial evaluation. In Proceedings of the ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences. ACM, New York, NY, 11–20.

Digital Library

[106]

Roland Leißa, Klaas Boesche, Sebastian Hack, Richard Membarth, and Philipp Slusallek. 2015. Shallow embedding of DSLs via online partial evaluation. In Proceedings of the ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences. ACM, New York, NY, 11–20.

Digital Library

[107]

Roland Leißa, Klaas Boesche, Sebastian Hack, Arsène Pérard-Gayot, Richard Membarth, Philipp Slusallek, André Müller, and Bertil Schmidt. 2018. AnyDSL: A partial evaluation framework for programming high-performance libraries. Proc. ACM Program. Lang. 2, OOPSLA (2018), 1–30.

Digital Library

[108]

Ang Li and David Wentzlaff. 2019. PRGA: An open-source framework for building and using custom FPGAs. In Proceedings of the 1st Workshop on Open-Source Design Automation. 1–6.

[109]

Jiajie Li, Yuze Chi, and Jason Cong. 2020. HeteroHalide: From image processing DSL to efficient FPGA acceleration. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 51–57.

Digital Library

[110]

Yanbing Li and Miriam Leeser. 2000. HML, a novel hardware description language and its translation to VHDL. IEEE Trans. Very Large Scale Integr. Syst. 8, 1 (2000), 1–8.

Digital Library

[111]

Olav Lindtjorn, Robert G. Clapp, Oliver Pell, Oskar Mencer, and Michael J. Flynn. 2010. Surviving the end of scaling of traditional microprocessors in HPC. IEEE Hot Chips 22 (2010), 22–24.

[112]

Leibo Liu, Jianfeng Zhu, Zhaoshi Li, Yanan Lu, Yangdong Deng, Jie Han, Shouyi Yin, and Shaojun Wei. 2019. A survey of coarse-grained reconfigurable architecture and design: Taxonomy, challenges, and applications. ACM Comput. Surveys 52, 6 (2019), 1–39.

Digital Library

[113]

Yanqiang Liu, Yao Li, Zhengwei Qi, and Haibing Guan. 2019. A scala based framework for developing acceleration systems with FPGAs. J. Syst. Architect. 98 (2019), 231–242.

Digital Library

[114]

Derek Lockhart, Stephen Twigg, Ravi Narayanaswami, Jeremy Coriell, Uday Dasari, Richard Ho, Doug Hogberg, George Huang, Anand Kane, Chintan Kaur, Tao Liu, Adriana Maggiore, Kevin Townsend, and Emre Tuncer. 2018. Experiences Building Edge TPU with Chisel. Retrieved from https://www.youtube.com/watch?v=x85342Cny8c.

[115]

Derek Lockhart, Gary Zibrat, and Christopher Batten. 2014. PyMTL: A unified framework for vertically integrated computer architecture research. In Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE.

Digital Library

[116]

Lombiq Technologies. 2019. Hastlayer SDK—GitHub. Retrieved from https://github.com/Lombiq/Hastlayer-SDK.

[117]

Kenneth C. Louden and Kenneth A. Lambert. 2011. Programming Languages: Principles and Practices. Cengage Learning.

[118]

Scott A. Mahlke, Richard E. Hank, Roger A. Bringmann, John C. Gyllenhaal, David M. Gallagher, and Wen-mei W. Hwu. 1994. Characterizing the impact of predicated execution on branch prediction. In Proceedings of the 27th Annual International Symposium on Microarchitecture. 217–227.

Digital Library

[119]

Steven Margerm, Amirali Sharifian, Apala Guha, Arrvindh Shriraman, and Gilles Pokam. 2018. TAPAS: Generating parallel accelerators from parallel programs. In Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’18). IEEE, 245–257.

Digital Library

[120]

Grant Martin and Gary Smith. 2009. High-level synthesis: Past, present, and future. IEEE Design Test Comput. 26, 4 (2009), 18–25.

Digital Library

[121]

Ali Mashtizadeh. 2007. PHDL: A Python Hardware Design Framework. Retrieved from https://dspace.mit.edu/handle/1721.1/41543.

[122]

MathWorks Inc.2021. HDL Coder. Retrieved from https://www.mathworks.com/help/pdf_doc/hdlcoder/hdlcoder_ug.pdf.

[123]

Clive Maxfield. 2004. The Design Warrior’s Guide to FPGAs: Devices, Tools and Flows. Elsevier.

Digital Library

[124]

Wim Meeus, Kristof Van Beeck, Toon Goedemé, Jan Meel, and Dirk Stroobandt. 2012. An overview of today’s high-level synthesis tools. Design Autom. Embed. Syst. 16, 3 (2012), 31–51.

Digital Library

[125]

Richard Membarth, Frank Hannig, Jürgen Teich, Mario Körner, and Wieland Eckert. 2012. Generating device-specific GPU code for local operators in medical imaging. In Proceedings of the IEEE 26th International Parallel and Distributed Processing Symposium. IEEE, 569–581.

Digital Library

[126]

Richard Membarth, Oliver Reiche, Frank Hannig, Jürgen Teich, Mario Körner, and Wieland Eckert. 2015. Hipa^cc: A domain-specific language and compiler for image processing. IEEE Trans. Parallel Distrib. Syst. 27, 1 (2015), 210–224.

Digital Library

[127]

Richard Membarth, Oliver Reiche, Özkan Mehmet Akif, and Bo Qiao. 2013. Repo HIP^acc. Retrieved from https://github.com/hipacc/hipacc.

[128]

Mentor Graphics. 2021. Catapult HLS. Retrieved from https://www.mentor.com/hls-lp/catapult-high-level-synthesis/.

[129]

Mentor Graphics. 2021. Design Creation. Retrieved from https://www.mentor.com/products/fpga/hdl_design/.

[130]

Peter Milder, Franz Franchetti, James C. Hoe, and Markus Püschel. 2012. Computer generation of hardware for linear digital signal processing transforms. ACM Trans. Design Autom. Electr. Syst. 17, 2 (2012), 1–33.

Digital Library

[131]

Ravi Teja Mullapudi, Vinay Vasista, and Uday Bondhugula. 2015. PolyMage: Automatic optimization for image processing pipelines. ACM SIGARCH Comput. Architect. News 43, 1 (2015), 429–443.

Digital Library

[132]

Kevin E. Murray, Mohamed A. Elgammal, Vaughn Betz, Tim Ansell, Keith Rothman, and Alessandro Comodi. 2020. SymbiFlow and VPR: An open-source design flow for commercial and novel FPGAs. IEEE Micro 40, 4 (2020), 49–57.

Digital Library

[133]

Razvan Nane, Vlad-Mihai Sima, Bryan Olivier, Roel Meeuws, Yana Yankova, and Koen Bertels. 2012. DWARV 2.0: A CoSy-based C-to-VHDL hardware compiler. In Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL’12). IEEE, 619–622.

[134]

Razvan Nane, Vlad-Mihai Sima, Christian Pilato, Jongsok Choi, Blair Fort, Andrew Canis, Yu Ting Chen, Hsuan Hsiao, Stephen Brown, Fabrizio Ferrandi, et al. 2016. A survey and evaluation of FPGA high-level synthesis tools. IEEE Trans. Comput.-Aided Design Integr. Circ. Syst. 35, 10 (2016), 1591–1604.

Digital Library

[135]

NEC. 2011. CyberWorkBench: High Level Synthesis from C/C++/SystemC to ASIC/FPGA. Retrieved from https://www.nec.com/en/global/prod/cwb/index.html.

[136]

Rachit Nigam, Samuel Thomas, Zhijing Li, and Adrian Sampson. 2021. A compiler infrastructure for accelerator generators. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. 804–817.

Digital Library

[137]

Rishiyur Nikhil. [n.d.]. Bluespec system verilog: Efficient, correct RTL from high level specifications. In Proceedings of the 2nd ACM and IEEE International Conference on Formal Methods and Models for Co-Design (MEMOCODE’04).

[138]

Tony Nowatzki, Vinay Gangadhar, Newsha Ardalani, and Karthikeyan Sankaralingam. 2017. Stream-dataflow acceleration. In Proceedings of the ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA’17). IEEE, 416–429.

Digital Library

[139]

Martin Odersky, Philippe Altherr, Vincent Cremet, Burak Emir, Stphane Micheloud, Nikolay Mihaylov, Michel Schinz, Erik Stenman, and Matthias Zenger. 2004. The Scala language specification. Citeseer. https://www.scala-lang.org/files/archive/spec/2.11/.

[140]

M. Akif Özkan, Arsène Pérard-Gayot, Richard Membarth, Philipp Slusallek, Roland Leißa, Sebastian Hack, Jürgen Teich, and Frank Hannig. 2020. AnyHLS: High-level synthesis with partial evaluation. IEEE Trans. Comput.-Aided Design Integr. Circ. Syst. 39, 11 (2020), 3202–3214.

[141]

M. Akif Özkan, Arsène Pérard-Gayot, Richard Membarth, Philipp Slusallek, Roland Leißa, Sebastian Hack, Jürgen Teich, and Frank Hannig. 2020. AnyHLS. Retrieved from https://github.com/AnyDSL/anyhls.

[142]

Samir Palnitkar. 2003. Verilog HDL: A Guide to Digital Design and Synthesis. Vol. 1. Prentice Hall Professional.

[143]

Preeti Ranjan Panda, B. V. N. Silpa, Aviral Shrivastava, and Krishnaiah Gummidipudi. 2010. Power-efficient System Design. Springer Science & Business Media.

[144]

Panda Team. 2021. Bambu: A Free Framework for the High-Level Synthesis of Complex Applications. Retrieved from https://panda.dei.polimi.it/?page_id=31.

[145]

C. Papon. 2017. SpinalHDL: An alternative hardware description language. In Proceedings of the Free and Open Source Software Developers’ European Meeting (FOSDEM’17).

[146]

Daniele Parravicini, Davide Conficconi, Emanuele Del Sozzo, Christian Pilato, and Marco D. Santambrogio. 2021. CICERO: A domain-specific architecture for efficient regular expression matching. ACM Trans. Embed. Comput. Syst. 20, 5s (2021), 1–24.

[147]

Jason R. C. Patterson. 1995. Accurate static branch prediction by value range propagation. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. 67–78.

[148]

Maxime Pelcat, Cédric Bourrasset, Luca Maggiani, and François Berry. 2016. Design productivity of a high level synthesis compiler versus HDL. In Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS’16). IEEE, 140–147.

[149]

Christian Pilato and Fabrizio Ferrandi. 2013. Bambu: A modular framework for the high level synthesis of memory-intensive applications. In Proceedings of the 23rd International Conference on Field Programmable Logic and Applications (FPL’13). IEEE.

[150]

Oron Port and Yoav Etsion. 2017. DFiant: A dataflow hardware description language. In Proceedings of the 27th International Conference on Field Programmable Logic and Applications (FPL’17). IEEE, 1–4.

[151]

Raghu Prabhakar, David Koeplinger, Kevin J. Brown, HyoukJoong Lee, Christopher De Sa, Christos Kozyrakis, and Kunle Olukotun. 2016. Generating configurable hardware from parallel patterns. ACM Sigplan Notices 51, 4 (2016).

[152]

Raghu Prabhakar, Yaqi Zhang, David Koeplinger, Matt Feldman, Tian Zhao, Stefan Hadjis, Ardavan Pedram, Christos Kozyrakis, and Kunle Olukotun. 2017. Plasticine: A reconfigurable architecture for parallel patterns. In Proceedings of the ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA’17). IEEE, 389–402.

[153]

Jing Pu, Steven Bell, Xuan Yang, Jeff Setter, Stephen Richardson, Jonathan Ragan-Kelley, and Mark Horowitz. 2017. Programming heterogeneous systems from an image processing DSL. ACM Trans. Archit. Code Optim. 14, 3, Article 26 (Aug. 2017), 25 pages.

[154]

David Pursley and Tung-Hua Yeh. 2017. High-level low-power system design optimization. In Proceedings of the International Symposium on VLSI Design, Automation and Test (VLSI-DAT’17). IEEE, 1–4.

[155]

Andrew Putnam, Adrian M. Caulfield, Eric S. Chung, Derek Chiou, Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth Gopal, Jan Gray, et al. 2014. A reconfigurable fabric for accelerating large-scale datacenter services. In Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture (ISCA’14). IEEE, 13–24.

[156]

QuickLogic. 2021. QuickLogic Open Reconfigurable Computing (QORC) MCU + eFPGA SoC Open Source Software Tools. Retrieved from https://www.quicklogic.com/software/qorc-mcu-efpga-fpga-open-source-tools/.

[157]

M. Rabozzi, G. Natale, E. Del Sozzo, A. Scolari, L. Stornaiuolo, and M. D. Santambrogio. 2017. Heterogeneous exascale supercomputing: The role of CAD in the exaFPGA project. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE’17). 410–415.

[158]

Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. 2013. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. ACM SIGPLAN Notices 48, 6 (2013), 519–530.

[159]

Parthasarathy Ranganathan, Daniel Stodolsky, Jeff Calow, Jeremy Dorfman, Marisabel Guevara, Clinton Wills Smullen IV, Aki Kuusela, Raghu Balasubramanian, Sandeep Bhatia, Prakash Chauhan, et al. 2021. Warehouse-scale video acceleration: Co-design and deployment in the wild. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. 600–615.

[160]

Enrico Reggiani, Emanuele Del Sozzo, Davide Conficconi, Giuseppe Natale, Carlo Moroni, and Marco D. Santambrogio. 2021. Enhancing the scalability of multi-FPGA stencil computations via highly optimized HDL components. ACM Trans. Reconfig. Technol. Syst. 14, 3 (2021), 1–33.

[161]

Enrico Reggiani, Eleonora D’Arnese, Andrea Purgato, and Marco D. Santambrogio. 2017. Pearson correlation coefficient acceleration for modeling and mapping of neural interconnections. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW’17). IEEE, 223–228.

[162]

Oliver Reiche, M. Akif Özkan, Richard Membarth, Jürgen Teich, and Frank Hannig. 2017. Generating FPGA-based image processing accelerators with Hipacc. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’17). IEEE, 1026–1033.

[163]

Oliver Reiche, Moritz Schmid, Frank Hannig, Richard Membarth, and Jürgen Teich. 2014. Code generation from a domain-specific language for C-based HLS of hardware accelerators. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ ISSS’14). IEEE, 1–10.

[164]

David I. Rich. 2003. The evolution of SystemVerilog. IEEE Ann. History Comput. 20, 4 (2003), 82–84.

[165]

Kentaro Sano, Hayato Suzuki, Ryo Ito, Tomohiro Ueno, and Satoru Yamamoto. 2014. Stream processor generator for HPC to embedded applications on FPGA-based system platform. Retrieved from arXiv:1408.5386.

[166]

Jeferson Santiago da Silva, François-Raymond Boyer, and J. M. Pierre Langlois. 2018. P4-compatible high-level synthesis of low latency 100 Gb/s streaming packet parsers in FPGAs. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 147–152.

[167]

Jeferson Santiago da Silva, François-Raymond Boyer, and J. M. Pierre Langlois. 2018. Repo P4HLS. Retrieved from https://github.com/engjefersonsantiago/P4HLS.

[168]

Simpei Sato and Kenji Kise. 2013. ArchHDL: A new hardware description language for high-speed architectural evaluation. In Proceedings of the IEEE 7th International Symposium on Embedded Multicore Systems-on-Chip (MCSoC’13). IEEE, 107–112.

[169]

Tao B. Schardl, William S. Moses, and Charles E. Leiserson. 2017. Tapir: Embedding fork-join parallelism into LLVM’s intermediate representation. In Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 249–265.

[170]

Christian Schmitt, Sebastian Kuckuk, Frank Hannig, Harald Köstler, and Jürgen Teich. 2014. ExaSlang: A domain-specific language for highly scalable multigrid solvers. In Proceedings of the 4th International Workshop on Domain-specific Languages and High-level Frameworks for High Performance Computing. IEEE, 42–51.

[171]

Christian Schmitt, Moritz Schmid, Frank Hannig, Jürgen Teich, Sebastian Kuckuk, and Harald Köstler. 2015. Generation of multigrid-based numerical solvers for FPGA accelerators. In Proceedings of the 2nd International Workshop on High-Performance Stencil Computations (HiStencils’15). 9–15.

[172]

Sanjit Seshia, Albert Magyar, David Biancolin, John Koenig, Jonathan Bachrach, and Krste Asanovic. 2021. Golden Gate: Bridging the resource-efficiency gap between ASICs and FPGA prototypes. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’19). 1–8.

[173]

Ofer Shacham, Sameh Galal, Sabarish Sankaranarayanan, Megan Wachs, John Brunhaver, Artem Vassiliev, Mark Horowitz, Andrew Danowitz, Wajahat Qadeer, and Stephen Richardson. 2012. Avoiding game over: Bringing design to the next level. In Proceedings of the DAC Design Automation Conference. IEEE, 623–629.

[174]

David Shah, Eddie Hung, Clifford Wolf, Serge Bazanski, Dan Gisselquist, and Miodrag Milanovic. 2019. Yosys+ nextpnr: An open source framework from verilog to bitstream for commercial FPGAs. In Proceedings of the IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM’19). IEEE, 1–4.

[175]

Amirali Sharifian, Reza Hojabr, Navid Rahimi, Sihao Liu, Apala Guha, Tony Nowatzki, and Arrvindh Shriraman. 2019. μIR: An intermediate representation for transforming and optimizing the microarchitecture of application accelerators. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture. 940–953.

[176]

Sergey Shumarayev. 2017. Intel’s 14nm heterogeneous FPGA system-in-package platform. In Proceedings of the Hot Chips 29 Symposium.

[177]

Silexica. 2021. SLX FPGA. Retrieved from https://www.silexica.com/products/slx-fpga/.

[178]

Satnam Singh and David J. Greaves. 2008. Kiwi: Synthesis of FPGA circuits from parallel programs. In Proceedings of the 16th International Symposium on Field-Programmable Custom Computing Machines. IEEE, 3–12.

[179]

Marco Siracusa, Marco Rabozzi, Emanuele Del Sozzo, Lorenzo Di Tucci, Samuel Williams, and Marco D. Santambrogio. 2020. A CAD-based methodology to optimize HLS code via the Roofline model. In Proceedings of the IEEE/ACM International Conference On Computer Aided Design (ICCAD’20). 1–9.

[180]

Wilson Snyder. 2004. Verilator and SystemPerl. In Proceedings of the North American SystemC Users’ Group, Design Automation Conference.

[181]

Mark Stephenson, Jonathan Babb, and Saman Amarasinghe. 2000. Bitwidth analysis with application to silicon compilation. ACM SIGPLAN Notices 35, 5 (2000), 108–120.

[182]

Michel Steuwer, Christian Fensch, Sam Lindley, and Christophe Dubach. 2015. Generating performance portable code using rewrite rules: From high-level functional expressions to high-performance OpenCL code. ACM SIGPLAN Notices 50, 9 (2015), 205–217.

[183]

Robert Stewart, Kirsty Duncan, Greg Michaelson, Paulo Garcia, Deepayan Bhowmik, and Andrew M. Wallace. 2018. RIPL: A parallel image processing language for FPGAs. ACM Trans. Reconfig. Technol. Syst. 11, 1 (2018), 7:1–7:24.

[184]

Robert Stewart, Greg Michaelson, Deepayan Bhowmik, Paulo Garcia, and Andy Wallace. 2016. RIPL. Retrieved from https://github.com/robstewart57/ripl.

[185]

Leon Stok. 1994. Data path synthesis. Integration 18, 1 (1994), 1–71.

[186]

Arvind K. Sujeeth, Kevin J. Brown, Hyoukjoong Lee, Tiark Rompf, Hassan Chafi, Martin Odersky, and Kunle Olukotun. 2014. Delite: A compiler architecture for performance-oriented embedded domain-specific languages. ACM Trans. Embed. Comput. Syst. 13, 4s (2014), 1–25.

[187]

Arvind K. Sujeeth, HyoukJoong Lee, Kevin J. Brown, Tiark Rompf, Hassan Chafi, Michael Wu, Anand R. Atreya, Martin Odersky, and Kunle Olukotun. 2011. OptiML: An implicitly parallel domain-specific language for machine learning. In Proceedings of the International Conference on Machine Learning (ICML’11). 609–616.

[188]

Ian Swarbrick, Dinesh Gaitonde, Sagheer Ahmad, Brian Gaide, and Ygal Arbel. 2019. Network-on-chip programmable platform in Versal ACAP architecture. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 212–221.

[189]

Synopsys. 2021. RTL Synthesis. Retrieved from https://www.synopsys.com/implementation-and-signoff/rtl-synthesis-test.html.

[190]

Shinya Takamaeda-Yamazaki. 2015. PyVerilog: A Python-based hardware design processing toolkit for Verilog HDL. In Applied Reconfigurable Computing. Springer, 451–460.

[191]

Xifan Tang, Edouard Giacomin, Baudouin Chauviere, Aurelien Alacchi, and Pierre-Emmanuel Gaillardon. 2020. OpenFPGA: An open-source framework for agile prototyping customizable FPGAs. IEEE Micro 40, 4 (2020), 41–48.

[192]

Russell Tessier, Kenneth Pocek, and Andre DeHon. 2015. Reconfigurable computing architectures. Proc. IEEE 103, 3 (2015), 332–354.

[193]

The Stanford Pervasive Parallelism Lab. 2017. Spatial: Specify Parameterized Accelerators Through Inordinately Abstract Language. Retrieved from https://github.com/stanford-ppl/spatial.

[194]

Lenny Truong and Pat Hanrahan. 2019. A golden age of hardware description languages: Applying programming language techniques to improve design productivity. In Proceedings of the 3rd Summit on Advances in Programming Languages (SNAPL’19). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.

[195]

UCLA VAST Lab. 2020. HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration. Retrieved from https://github.com/UCLA-VAST/heterohalide.

[196]

Yaman Umuroglu, Davide Conficconi, Lahiru Rasnayake, Thomas B. Preusser, and Magnus Själander. 2019. Optimizing bit-serial matrix multiplication for reconfigurable computing. ACM Trans. Reconfig. Technol. Syst. 12, 3 (2019), 1–24.

[197]

Yaman Umuroglu, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, and Kees Vissers. 2017. FINN: A framework for fast, scalable binarized neural network inference. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA’17). ACM, 65–74.

[198]

Université Bretagne Sud. 2020. GAUT—High-Level Synthesis Tool from C to RTL. Retrieved from http://hls-labsticc.univ-ubs.fr/.

[199]

University of California, Riverside. 2010. ROCCC 2.0. Retrieved from http://roccc.cs.ucr.edu/.

[200]

University of Cambridge. 2016. Kiwi Scientific Acceleration using FPGA. Retrieved from https://www.cl.cam.ac.uk/djg11/kiwi/.

[201]

Stylianos I. Venieris, Alexandros Kouris, and Christos-Savvas Bouganis. 2018. Toolflows for mapping convolutional neural networks on FPGAs: A survey and future directions. ACM Comput. Surveys 51, 3 (2018), 1–39.

[202]

Jason Villarreal, Adrian Park, Walid Najjar, and Robert Halstead. 2010. Designing modular hardware accelerators in C with ROCCC 2.0. In Proceedings of the 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM’10). IEEE, 127–134.

[203]

Kizheppatt Vipin and Suhaib A. Fahmy. 2018. FPGA dynamic and partial reconfiguration: A survey of architectures, methods, and applications. ACM Comput. Surveys 51, 4 (2018), 1–39.

[204]

OpenCV: Open Source Computer Vision. 2000. Image Filtering. Retrieved from https://docs.opencv.org/3.4/d4/d86/group__imgproc__filter.html.

[205]

Han Wang, Robert Soulé, Huynh Tu Dang, Ki Suh Lee, Vishal Shrivastav, Nate Foster, and Hakim Weatherspoon. 2017. P4FPGA: A rapid prototyping framework for P4. In Proceedings of the Symposium on SDN Research. 122–135.

[206]

Jie Wang, Licheng Guo, and Jason Cong. 2021. AutoSA: A polyhedral compiler for high-performance systolic arrays on FPGA. In Proceedings of the ACM/SIGDA International Symposium on Field-programmable Gate Arrays.

[207]

Yutaka Watanabe, Jinpil Lee, Kentaro Sano, Taisuke Boku, and Mitsuhisa Sato. 2020. Design and preliminary evaluation of OpenACC compiler for FPGA with OpenCL and stream processing DSL. In Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops. 10–16.

[208]

Xilinx. 2021. Vitis HLS Front-end is Now Open Source. Retrieved from https://github.com/Xilinx/HLS.

[209]

Xilinx Inc. 2013. Vivado Design Suite. Retrieved from http://www.xilinx.com/products/design-tools/vivado.html.

[210]

Xilinx Inc. 2013. Vivado HLS. Retrieved from https://www.xilinx.com/products/design-tools/vivado/integration/esl-design.html.

[211]

Xilinx Inc. 2018. SDSoC. Retrieved from https://www.xilinx.com/products/design-tools/software-zone/sdsoc.html.

[212]

Xilinx Inc. 2019. SDAccel. Retrieved from https://www.xilinx.com/products/design-tools/software-zone/sdaccel.html.

[213]

Xilinx Inc. 2019. Vitis Unified Software Platform. Retrieved from https://www.xilinx.com/products/design-tools/vitis.html.

[214]

Xilinx Inc. 2020. Vitis HLS. Retrieved from https://www.xilinx.com/html_docs/xilinx2020_1/vitis_doc/introductionvitishls.html.

[215]

Xilinx Inc. 2021. Vitis Accelerated Libraries. Retrieved from https://github.com/Xilinx/Vitis_Libraries.

[216]

Xilinx Inc. 2021. Xilinx Runtime Library. Retrieved from https://github.com/Xilinx/XRT.

[217]

Jianxin Xiong, Jeremy Johnson, Robert Johnson, and David Padua. 2001. SPL: A language and compiler for DSP algorithms. ACM SIGPLAN Notices 36, 5 (2001), 298–308.

[218]

Alberto Zeni, Guido Walter Di Donato, Lorenzo Di Tucci, Marco Rabozzi, and Marco D. Santambrogio. 2021. The importance of being X-drop: High performance genome alignment on reconfigurable hardware. In Proceedings of the IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM’21). IEEE, 133–141.

[219]

Alberto Zeni, Kenneth O’Brien, Michaela Blott, and Marco D. Santambrogio. 2021. Optimized implementation of the HPCG benchmark on reconfigurable hardware. In Proceedings of the European Conference on Parallel Processing. Springer, 616–630.

[220]

Jerry Zhao, Ben Korpan, Abraham Gonzalez, and Krste Asanovic. 2020. SonicBOOM: The 3rd generation Berkeley out-of-order machine. In Proceedings of the Fourth Workshop on Computer Architecture Research with RISC-V, vol. 5.

[221]

Wei Zuo, Peng Li, Deming Chen, Louis-Noël Pouchet, Shunan Zhong, and Jason Cong. 2013. Improving polyhedral code generation for high-level synthesis. In Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ ISSS’13). IEEE, 1–10.

Published In

ACM Computing Surveys Volume 55, Issue 5

May 2023

810 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/3567470

Editor:
Albert Zomaya
University of Sydney, Australia

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 December 2022

Online AM: 27 April 2022

Accepted: 06 April 2022

Revised: 22 February 2022

Received: 31 July 2021

Published in CSUR Volume 55, Issue 5

Qualifiers

Survey
Refereed
