Time-Multiplexed FPGA Overlay Architectures: A Survey

Authors:

Douglas L. Maskell

ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 24, Issue 5

Article No.: 54, Pages 1 - 19

https://doi.org/10.1145/3339861

Published: 23 July 2019

Abstract

This article presents a comprehensive survey of time-multiplexed (TM) FPGA overlays from the research literature. These overlays are categorized by implementation into two groups: processor-based overlays, whose implementation follows that of conventional silicon-based microprocessors, and CGRA-like overlays, built from an array of either interconnected processor-based functional units or medium-grained arithmetic functional units. Time-multiplexing allows an overlay to change its behavior on a cycle-by-cycle basis as it executes the application kernel, which allows better sharing of limited FPGA hardware resources. However, most TM overlays suffer from large resource overheads, due either to the underlying processor-like architecture (for processor-based overlays) or to the routing array and instruction storage requirements (for CGRA-like overlays). Reducing the area overhead of CGRA-like overlays, specifically that of the routing network, and better utilizing the hard macros in the target FPGA are active areas of research.
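As a rough illustration of the time-multiplexing idea described in the abstract, the Python sketch below models a single shared functional unit that reads one instruction per cycle from a small instruction memory, so the same hardware performs a different operation in each cycle. It is not taken from the article: the names `Instr`, `OPS`, and `run_time_multiplexed`, the three-operation set, and the example kernel are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical operation set supported by one shared functional unit.
OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


@dataclass
class Instr:
    """One entry of the (assumed) per-unit instruction memory."""
    op: str      # operation the unit performs in this cycle
    dst: str     # destination register name
    src_a: str   # first source register name
    src_b: str   # second source register name


def run_time_multiplexed(program: List[Instr], regs: Dict[str, int]) -> Dict[str, int]:
    """Execute one instruction per cycle on a single shared functional unit."""
    regs = dict(regs)  # copy so the caller's register file is not modified
    for ins in program:
        # The same unit changes its behavior each cycle, selected by ins.op.
        regs[ins.dst] = OPS[ins.op](regs[ins.src_a], regs[ins.src_b])
    return regs


# Example kernel (an assumption): y = (a + b) * (a - b), three cycles on one unit.
program = [
    Instr("add", "t0", "a", "b"),
    Instr("sub", "t1", "a", "b"),
    Instr("mul", "y", "t0", "t1"),
]
result = run_time_multiplexed(program, {"a": 7, "b": 3})
print(result["y"])  # 40
```

A spatially configured design would instantiate one unit per operation (three here), whereas this sketch reuses one unit over three cycles at the cost of storing the instruction sequence, which mirrors the resource-sharing and instruction-storage trade-off the abstract mentions.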

Index Terms

Time-Multiplexed FPGA Overlay Architectures: A Survey
1. General and reference
  1. Document types
    1. Surveys and overviews
2. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs

Published In

ACM Transactions on Design Automation of Electronic Systems Volume 24, Issue 5

September 2019

282 pages

ISSN:1084-4309

EISSN:1557-7309

DOI:10.1145/3339837

Editor:
Naehyuck Chang
Korea Advanced Institute of Science and Technology, Korea

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 23 July 2019

Accepted: 01 June 2019

Revised: 01 April 2019

Received: 01 September 2018

Published in TODAES Volume 24, Issue 5

Qualifiers

Research-article
Research
Refereed

Funding Sources

Ministry of Education (MoE), Singapore
