DOI: 10.1145/2463209.2488797
research-article

A high-level synthesis flow for the implementation of iterative stencil loop algorithms on FPGA devices

Published: 29 May 2013

Abstract

The automatic generation of hardware implementations for a given algorithm is generally a difficult task, especially when data dependencies span multiple iterations, as in iterative stencil loops (ISLs). In this paper, we introduce an automatic design flow that extracts parallelism from an ISL algorithm and performs a design space exploration to identify its best FPGA hardware implementation in terms of both area and throughput. Experimental results show that the proposed methodology generates hardware designs whose performance is comparable to that of manually optimized solutions and orders of magnitude higher than that of implementations generated by commercial high-level synthesis tools.
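
For readers unfamiliar with the term, the following is a minimal sketch of what an iterative stencil loop looks like in software. It is not taken from the paper: the 4-point averaging stencil, the function names `stencil_step` and `iterative_stencil_loop`, and the choice to leave boundary cells unchanged are all assumptions made for illustration. The point it shows is the dependency pattern the abstract refers to: each cell at iteration t depends on its neighbours at iteration t-1, so the outer loop over iterations carries a dependency and cannot simply be unrolled into independent work.

```python
def stencil_step(grid):
    """Apply one sweep of a 4-point averaging stencil.

    Every interior cell of the result is computed only from the
    previous grid; boundary cells are copied unchanged (an assumption
    made for this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # new buffer, so reads never see new values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 4.0
    return out


def iterative_stencil_loop(grid, iterations):
    """Repeat the stencil sweep; iteration t consumes the output of t-1."""
    for _ in range(iterations):
        grid = stencil_step(grid)
    return grid


if __name__ == "__main__":
    # A 5x5 grid with a single non-zero cell in the centre.
    grid = [[0.0] * 5 for _ in range(5)]
    grid[2][2] = 4.0
    for row in iterative_stencil_loop(grid, 2):
        print(row)
```

Within one sweep every interior cell can be computed independently, which is the parallelism a hardware flow can exploit. The loop over iterations is the hard part, because after k iterations a cell depends on all cells within distance k of it in the original grid.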



      Published In

      DAC '13: Proceedings of the 50th Annual Design Automation Conference
      May 2013
      1285 pages
      ISBN:9781450320719
      DOI:10.1145/2463209

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. high level synthesis
      2. iterative stencil loops
      3. performance and area estimation
      4. symbolic execution

      Qualifiers

      • Research-article

      Conference

      DAC '13
      Acceptance Rates

      Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

      Cited By

      • (2024) Across Time and Space: Senju’s Approach for Scaling Iterative Stencil Loop Accelerators on Single and Multiple FPGAs. ACM Transactions on Reconfigurable Technology and Systems 17:2 (1-33). DOI: 10.1145/3634920. Online publication date: 30-Apr-2024
      • (2021) Enhancing the Scalability of Multi-FPGA Stencil Computations via Highly Optimized HDL Components. ACM Transactions on Reconfigurable Technology and Systems 14:3 (1-33). DOI: 10.1145/3461478. Online publication date: 12-Aug-2021
      • (2021) Hexagonal Tiling based Multiple FPGAs Stencil Computation Acceleration and Optimization Methodology. 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom) (697-705). DOI: 10.1109/ISPA-BDCloud-SocialCom-SustainCom52081.2021.00101. Online publication date: Sep-2021
      • (2021) SASIAF, A Scalable Accelerator for Seismic Imaging on Amazon AWS FPGAs. 2021 11th International Conference on Computer Engineering and Knowledge (ICCKE) (352-357). DOI: 10.1109/ICCKE54056.2021.9721464. Online publication date: 28-Oct-2021
      • (2020) Efficient Acceleration of Stencil Applications through In-Memory Computing. Micromachines 11:6 (622). DOI: 10.3390/mi11060622. Online publication date: 26-Jun-2020
      • (2020) High-Level Synthesis Design for Stencil Computations on FPGA with High Bandwidth Memory. Electronics 9:8 (1275). DOI: 10.3390/electronics9081275. Online publication date: 8-Aug-2020
      • (2019) DCMI. ACM Transactions on Architecture and Code Optimization 16:4 (1-24). DOI: 10.1145/3352813. Online publication date: 11-Oct-2019
      • (2019) Smart-Cache: Optimising Memory Accesses for Arbitrary Boundaries and Stencils on FPGAs. 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (87-90). DOI: 10.1109/IPDPSW.2019.00024. Online publication date: May-2019
      • (2019) Evaluation of Stencil Based Algorithm Parallelization over System-on-Chip FPGA Using a High Level Synthesis Tool. Applied Computer Sciences in Engineering (52-63). DOI: 10.1007/978-3-030-31019-6_5. Online publication date: 9-Oct-2019
      • (2018) An FPGA-Based Acceleration Methodology and Performance Model for Iterative Stencils. 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (115-122). DOI: 10.1109/IPDPSW.2018.00026. Online publication date: May-2018
