Input data reuse in compiling window operations onto reconfigurable hardware

Published: 11 June 2004

Abstract

Balancing computation with I/O is considered a critical factor in the overall performance of embedded systems in general and reconfigurable computing systems in particular. Data I/O often dominates the overall computation performance of window operations, which are frequently used in image processing, image compression, pattern recognition and digital signal processing. This problem is more acute in reconfigurable systems since the compiler must generate the data path and the sequence of operations. The challenge is to intelligently exploit data reuse on the reconfigurable fabric (FPGA) to minimize the required memory or I/O bandwidth while maximizing parallelism.

In this paper, we present a compile-time approach to reuse data in window-based codes. The compiler, called ROCCC, first analyzes and optimizes the window operation in C. It then computes the size of the hardware buffer and defines three sets of data values for each window: the window set, the managed set and the killed set. This compile-time analysis simplifies the HDL code generation and improves the resulting hardware performance. We also discuss in-place window operations.
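To make the idea of input data reuse concrete, the following is a minimal software sketch in C, not ROCCC output and not the paper's algorithm. It assumes a hypothetical one-dimensional window of width 3 that slides by one element and computes a running sum. The function names, the `WINDOW` constant and the read counters are illustrative assumptions. The comments map the buffer contents onto the abstract's window, managed and killed sets under one plausible reading of those terms; the abstract itself does not define them.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW 3 /* hypothetical window width, chosen for illustration */

/* Naive version: every output re-reads all WINDOW inputs from memory.
 * Writes n - WINDOW + 1 sums to out and returns the number of memory reads. */
size_t window_sum_naive(const int *in, size_t n, int *out)
{
    assert(n >= WINDOW);
    size_t reads = 0;
    for (size_t i = 0; i + WINDOW <= n; i++) {
        int acc = 0;
        for (size_t k = 0; k < WINDOW; k++) {
            acc += in[i + k];
            reads++;
        }
        out[i] = acc;
    }
    return reads;
}

/* Reuse version: a small buffer (a shift register in hardware terms)
 * holds the current window, so each input is read from memory once. */
size_t window_sum_reuse(const int *in, size_t n, int *out)
{
    assert(n >= WINDOW);
    int buf[WINDOW];
    size_t reads = 0;

    /* Prime the buffer with the first window. */
    for (size_t k = 0; k < WINDOW; k++) {
        buf[k] = in[k];
        reads++;
    }

    for (size_t i = 0;; i++) {
        /* Assumed reading: buf[0..WINDOW-1] is the window set. */
        int acc = 0;
        for (size_t k = 0; k < WINDOW; k++)
            acc += buf[k];
        out[i] = acc;

        if (i + WINDOW >= n)
            break; /* no further input to fetch */

        /* Assumed reading: buf[0] is the killed set (no later window
         * uses it); buf[1..] is kept for the next window (managed). */
        for (size_t k = 0; k + 1 < WINDOW; k++)
            buf[k] = buf[k + 1];
        buf[WINDOW - 1] = in[i + WINDOW]; /* the only new memory read */
        reads++;
    }
    return reads;
}
```

For an input of 6 elements, the naive loop performs 12 memory reads while the buffered loop performs 6, one per input element, and both produce the same 4 outputs. The buffer size here equals the window width because the window is one-dimensional; a two-dimensional window would need to buffer more data, which is the sizing question the abstract says the compiler resolves.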

Published In

ACM SIGPLAN Notices, Volume 39, Issue 7
LCTES '04
July 2004
265 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/998300
  • LCTES '04: Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
    June 2004
    276 pages
    ISBN: 1581138067
    DOI: 10.1145/997163
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. VHDL
  2. compilation
  3. high-level synthesis
  4. reconfigurable computing
  5. reuse analysis

Qualifiers

  • Article

Cited By

  • (2023) Field-programmable Gate Arrays. Design for Embedded Image Processing on FPGAs, pp. 19-44. DOI: 10.1002/9781119819820.ch2. Online publication date: 5-Sep-2023
  • (2022) An FPGA Overlay for CNN Inference with Fine-grained Flexible Parallelism. ACM Transactions on Architecture and Code Optimization 19:3, pp. 1-26. DOI: 10.1145/3519598. Online publication date: 4-May-2022
  • (2018) Reconfigurable Buffer Structures for Coarse-Grained Reconfigurable Arrays. System Level Design from HW/SW to Memory for Embedded Systems, pp. 218-229. DOI: 10.1007/978-3-319-90023-0_18. Online publication date: 17-Apr-2018
  • (2017) FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method. IEEE Transactions on NanoBioscience 16:5, pp. 314-325. DOI: 10.1109/TNB.2017.2705104. Online publication date: Jul-2017
  • (2016) ROCCC 2.0. FPGAs for Software Programmers, pp. 191-204. DOI: 10.1007/978-3-319-26408-0_11. Online publication date: 18-Jun-2016
  • (2015) The advantages and limitations of high level synthesis for FPGA based image processing. Proceedings of the 9th International Conference on Distributed Smart Cameras, pp. 134-139. DOI: 10.1145/2789116.2789145. Online publication date: 8-Sep-2015
  • (2012) Real-time computation of local neighborhood functions in application-specific instruction-set processors. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 20:11, pp. 2031-2043. DOI: 10.1109/TVLSI.2011.2170204. Online publication date: 1-Nov-2012
  • (2010) Compiling for reconfigurable computing. ACM Computing Surveys 42:4, pp. 1-65. DOI: 10.1145/1749603.1749604. Online publication date: 23-Jun-2010
  • (2009) What is hardware/software partitioning? ACM SIGDA Newsletter 39:6, pp. 1-1. DOI: 10.1145/1862900.1862901. Online publication date: 1-Jun-2009
  • (2009) A computing origami. Proceedings of the 46th Annual Design Automation Conference, pp. 282-287. DOI: 10.1145/1629911.1629987. Online publication date: 26-Jul-2009
