A coarse-grained reconfigurable architecture with compilation for high performance

Published: 01 January 2012

Abstract

We propose a fast data relay (FDR) mechanism to enhance existing coarse-grained reconfigurable architectures (CGRAs). FDR not only provides multicycle data transmission concurrently with computation but also converts resource-demanding global data accesses between processing elements into local data accesses, avoiding communication congestion. We also propose supporting compiler techniques that use the FDR feature efficiently to achieve higher performance across a variety of applications. We compare our FDR-based CGRA with two other works in this field, ADRES and RCP. Experimental results for various multimedia applications show that FDR combined with the new compiler delivers up to 29% and 21% higher performance than ADRES and RCP, respectively.
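
As a rough illustration of how a global access between distant processing elements (PEs) can be broken into a chain of local, neighbor-to-neighbor transfers spread over several cycles, consider the toy model below. It is only a sketch under stated assumptions and is not the FDR design itself: the 2D mesh layout, the row-then-column route, the one-hop-per-cycle timing, and the names `relay_path` and `relay_cycles` are all hypothetical and do not come from the article.

```python
from typing import List, Tuple

# (row, col) of a processing element in a hypothetical 2D mesh of PEs.
PE = Tuple[int, int]


def relay_path(src: PE, dst: PE) -> List[PE]:
    """Return the PEs visited when a value is relayed hop by hop from src to dst.

    Assumption (not from the article): every hop moves to a directly
    adjacent PE, first along the row and then along the column, so each
    individual transfer is a local access.
    """
    row, col = src
    path = [src]
    # Walk along the row until the destination column is reached.
    while col != dst[1]:
        col += 1 if dst[1] > col else -1
        path.append((row, col))
    # Then walk along the column until the destination row is reached.
    while row != dst[0]:
        row += 1 if dst[0] > row else -1
        path.append((row, col))
    return path


def relay_cycles(src: PE, dst: PE) -> int:
    """Cycles needed under the assumed one-hop-per-cycle timing."""
    return len(relay_path(src, dst)) - 1


if __name__ == "__main__":
    # A value produced at PE (0, 0) and consumed at PE (1, 2).
    print(relay_path((0, 0), (1, 2)))
    print(relay_cycles((0, 0), (1, 2)))
```

In this model the relay occupies only neighbor links, which is the property the abstract attributes to local accesses. Whether intermediate PEs can keep computing while a value passes through them depends on the actual hardware, which this sketch does not model.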

Published In

International Journal of Reconfigurable Computing  Volume 2012, Issue
Special issue on High-Performance Reconfigurable Computing
January 2012
116 pages
ISSN:1687-7195
EISSN:1687-7209

Publisher

Hindawi Limited

London, United Kingdom

Publication History

Accepted: 09 January 2012
Revised: 05 January 2012
Published: 01 January 2012
Received: 05 October 2011
