Core-level modeling and frequency prediction for DSP applications on FPGAs

Published: 01 January 2016

Abstract

Field-programmable gate arrays (FPGAs) are a promising technology for improving the performance of many high-performance computing and embedded applications. However, FPGA design tools remain immature relative to software design tools, which significantly limits productivity and consequently prevents widespread adoption of the technology. For example, the lengthy design-translate-execute (DTE) process must often be iterated to meet application requirements. Previous work has enabled model-based design-space exploration to reduce DTE iterations, but it is limited by a lack of accurate model-based prediction of key design parameters, the most important of which is clock frequency. In this paper, we present a core-level modeling and design (CMD) methodology that enables modeling of FPGA applications at an abstract level yet produces accurate predictions of parameters such as clock frequency, resource utilization (i.e., area), and latency. We evaluate CMD's prediction methods using several high-performance DSP applications on various families of FPGAs and show an average clock-frequency prediction error of 3.6% with a worst-case error of 20.4%, compared with a 13.9% average error and a 48.2% worst-case error for the best existing high-level prediction method. We also demonstrate how such prediction enables accurate design-space exploration without coding in a hardware-description language (HDL), significantly reducing total design time.
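The abstract reports prediction quality as an average and a worst-case percentage error. The abstract does not state the exact error definition, so the sketch below assumes the common absolute relative error, |predicted - measured| / measured, taken per design. The function names and the frequency values are hypothetical placeholders for illustration only; they are not results from the article.

```python
def percent_errors(predicted_mhz, measured_mhz):
    """Absolute relative error of each prediction, in percent (assumed definition)."""
    if len(predicted_mhz) != len(measured_mhz):
        raise ValueError("predicted and measured lists must have the same length")
    if not predicted_mhz:
        raise ValueError("at least one design is required")
    return [100.0 * abs(p - m) / m for p, m in zip(predicted_mhz, measured_mhz)]


def summarize_errors(predicted_mhz, measured_mhz):
    """Return (average error, worst-case error) in percent over all designs."""
    errors = percent_errors(predicted_mhz, measured_mhz)
    return sum(errors) / len(errors), max(errors)


# Hypothetical placeholder data: predicted vs. post-place-and-route clock
# frequencies (MHz) for three imaginary designs. Not taken from the article.
predicted = [200.0, 150.0, 90.0]
measured = [200.0, 160.0, 100.0]

average_error, worst_case_error = summarize_errors(predicted, measured)
print(f"average error: {average_error:.2f}%  worst case: {worst_case_error:.2f}%")
```

Reporting both figures matters because a low average can hide a single badly mispredicted design, which is why the abstract quotes the worst case alongside the mean.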

Cited By

  • (2019) Graph sparsification with parallelization to optimize the identification of causal genes and dysregulated pathways. Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, pp. 747-753. doi:10.1145/3297280.3297352. Online publication date: 8-Apr-2019

Published In

International Journal of Reconfigurable Computing, Volume 2015
January 2015
188 pages
ISSN: 1687-7195
EISSN: 1687-7209

Publisher

Hindawi Limited

London, United Kingdom

Publication History

Published: 01 January 2016
Accepted: 10 August 2015
Received: 03 March 2015
