A practical evaluation of the performance of the Impulse CoDeveloper HLS tool for implementing large-kernel 2-D filters

  • Special Issue Paper
Journal of Real-Time Image Processing

Abstract

Bidimensional convolution is a low-level processing algorithm which is of great interest in many areas, but its high computational cost limits the size of the kernels, especially in real-time embedded systems. This work describes the process of designing 2-D filters with large kernels (up to 50 × 50 coefficients) using the Impulse CoDeveloper™ high-level synthesis (HLS) tool. The purpose of this paper is twofold: first, to provide a practical guide for designers willing to make the most of an HLS tool like Impulse CoDeveloper, and second, to compare the results, in terms of area utilization, minimum clock period and power consumption, with implementations developed using lower-level design tools. The results show that RTL-based implementations can achieve higher throughputs (up to 44 % faster) than CoDeveloper-based ones. Nevertheless, CoDeveloper can also meet the high-performance requirements of the most demanding real-time applications, but with less effort and shorter design cycles.
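To make the computational cost mentioned in the abstract concrete, the following is a minimal software sketch of direct 2-D convolution in plain C. It is not Impulse C code and not the architecture evaluated in the paper; the function names `conv2d_valid` and `macs_per_pixel`, the row-major float layout, and the restriction to the "valid" output region (no border handling) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply-accumulate operations needed per output pixel
 * for a kh x kw kernel (hypothetical helper). */
size_t macs_per_pixel(size_t kh, size_t kw)
{
    return kh * kw;
}

/*
 * Direct 2-D convolution of a row-major image, "valid" region only:
 * out must hold (h - kh + 1) * (w - kw + 1) floats.
 * The kernel is indexed in flipped order, so this is convolution
 * rather than correlation.
 */
void conv2d_valid(const float *img, size_t h, size_t w,
                  const float *ker, size_t kh, size_t kw,
                  float *out)
{
    assert(kh > 0 && kw > 0 && kh <= h && kw <= w);

    size_t oh = h - kh + 1;   /* output height */
    size_t ow = w - kw + 1;   /* output width  */

    for (size_t y = 0; y < oh; ++y) {
        for (size_t x = 0; x < ow; ++x) {
            float acc = 0.0f;
            /* kh * kw multiply-accumulates for every output pixel */
            for (size_t i = 0; i < kh; ++i) {
                for (size_t j = 0; j < kw; ++j) {
                    acc += img[(y + i) * w + (x + j)]
                         * ker[(kh - 1 - i) * kw + (kw - 1 - j)];
                }
            }
            out[y * ow + x] = acc;
        }
    }
}
```

With the largest kernel size considered in the abstract, `macs_per_pixel(50, 50)` is 2500, so every output pixel costs 2500 multiply-accumulates when computed directly. That per-pixel cost is what makes large kernels difficult to sustain at real-time rates in a sequential implementation.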

Acknowledgments

This work has been supported by the Spanish Ministerio de Economía y Competitividad under the grant AYA2011-14245-E.

Author information

Correspondence to Carlos Colodro-Conde or F. Javier Toledo-Moreo.

About this article

Cite this article

Colodro-Conde, C., Toledo-Moreo, F.J., Toledo-Moreo, R. et al. A practical evaluation of the performance of the Impulse CoDeveloper HLS tool for implementing large-kernel 2-D filters. J Real-Time Image Proc 9, 263–279 (2014). https://doi.org/10.1007/s11554-013-0374-x

  • DOI: https://doi.org/10.1007/s11554-013-0374-x