Convolutional neural network acceleration with hardware/software co-design

Published: 01 May 2018 Publication History
    Abstract

    Convolutional Neural Networks (CNNs) have a broad range of applications, such as image processing and natural language processing. Inspired by the mammalian visual cortex, CNNs have achieved impressive results on a number of computer vision challenges, but often at the cost of large amounts of processing power and without any timing constraints. This paper presents a design methodology for accelerating CNNs using hardware/software co-design techniques to balance performance and flexibility, particularly for resource-constrained systems. The methodology is applied to a gender recognition case study, using an ARM processor and FPGA fabric to create an embedded system that can process facial images in real time.

    Cited By

    • (2021) Block-sparse CNN: towards a fast and memory-efficient framework for convolutional neural networks. Applied Intelligence 51:1 (441-452). doi:10.1007/s10489-020-01815-z. Online publication date: 1-Jan-2021
    • (2019) Local curve pattern for content-based image retrieval. Pattern Analysis & Applications 22:3 (1233-1242). doi:10.1007/s10044-018-0724-1. Online publication date: 1-Aug-2019

      Published In

      Applied Intelligence  Volume 48, Issue 5
      May 2018
      323 pages

      Publisher

      Kluwer Academic Publishers

      United States

      Author Tags

      1. Co-design
      2. Computer vision
      3. Embedded system
      4. FPGA
      5. Gender recognition
      6. Hardware acceleration
      7. Neural network
      8. Real-time
