
VIBNN: Hardware Acceleration of Bayesian Neural Networks

Published: 19 March 2018

Abstract

Bayesian Neural Networks (BNNs) have been proposed to address the problem of model uncertainty in training and inference. By associating weights with conditional probability distributions, BNNs mitigate the overfitting commonly seen in conventional neural networks and allow training on small datasets through the variational inference process. The frequent use of Gaussian random variables in this process demands a properly optimized Gaussian Random Number Generator (GRNG), yet the high hardware cost of conventional GRNGs makes the hardware implementation of BNNs challenging. In this paper, we propose VIBNN, an FPGA-based hardware accelerator design for variational inference on BNNs. We explore the design space for the massive number of Gaussian-variable sampling tasks in BNNs. Specifically, we introduce two high-performance Gaussian (pseudo) random number generators: 1) the RAM-based Linear Feedback Gaussian Random Number Generator (RLF-GRNG), inspired by the properties of the binomial distribution and linear-feedback logic; and 2) the Bayesian Neural Network-oriented Wallace Gaussian Random Number Generator. To achieve high scalability and efficient memory access, we propose a deeply pipelined accelerator architecture with fast execution and good hardware utilization. Experimental results demonstrate that the proposed VIBNN implementations on an FPGA achieve a throughput of 321,543.4 Images/s and an energy efficiency of up to 52,694.8 Images/J while maintaining accuracy similar to their software counterpart.
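The RLF-GRNG and Wallace generator are hardware architectures detailed in the full paper; as a rough illustration of the two statistical ideas the abstract names, the Python sketch below (not the authors' design) shows (a) how summing pseudo-random bits from a linear feedback shift register yields a binomial count that approximates a Gaussian by the central limit theorem, and (b) a Wallace-style pass that produces new normal variates by applying an orthogonal transform to a pool of existing ones. The function names, the LFSR polynomial, the bit count n, and the pool size and shuffle policy are all illustrative assumptions, not the authors' parameters.

```python
import random

def lfsr16_bits(seed: int):
    """16-bit maximal-length Galois LFSR (polynomial x^16 + x^14 + x^13 + x^11 + 1,
    feedback mask 0xB400); yields one pseudo-random bit per step."""
    state = seed & 0xFFFF
    assert state != 0, "an all-zero LFSR state never changes"
    while True:
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= 0xB400
        yield lsb

def clt_gaussian(bits, n: int = 256) -> float:
    """Sum n pseudo-random bits: the sum is Binomial(n, 1/2), which by the
    central limit theorem approximates N(n/2, n/4); rescale to roughly N(0, 1).
    A real design would combine decorrelated bit sources; this is didactic."""
    s = sum(next(bits) for _ in range(n))
    return (s - n / 2) / (n / 4) ** 0.5

def wallace_refresh(pool: list) -> list:
    """One Wallace-style pass: mix groups of four pool values with an
    orthogonal (Hadamard / 2) matrix. If the pool holds i.i.d. N(0, 1)
    values, orthogonality preserves the distribution, so no exp/log/sqrt
    is needed. The shuffle stands in for a hardware address permutation."""
    random.shuffle(pool)
    out = []
    for i in range(0, len(pool) - 3, 4):
        a, b, c, d = pool[i:i + 4]
        out += [(a + b + c + d) / 2, (a + b - c - d) / 2,
                (a - b + c - d) / 2, (a - b - c + d) / 2]
    return out

# Usage: draw a few approximate Gaussians each way.
bits = lfsr16_bits(seed=0xACE1)
print([round(clt_gaussian(bits), 3) for _ in range(4)])

pool = [random.gauss(0.0, 1.0) for _ in range(64)]  # seed pool for Wallace
pool = wallace_refresh(pool)
print(pool[:4])
```

Both techniques avoid the transcendental functions (log, square root, sine/cosine) required by Box-Muller- or inversion-based generators, which is the property that makes them attractive for FPGA implementation.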


Published In

ACM SIGPLAN Notices, Volume 53, Issue 2 (ASPLOS '18)
February 2018, 809 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3296957

  • ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems
    March 2018, 827 pages
    ISBN: 9781450349116
    DOI: 10.1145/3173162
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 March 2018
Published in SIGPLAN Volume 53, Issue 2

Author Tags

  1. Bayesian neural network
  2. FPGA
  3. neural network

Qualifiers

  • Research-article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)364
  • Downloads (Last 6 weeks)55
Reflects downloads up to 02 Feb 2025

Cited By
  • (2024) Agreeing to Stop: Reliable Latency-Adaptive Decision Making via Ensembles of Spiking Neural Networks. Entropy 26(2), 126. DOI: 10.3390/e26020126. Online publication date: 31-Jan-2024.
  • (2024) An FPGA implementation of Bayesian inference with spiking neural networks. Frontiers in Neuroscience 17. DOI: 10.3389/fnins.2023.1291051. Online publication date: 5-Jan-2024.
  • (2024) NeuSpin: Design of a Reliable Edge Neuromorphic System Based on Spintronics for Green AI. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1-6. DOI: 10.23919/DATE58400.2024.10546675. Online publication date: 25-Mar-2024.
  • (2024) Hardware-Aware Neural Dropout Search for Reliable Uncertainty Prediction on FPGA. Proceedings of the 61st ACM/IEEE Design Automation Conference, 1-6. DOI: 10.1145/3649329.3656528. Online publication date: 23-Jun-2024.
  • (2024) An Energy-Efficient Bayesian Neural Network Implementation Using Stochastic Computing Method. IEEE Transactions on Neural Networks and Learning Systems 35(9), 12913-12923. DOI: 10.1109/TNNLS.2023.3265533. Online publication date: Sep-2024.
  • (2024) Spatial-SpinDrop: Spatial Dropout-Based Binary Bayesian Neural Network With Spintronics Implementation. IEEE Transactions on Nanotechnology 23, 636-643. DOI: 10.1109/TNANO.2024.3445455. Online publication date: 6-Sep-2024.
  • (2024) Accelerating MRI Uncertainty Estimation with Mask-Based Bayesian Neural Network. 2024 IEEE 35th International Conference on Application-specific Systems, Architectures and Processors (ASAP), 107-115. DOI: 10.1109/ASAP61560.2024.00030. Online publication date: 24-Jul-2024.
  • (2024) Highly parallel and ultra-low-power probabilistic reasoning with programmable gaussian-like memory transistors. Nature Communications 15(1). DOI: 10.1038/s41467-024-46681-2. Online publication date: 18-Mar-2024.
  • (2024) WALLAX. Neurocomputing 566:C. DOI: 10.1016/j.neucom.2023.126933. Online publication date: 21-Jan-2024.
  • (2024) Architectures for Machine Learning. Handbook of Computer Architecture, 321-379. DOI: 10.1007/978-981-97-9314-3_12. Online publication date: 21-Dec-2024.