
VIBNN: Hardware Acceleration of Bayesian Neural Networks

Published: 19 March 2018

Abstract

Bayesian Neural Networks (BNNs) have been proposed to address the problem of model uncertainty in training and inference. By associating weights with conditional probability distributions, BNNs mitigate the overfitting commonly seen in conventional neural networks and allow training on small datasets through the variational inference process. The frequent use of Gaussian random variables in this process demands a properly optimized Gaussian Random Number Generator (GRNG), yet the high hardware cost of conventional GRNGs makes the hardware implementation of BNNs challenging. In this paper, we propose VIBNN, an FPGA-based hardware accelerator design for variational inference on BNNs. We explore the design space for the massive number of Gaussian-variable sampling tasks in BNNs. Specifically, we introduce two high-performance Gaussian (pseudo) random number generators: 1) the RAM-based Linear Feedback Gaussian Random Number Generator (RLF-GRNG), inspired by the properties of the binomial distribution and linear-feedback logic; and 2) the Bayesian Neural Network-oriented Wallace Gaussian Random Number Generator. To achieve high scalability and efficient memory access, we propose a deeply pipelined accelerator architecture with fast execution and good hardware utilization. Experimental results demonstrate that the proposed VIBNN implementations on an FPGA achieve a throughput of 321,543.4 Images/s and an energy efficiency of up to 52,694.8 Images/J while maintaining accuracy similar to their software counterpart.
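The RLF-GRNG and Wallace generator are hardware architectures detailed in the full paper; as a rough illustration of the two statistical ideas the abstract names, the Python sketch below (not the authors' design) shows (a) how summing pseudo-random bits from a linear feedback shift register yields a binomial count that approximates a Gaussian by the central limit theorem, and (b) a Wallace-style pass that produces new normal variates by applying an orthogonal transform to a pool of existing ones. The function names, the LFSR polynomial, the bit count n, and the pool size and shuffle policy are all illustrative assumptions, not the authors' parameters.

```python
import random

def lfsr16_bits(seed: int):
    """16-bit maximal-length Galois LFSR (polynomial x^16 + x^14 + x^13 + x^11 + 1,
    feedback mask 0xB400); yields one pseudo-random bit per step."""
    state = seed & 0xFFFF
    assert state != 0, "an all-zero LFSR state never changes"
    while True:
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= 0xB400
        yield lsb

def clt_gaussian(bits, n: int = 256) -> float:
    """Sum n pseudo-random bits: the sum is Binomial(n, 1/2), which by the
    central limit theorem approximates N(n/2, n/4); rescale to roughly N(0, 1).
    A real design would combine decorrelated bit sources; this is didactic."""
    s = sum(next(bits) for _ in range(n))
    return (s - n / 2) / (n / 4) ** 0.5

def wallace_refresh(pool: list) -> list:
    """One Wallace-style pass: mix groups of four pool values with an
    orthogonal (Hadamard / 2) matrix. If the pool holds i.i.d. N(0, 1)
    values, orthogonality preserves the distribution, so no exp/log/sqrt
    is needed. The shuffle stands in for a hardware address permutation."""
    random.shuffle(pool)
    out = []
    for i in range(0, len(pool) - 3, 4):
        a, b, c, d = pool[i:i + 4]
        out += [(a + b + c + d) / 2, (a + b - c - d) / 2,
                (a - b + c - d) / 2, (a - b - c + d) / 2]
    return out

# Usage: draw a few approximate Gaussians each way.
bits = lfsr16_bits(seed=0xACE1)
print([round(clt_gaussian(bits), 3) for _ in range(4)])

pool = [random.gauss(0.0, 1.0) for _ in range(64)]  # seed pool for Wallace
pool = wallace_refresh(pool)
print(pool[:4])
```

Both techniques avoid the transcendental functions (log, square root, sine/cosine) required by Box-Muller- or inversion-based generators, which is the property that makes them attractive for FPGA implementation.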


Published In

ACM SIGPLAN Notices, Volume 53, Issue 2 (ASPLOS '18)
February 2018, 809 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3296957

  • ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems
    March 2018, 827 pages
    ISBN: 9781450349116
    DOI: 10.1145/3173162
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 March 2018
Published in SIGPLAN Volume 53, Issue 2

Author Tags

  1. Bayesian neural network
  2. FPGA
  3. neural network

Qualifiers

  • Research-article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)364
  • Downloads (Last 6 weeks)55
Reflects downloads up to 02 Feb 2025

Cited By
  • (2024) Agreeing to Stop: Reliable Latency-Adaptive Decision Making via Ensembles of Spiking Neural Networks. Entropy 26(2), 126. DOI: 10.3390/e26020126. Online publication date: 31-Jan-2024.
  • (2024) An FPGA implementation of Bayesian inference with spiking neural networks. Frontiers in Neuroscience 17. DOI: 10.3389/fnins.2023.1291051. Online publication date: 5-Jan-2024.
  • (2024) NeuSpin: Design of a Reliable Edge Neuromorphic System Based on Spintronics for Green AI. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1-6. DOI: 10.23919/DATE58400.2024.10546675. Online publication date: 25-Mar-2024.
  • (2024) Hardware-Aware Neural Dropout Search for Reliable Uncertainty Prediction on FPGA. Proceedings of the 61st ACM/IEEE Design Automation Conference, 1-6. DOI: 10.1145/3649329.3656528. Online publication date: 23-Jun-2024.
  • (2024) An Energy-Efficient Bayesian Neural Network Implementation Using Stochastic Computing Method. IEEE Transactions on Neural Networks and Learning Systems 35(9), 12913-12923. DOI: 10.1109/TNNLS.2023.3265533. Online publication date: Sep-2024.
  • (2024) Spatial-SpinDrop: Spatial Dropout-Based Binary Bayesian Neural Network With Spintronics Implementation. IEEE Transactions on Nanotechnology 23, 636-643. DOI: 10.1109/TNANO.2024.3445455. Online publication date: 6-Sep-2024.
  • (2024) Accelerating MRI Uncertainty Estimation with Mask-Based Bayesian Neural Network. 2024 IEEE 35th International Conference on Application-specific Systems, Architectures and Processors (ASAP), 107-115. DOI: 10.1109/ASAP61560.2024.00030. Online publication date: 24-Jul-2024.
  • (2024) Highly parallel and ultra-low-power probabilistic reasoning with programmable gaussian-like memory transistors. Nature Communications 15(1). DOI: 10.1038/s41467-024-46681-2. Online publication date: 18-Mar-2024.
  • (2024) WALLAX. Neurocomputing 566:C. DOI: 10.1016/j.neucom.2023.126933. Online publication date: 21-Jan-2024.
  • (2024) Architectures for Machine Learning. Handbook of Computer Architecture, 321-379. DOI: 10.1007/978-981-97-9314-3_12. Online publication date: 21-Dec-2024.