short-paper

daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices

Authors:

Tao Mei

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

Pages 2272 - 2275

https://doi.org/10.1145/3343031.3350534

Published: 15 October 2019

Abstract

It is widely believed that Binary Neural Networks (BNNs) can drastically improve inference efficiency by replacing the arithmetic operations in float-valued Deep Neural Networks (DNNs) with bit-wise operations. Nevertheless, there has been no open-source implementation supporting this idea on low-end ARM devices (e.g., mobile phones and embedded devices). In this work, we propose daBNN, a super fast inference framework that implements BNNs on ARM devices. Several speed-up and memory refinement strategies for bit-packing, binarized convolution, and memory layout are devised to enhance inference efficiency. Compared with BMXNet, a recent open-source BNN inference framework, daBNN is 7x to 23x faster on a single binary convolution and about 6x faster on Bi-Real Net 18 (a BNN variant of ResNet-18). daBNN is a BSD-licensed inference framework, and its source code, sample projects, and pre-trained models are available online: https://github.com/JDAI-CV/dabnn.
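The idea of replacing arithmetic with bit-wise operations can be illustrated with a small sketch. It is not taken from the daBNN source: the function names `pack_signs` and `binary_dot` are hypothetical, and the code shows only the generic XOR-plus-popcount identity for +1/-1 vectors, not the ARM-specific optimizations the abstract refers to. If each +1 is stored as bit 1 and each -1 as bit 0, the dot product of two length-n vectors equals n minus twice the number of differing bits.

```python
# Illustrative sketch only (hypothetical helper names, not the daBNN API).

def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int; bit i is 1 when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("binary values must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XOR and popcount."""
    # Each differing bit contributes -1 to the dot product, each matching bit +1.
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


def float_dot(a, b):
    """Reference dot product using ordinary multiply-accumulate."""
    return sum(x * y for x, y in zip(a, b))


a = [1, -1, -1, 1, 1, 1, -1, 1]
b = [1, 1, -1, -1, 1, -1, -1, 1]

# The bit-wise result matches the arithmetic one.
assert binary_dot(pack_signs(a), pack_signs(b), len(a)) == float_dot(a, b)
```

A binary convolution reduces to many such dot products, so one XOR and one popcount over a packed word stand in for a whole word's worth of multiply-accumulate steps.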

Index Terms

daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices
1. Software and its engineering
  1. Software notations and tools
    1. Software libraries and repositories

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

October 2019

2794 pages

ISBN:9781450368896

DOI:10.1145/3343031

General Chairs: Laurent Amsaleg (CNRS-IRISA, France), Benoit Huet (EURECOM, France), Martha Larson (Radboud University and TU Delft, Netherlands)

Program Chairs: Guillaume Gravier (CNRS-IRISA, France), Hayley Hung (Delft University of Technology, Netherlands), Chong-Wah Ngo (City University of Hong Kong, Hong Kong), Wei Tsang Ooi (National University of Singapore, Singapore)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '19

Sponsor:

SIGMM

MM '19: The 27th ACM International Conference on Multimedia

October 21 - 25, 2019

Nice, France

Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
