DOI: 10.1145/3343031.3350534
short-paper

daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices

Published: 15 October 2019

Abstract

It is widely believed that Binary Neural Networks (BNNs) can drastically accelerate inference by replacing the arithmetic operations in float-valued Deep Neural Networks (DNNs) with bit-wise operations. Nevertheless, there has been no open-source implementation in support of this idea on low-end ARM devices (e.g., mobile phones and embedded devices). In this work, we propose daBNN, a super fast inference framework that implements BNNs on ARM devices. Several speed-up and memory refinement strategies for bit-packing, binarized convolution, and memory layout are specially devised to enhance inference efficiency. Compared to the recent open-source BNN inference framework BMXNet, daBNN is 7x to 23x faster on a single binary convolution and about 6x faster on Bi-Real Net 18 (a BNN variant of ResNet-18). daBNN is a BSD-licensed inference framework, and its source code, sample projects, and pre-trained models are available online: https://github.com/JDAI-CV/dabnn.
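
To make the idea of replacing float arithmetic with bit-wise operations concrete, the following minimal Python sketch packs +1/-1 vectors into integers and computes their dot product with one XOR and one population count. It is an illustration of the general technique only, not code from daBNN; the function names `pack_bits`, `binary_dot`, and `float_dot` are hypothetical and chosen for this example.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into one int (bit i set means +1)."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    # XOR marks the positions where the two signs differ.
    differing = bin(a_bits ^ b_bits).count("1")
    # Matching positions contribute +1 and differing positions -1,
    # so the sum is (n - differing) - differing.
    return n - 2 * differing


def float_dot(a, b):
    """Reference dot product using ordinary multiply-accumulate."""
    return sum(x * y for x, y in zip(a, b))


a = [1, -1, -1, 1, 1, -1, 1, 1]
b = [1, 1, -1, -1, 1, -1, -1, 1]
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
print(result, float_dot(a, b))
```

Both values printed are the same: the packed version replaces eight multiplications and additions with a single XOR and a bit count, which is the source of the speed-up that BNN inference frameworks aim to exploit.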

    Published In

    MM '19: Proceedings of the 27th ACM International Conference on Multimedia
    October 2019
    2794 pages
    ISBN:9781450368896
    DOI:10.1145/3343031
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. binary neural networks
    2. machine learning
    3. open source

    Qualifiers

    • Short-paper

    Conference

    MM '19

    Acceptance Rates

    MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 39
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 31 Jan 2025

    Cited By

    • (2024) Optimizing Data Flow in Binary Neural Networks. Sensors 24:15 (4780). DOI: 10.3390/s24154780. Online publication date: 23-Jul-2024
    • (2024) 4.6-Bit Quantization for Fast and Accurate Neural Network Inference on CPUs. Mathematics 12:5 (651). DOI: 10.3390/math12050651. Online publication date: 23-Feb-2024
    • (2024) BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance. IEEE Transactions on Neural Networks and Learning Systems 35:8 (10674-10686). DOI: 10.1109/TNNLS.2023.3243259. Online publication date: Aug-2024
    • (2024) Ulit-BiDet: An Ultralightweight Object Detector for SAR Images Based on Binary Neural Networks. IEEE Transactions on Geoscience and Remote Sensing 62 (1-21). DOI: 10.1109/TGRS.2024.3373488. Online publication date: 2024
    • (2024) ADC-Free ReRAM-Based In-Situ Accelerator for Energy-Efficient Binary Neural Networks. IEEE Transactions on Computers 73:2 (353-365). DOI: 10.1109/TC.2022.3224800. Online publication date: Feb-2024
    • (2024) Extending Neural Processing Unit and Compiler for Advanced Binarized Neural Networks. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC) (115-120). DOI: 10.1109/ASP-DAC58780.2024.10473822. Online publication date: 22-Jan-2024
    • (2024) BNN-SAM: Improving generalization of binary object detector by Seeking Flat Minima. Applied Intelligence 54:8 (6682-6700). DOI: 10.1007/s10489-024-05512-z. Online publication date: 22-May-2024
    • (2023) BiBench. Proceedings of the 40th International Conference on Machine Learning (28351-28388). DOI: 10.5555/3618408.3619585. Online publication date: 23-Jul-2023
    • (2023) Exploiting Kernel Compression on BNNs. 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE) (1-6). DOI: 10.23919/DATE56975.2023.10137052. Online publication date: Apr-2023
    • (2023) SmartLite: A DBMS-Based Serving System for DNN Inference in Resource-Constrained Environments. Proceedings of the VLDB Endowment 17:3 (278-291). DOI: 10.14778/3632093.3632095. Online publication date: 1-Nov-2023
