DOI: 10.1145/3489517.3530496
Research article · Open access

Memory-efficient training of binarized neural networks on the edge

Published: 23 August 2022

    Abstract

    A visionary computing paradigm is to train resource-efficient neural networks on the edge using dedicated low-power accelerators instead of cloud infrastructure, eliminating communication overheads and privacy concerns. One promising resource-efficient approach for inference is binarized neural networks (BNNs), which binarize parameters and activations. However, training BNNs remains resource-demanding. State-of-the-art BNN training methods, such as the binary optimizer (Bop), require storing and updating a large number of momentum values in floating-point (FP) format.
    In this work, we focus on memory-efficient FP encodings for the momentum values in Bop. To this end, we first investigate the impact of arbitrary FP encodings. We prove that when the FP format is not chosen properly, updates to the momentum values can be lost, and training quality therefore degrades. With these insights, we formulate a metric that counts the number of momentum values left unchanged in a training iteration due to the FP encoding. Based on this metric, we develop an algorithm to find FP encodings that are more memory-efficient than the standard FP encodings. In our experiments, memory usage during BNN training is reduced by factors of 2.47x, 2.43x, and 2.04x, depending on the BNN model, at a minimal accuracy cost (less than 1%) compared to the 32-bit FP encoding.
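    For intuition, here is a minimal NumPy sketch of the failure mode described above: a Bop-style momentum update in which the momentum values are stored in a reduced-precision FP format. This is not the paper's implementation; the quantizer, the FP format widths, the hyper-parameters, and the helper names are assumptions chosen for illustration, and the flip rule follows the published description of Bop by Helwegen et al. (threshold and sign conventions may differ in detail). When the quantization step at a momentum value's magnitude exceeds the size of one update, rounding returns the old value and the update is lost; the counter at the end corresponds to the kind of "unchanged momentum values" metric sketched in the abstract.

    # Illustrative sketch only (assumed helper names and FP formats, not the
    # paper's implementation): Bop-style momentum stored in a reduced-precision
    # floating-point format, plus a counter of momentum entries whose update
    # is lost to rounding.
    import numpy as np

    def quantize_fp(x, exp_bits=5, man_bits=2):
        """Round x to the nearest value representable with a sign bit,
        exp_bits exponent bits and man_bits mantissa bits (simplified:
        no subnormals, exponent merely clipped to its representable range)."""
        x = np.asarray(x, dtype=np.float64)
        out = np.zeros_like(x)
        nz = x != 0
        e = np.floor(np.log2(np.abs(x[nz])))        # unbiased exponent per element
        bias = 2 ** (exp_bits - 1) - 1
        e = np.clip(e, -bias + 1, bias)             # stay inside the exponent range
        step = 2.0 ** (e - man_bits)                # quantization step at that magnitude
        out[nz] = np.round(x[nz] / step) * step
        return out

    def bop_step(w, m, grad, gamma=1e-3, tau=1e-6, exp_bits=5, man_bits=2):
        """One Bop-style step (Helwegen et al., 2019): exponential moving
        average of the gradient, weight flipped when the momentum exceeds tau
        and agrees in sign with the weight. The momentum is re-quantized to
        the low-precision FP format after every update."""
        m_new = quantize_fp((1.0 - gamma) * m + gamma * grad, exp_bits, man_bits)
        flip = (np.abs(m_new) > tau) & (np.sign(m_new) == np.sign(w))
        w_new = np.where(flip, -w, w)
        unchanged = int(np.sum(m_new == m))         # updates swallowed by rounding
        return w_new, m_new, unchanged

    # Toy run: momenta around 1e-2, gradients around 1e-3, coarse 8-bit-style format.
    rng = np.random.default_rng(0)
    w = np.sign(rng.standard_normal(10_000))
    m = quantize_fp(rng.standard_normal(10_000) * 1e-2)
    g = rng.standard_normal(10_000) * 1e-3
    _, _, lost = bop_step(w, m, g)
    print(f"momentum entries left unchanged this iteration: {lost} / {m.size}")

    With this coarse format, the per-step change (on the order of gamma times the momentum and gradient magnitudes) is far smaller than the quantization step near 1e-2, so most momentum entries come back unchanged; widening the mantissa or exponent range lowers the counter, which is the trade-off such a metric is meant to expose.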




    Published In

    DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference
    July 2022
    1462 pages
    ISBN: 9781450391429
    DOI: 10.1145/3489517
    This work is licensed under a Creative Commons Attribution 4.0 International License.


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 August 2022


    Qualifiers

    • Research-article

    Funding Sources

    • DFG OneMemory
    • SFB 876, project A1

    Conference

    DAC '22: 59th ACM/IEEE Design Automation Conference
    July 10-14, 2022
    San Francisco, California, USA

    Acceptance Rates

    Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


    Cited By

    • (2023) Enabling Binary Neural Network Training on the Edge. ACM Transactions on Embedded Computing Systems 22(6), 1-19. DOI: 10.1145/3626100. Online publication date: 9-Nov-2023.
    • (2023) FlexBNN: Fast Private Binary Neural Network Inference With Flexible Bit-Width. IEEE Transactions on Information Forensics and Security 18, 2382-2397. DOI: 10.1109/TIFS.2023.3265342. Online publication date: 1-Jan-2023.
    • (2023) FASS-pruner: customizing a fine-grained CNN accelerator-aware pruning framework via intra-filter splitting and inter-filter shuffling. CCF Transactions on High Performance Computing 5(3), 292-303. DOI: 10.1007/s42514-023-00156-w. Online publication date: 26-May-2023.
    • (2023) DAEBI: A Tool for Data Flow and Architecture Explorations of Binary Neural Network Accelerators. Embedded Computer Systems: Architectures, Modeling, and Simulation, 107-122. DOI: 10.1007/978-3-031-46077-7_8. Online publication date: 2-Jul-2023.
