DOI: 10.1145/3437984.3458833
research-article

Queen Jane Approximately: Enabling Efficient Neural Network Inference with Context-Adaptivity

Published: 26 April 2021 Publication History

Abstract

Recent advances in deep learning allow on-demand reduction of model complexity without the need for re-training, thus enabling a dynamic trade-off between inference accuracy and energy savings. Approximate mobile computing, on the other hand, adapts the level of computation approximation as the context of usage, and consequently the computation or result accuracy needs, vary. In this work, we propose a synergy between the two directions and develop a context-aware method for dynamically adjusting the width of an on-device neural network based on the input and context-dependent classification confidence. We implement our method on a human activity recognition neural network and, through measurements on a real-world embedded device, demonstrate that such a network would save up to 37.8% energy and incur only a 1% loss of accuracy if used for continuous activity monitoring in the field of elderly care.
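
The idea of adjusting network width from classification confidence can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: the width multipliers, the two confidence thresholds, the step-up/step-down rule, and the `toy_model` stand-in for a multi-width network are all hypothetical choices made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical width multipliers of a multi-width network (assumption).
WIDTHS = [0.25, 0.5, 0.75, 1.0]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_width_index(idx: int, confidence: float,
                     low: float = 0.6, high: float = 0.9,
                     n_widths: int = len(WIDTHS)) -> int:
    """Pick the width for the NEXT inference (illustrative rule).

    Low confidence -> step up to a wider, costlier network.
    High confidence -> step down to a narrower, cheaper network.
    The thresholds `low` and `high` are assumed values.
    """
    if confidence < low:
        return min(idx + 1, n_widths - 1)
    if confidence > high:
        return max(idx - 1, 0)
    return idx


def run_stream(samples: Sequence[float],
               predict_at_width: Callable[[float, float], Sequence[float]],
               start_idx: int = len(WIDTHS) - 1) -> List[Tuple[int, float, float]]:
    """Classify a stream, adapting the width after every sample.

    Returns one (label, width_used, confidence) tuple per sample.
    """
    idx = start_idx
    results = []
    for x in samples:
        probs = softmax(predict_at_width(x, WIDTHS[idx]))
        conf = max(probs)
        results.append((probs.index(conf), WIDTHS[idx], conf))
        idx = next_width_index(idx, conf)
    return results


def toy_model(x: float, width: float) -> List[float]:
    """Hypothetical stand-in for a real network: two-class logits whose
    margin grows with the input magnitude and the width."""
    return [x * width, 0.0]


if __name__ == "__main__":
    for label, width, conf in run_stream([10.0, 10.0, 0.0, 0.0], toy_model):
        print(f"label={label} width={width} confidence={conf:.3f}")
```

In this sketch, easy (high-confidence) inputs let the controller drift toward narrower widths, which is where energy would be saved, while ambiguous inputs push it back toward the full network. A real deployment would need calibrated confidence scores and thresholds tuned on validation data.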

Cited By

  • (2024) Mobiprox: Supporting Dynamic Approximate Computing on Mobiles. IEEE Internet of Things Journal 11:9 (16873-16886). DOI: 10.1109/JIOT.2024.3365957. Online publication date: 1-May-2024
  • (2024) Context-Based Adaptation of Neural Network Compression for Unmanned Aerial Vehicle (UAV) Weed Detection. 2024 IEEE 20th International Conference on Intelligent Computer Communication and Processing (ICCP) (1-7). DOI: 10.1109/ICCP63557.2024.10792992. Online publication date: 17-Oct-2024
  • (2024) Energy-aware human activity recognition for wearable devices: A comprehensive review. Pervasive and Mobile Computing 104 (101976). DOI: 10.1016/j.pmcj.2024.101976. Online publication date: Nov-2024

Published In

EuroMLSys '21: Proceedings of the 1st Workshop on Machine Learning and Systems
April 2021
130 pages
ISBN:9781450382984
DOI:10.1145/3437984
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EuroSys '21
Acceptance Rates

EuroMLSys '21 Paper Acceptance Rate 18 of 26 submissions, 69%;
Overall Acceptance Rate 18 of 26 submissions, 69%

Article Metrics

  • Downloads (last 12 months): 17
  • Downloads (last 6 weeks): 0
Reflects downloads up to 15 Jan 2025
