Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

Context-aware Adaptive Surgery: A Fast and Effective Framework for Adaptative Model Partition

Published: 14 September 2021 Publication History

Abstract

Deep Neural Networks (DNNs) have made massive progress in many fields and deploying DNNs on end devices has become an emerging trend to make intelligence closer to users. However, it is challenging to deploy large-scale and computation-intensive DNNs on resource-constrained end devices due to their small size and lightweight. To this end, model partition, which aims to partition DNNs into multiple parts to realize the collaborative computing of multiple devices, has received extensive research attention. To find the optimal partition, most existing approaches need to run from scratch under given resource constraints. However, they ignore that resources of devices (e.g., storage, battery power), and performance requirements (e.g., inference latency), are often continuously changing, making the optimal partition solution change constantly during processing. Therefore, it is very important to reduce the tuning latency of model partition to realize the real-time adaption under the changing processing context. To address these problems, we propose the Context-aware Adaptive Surgery (CAS) framework to actively perceive the changing processing context, and adaptively find the appropriate partition solution in real-time. Specifically, we construct the partition state graph to comprehensively model different partition solutions of DNNs by import context resources. Then "the neighbor effect" is proposed, which provides the heuristic rule for the search process. When the processing context changes, CAS adopts the runtime search algorithm, Graph-based Adaptive DNN Surgery (GADS), to quickly find the appropriate partition that satisfies resource constraints under the guidance of the neighbor effect. The experimental results show that CAS realizes adaptively rapid tuning of the model partition solutions in 10ms scale even for large DNNs (2.25x to 221.7x search time improvement than the state-of-the-art researches), and the total inference latency still keeps the same level with baselines.

References

[1]
Bandar Almaslukh, Jalal AlMuhtadi, and Abdelmonim Artoli. 2017. An effective deep autoencoder approach for online smartphone-based human activity recognition. Int. J. Comput. Sci. Netw. Secur 17, 4 (2017), 160--165.
[2]
Vincent Becker, Linus Fessler, and Gábor Sörös. 2019. GestEar: combining audio and motion sensing for gesture recognition on smartwatches. In Proceedings of the 23rd International Symposium on Wearable Computers. 10--19.
[3]
Alfredo Canziani, Adam Paszke, and Eugenio Culurciello. 2016. An analysis of deep neural network models for practical applications. arXiv preprint arXiv:1605.07678 (2016).
[4]
Yu-Hsin Chen, Joel Emer, and Vivienne Sze. 2016. Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks. ACM SIGARCH Computer Architecture News 44, 3 (2016), 367--379.
[5]
Ryan Rey M Daga and Proceso L Fernandez. 2016. kd Tree-Segmented Block Truncation Coding for Image Compression. In MATEC Web of Conferences, Vol. 56. EDP Sciences, 02007.
[6]
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition. Ieee, 248--255.
[7]
Amir Erfan Eshratifar, Mohammad Saeed Abrishami, and Massoud Pedram. 2019. JointDNN: an efficient training and inference engine for intelligent mobile cloud computing services. IEEE Transactions on Mobile Computing (2019).
[8]
Tyler Giallanza, Travis Siems, Elena Smith, Erik Gabrielsen, Ian Johnson, Mitchell A Thornton, and Eric C Larson. 2019. Keyboard Snooping from Mobile Phone Arrays with Mixed Convolutional and Recurrent Neural Networks. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 3, 2 (2019), 1--22.
[9]
Gregory Griffin, Alex Holub, and Pietro Perona. 2007. Caltech-256 object category dataset. (2007).
[10]
Gongde Guo, Hui Wang, David Bell, Yaxin Bi, and Kieran Greer. 2003. KNN model-based approach in classification. In OTM Confederated International Conferences" On the Move to Meaningful Internet Systems". Springer, 986--996.
[11]
Kiana Hajebi, Yasin Abbasi-Yadkori, Hossein Shahbazi, and Hong Zhang. 2011. Fast approximate nearest-neighbor search with k-nearest neighbor graph. In Twenty-Second International Joint Conference on Artificial Intelligence.
[12]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.
[13]
Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, et al. 2019. Searching for mobilenetv3. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 1314--1324.
[14]
Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017).
[15]
Chuang Hu, Wei Bao, Dan Wang, and Fengming Liu. 2019. Dynamic adaptive DNN surgery for inference acceleration on the edge. In IEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, 1423--1431.
[16]
Junxian Huang, Feng Qian, Alexandre Gerber, Z Morley Mao, Subhabrata Sen, and Oliver Spatscheck. 2012. A close examination of performance and power characteristics of 4G LTE networks. In Proceedings of the 10th international conference on Mobile systems, applications, and services. 225--238.
[17]
Yiping Kang, Johann Hauswald, Cao Gao, Austin Rovinski, Trevor Mudge, Jason Mars, and Lingjia Tang. 2017. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge. ACM SIGARCH Computer Architecture News 45, 1 (2017), 615--629.
[18]
Shailandra Kaushik. 2012. An overview of technical aspect for WiFi networks technology. International Journal of Electronics and Computer Science Engineering (IJECSE, ISSN: 2277-1956) 1, 01 (2012), 28--34.
[19]
Jong Hwan Ko, Taesik Na, Mohammad Faisal Amir, and Saibal Mukhopadhyay. 2018. Edge-host partitioning of deep neural networks with feature space encoding for resource-constrained internet-of-things platforms. In 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). IEEE, 1--6.
[20]
Alex Krizhevsky, Geoffrey Hinton, et al. 2009. Learning multiple layers of features from tiny images. (2009).
[21]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25 (2012), 1097--1105.
[22]
Boning Li and Akane Sano. 2020. Extraction and interpretation of deep autoencoder-based temporal features from wearables for forecasting personalized mood, health, and stress. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 4, 2 (2020), 1--26.
[23]
Bailin Li, Bowen Wu, Jiang Su, and Guangrun Wang. 2020. Eagleeye: Fast sub-net evaluation for efficient neural network pruning. In European Conference on Computer Vision. Springer, 639--654.
[24]
En Li, Liekang Zeng, Zhi Zhou, and Xu Chen. 2019. Edge AI: On-demand accelerating deep neural network inference via edge computing. IEEE Transactions on Wireless Communications 19, 1 (2019), 447--457.
[25]
Hongshan Li, Chenghao Hu, Jingyan Jiang, Zhi Wang, Yonggang Wen, and Wenwu Zhu. 2018. Jalad: Joint accuracy-and latency-aware deep structure decoupling for edge-cloud execution. In 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS). IEEE, 671--678.
[26]
Xinyu Li, Yanyi Zhang, Ivan Marsic, Aleksandra Sarcevic, and Randall S Burd. 2016. Deep learning for rfid-based activity recognition. In Proceedings of the 14th ACM Conference on Embedded Network Sensor Systems CD-ROM. 164--175.
[27]
Sicong Liu, Yingyan Lin, Zimu Zhou, Kaiming Nan, Hui Liu, and Junzhao Du. 2018. On-demand deep model compression for mobile devices: A usage-driven model selection framework. In Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services. 389--400.
[28]
Sicong Liu, Zimu Zhou, Junzhao Du, Longfei Shangguan, Jun Han, and Xin Wang. 2017. UbiEar: Bringing location-independent sound awareness to the hard-of-hearing people with smartphones. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1, 2 (2017), 17--1.
[29]
Jiachen Mao, Xiang Chen, Kent W Nixon, Christopher Krieger, and Yiran Chen. 2017. Modnn: Local distributed mobile computing system for deep neural network. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017. IEEE, 1396--1401.
[30]
Thaha Mohammed, Carlee Joe-Wong, Rohit Babbar, and Mario Di Francesco. 2020. Distributed Inference Acceleration with Adaptive DNN Partitioning and Offloading. In IEEE INFOCOM 2020-IEEE Conference on Computer Communications. IEEE, 854--863.
[31]
Dhaval Parmar and Timothy Bickmore. 2020. Making It Personal: Addressing Individual Audience Members in Oral Presentations Using Augmented Reality. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 4, 2 (2020), 1--22.
[32]
Zhe Peng, Shang Gao, Zecheng Li, Bin Xiao, and Yi Qian. 2018. Vehicle safety improvement through deep learning and mobile sensing. IEEE network 32, 4 (2018), 28--33.
[33]
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition. 779--788.
[34]
Maxim Shevtsov, Alexei Soupikov, and Alexander Kapustin. 2007. Highly parallel fast KD-tree construction for interactive ray tracing of dynamic scenes. In Computer Graphics Forum, Vol. 26. Wiley Online Library, 395--404.
[35]
Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
[36]
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1--9.
[37]
Surat Teerapittayanon, Bradley McDanel, and Hsiang-Tsung Kung. 2017. Distributed deep neural networks over the cloud, the edge and end devices. In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). IEEE, 328--339.
[38]
Stylianos I Venieris and Christos-Savvas Bouganis. 2017. Latency-driven design for FPGA-based convolutional neural networks. In 2017 27th International Conference on Field Programmable Logic and Applications (FPL). IEEE, 1--8.
[39]
Peisong Wang, Xiangyu He, Gang Li, Tianli Zhao, and Jian Cheng. 2020. Sparsity-inducing binarized neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 12192--12199.
[40]
Xiaofei Wang, Yiwen Han, Victor CM Leung, Dusit Niyato, Xueqiang Yan, and Xu Chen. 2020. Convergence of edge computing and deep learning: A comprehensive survey. IEEE Communications Surveys & Tutorials 22, 2 (2020), 869--904.
[41]
Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, and Jimmy Lin. 2020. DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2246--2251.
[42]
Yukang Yan, Chun Yu, Xin Yi, and Yuanchun Shi. 2018. Headgesture: hands-free input approach leveraging head movements for hmd devices. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2, 4 (2018), 1--23.
[43]
Fisher Yu, Wenqi Xian, Yingying Chen, Fangchen Liu, Mike Liao, Vashisht Madhavan, and Trevor Darrell. 2018. Bdd100k: A diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687 2, 5 (2018), 6.
[44]
Wei Zeng and Theo Gevers. 2018. 3DContextNet: Kd Tree Guided Hierarchical Learning of Point Clouds Using Local and Global Contextual Cues. In European Conference on Computer Vision. Springer, 314--330.
[45]
Chaoyun Zhang, Paul Patras, and Hamed Haddadi. 2019. Deep learning in mobile and wireless networking: A survey. IEEE Communications Surveys & Tutorials 21, 3 (2019), 2224--2287.
[46]
Zhuoran Zhao, Kamyar Mirzazad Barijough, and Andreas Gerstlauer. 2018. DeepThings: Distributed adaptive deep learning inference on resource-constrained IoT edge clusters. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 37, 11 (2018), 2348--2359.
[47]
Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang. 2019. Edge intelligence: Paving the last mile of artificial intelligence with edge computing. Proc. IEEE 107, 8 (2019), 1738--1762.

Cited By

View all
  • (2024)RFSpy: Eavesdropping on Online Conversations with Out-of-Vocabulary Words by Sensing Metal Coil Vibration of Headsets Leveraging RFIDProceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services10.1145/3643832.3661887(169-182)Online publication date: 3-Jun-2024
  • (2024)F2Key: Dynamically Converting Your Face into a Private Key Based on COTS Headphones for Reliable Voice InteractionProceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services10.1145/3643832.3661860(127-140)Online publication date: 3-Jun-2024
  • (2024)UbiPhysioProceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies10.1145/36435528:1(1-27)Online publication date: 6-Mar-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies  Volume 5, Issue 3
Sept 2021
1443 pages
EISSN:2474-9567
DOI:10.1145/3486621
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 September 2021
Published in IMWUT Volume 5, Issue 3

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Adaptive model partition
  2. Collaborative model computing
  3. Context perception
  4. Edge intelligence

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)132
  • Downloads (Last 6 weeks)16
Reflects downloads up to 09 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)RFSpy: Eavesdropping on Online Conversations with Out-of-Vocabulary Words by Sensing Metal Coil Vibration of Headsets Leveraging RFIDProceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services10.1145/3643832.3661887(169-182)Online publication date: 3-Jun-2024
  • (2024)F2Key: Dynamically Converting Your Face into a Private Key Based on COTS Headphones for Reliable Voice InteractionProceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services10.1145/3643832.3661860(127-140)Online publication date: 3-Jun-2024
  • (2024)UbiPhysioProceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies10.1145/36435528:1(1-27)Online publication date: 6-Mar-2024
  • (2024)EarSlideProceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies10.1145/36435158:1(1-29)Online publication date: 6-Mar-2024
  • (2024)HyperHARProceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies10.1145/36435118:1(1-29)Online publication date: 6-Mar-2024
  • (2024)AFaceProceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies10.1145/36435108:1(1-33)Online publication date: 6-Mar-2024
  • (2024)VesNet: A Vessel Network for Jointly Learning Route Pattern and Future TrajectoryACM Transactions on Intelligent Systems and Technology10.1145/363937015:2(1-25)Online publication date: 28-Mar-2024
  • (2024)EarSEProceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies10.1145/36314477:4(1-33)Online publication date: 12-Jan-2024
  • (2024)TS2ACTProceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies10.1145/36314457:4(1-22)Online publication date: 12-Jan-2024
  • (2024)CAvatarProceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies10.1145/36314247:4(1-24)Online publication date: 12-Jan-2024
  • Show More Cited By

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media