research-article

AdaMEC: Towards a Context-adaptive and Dynamically Combinable DNN Deployment Framework for Mobile Edge Computing

Authors:

Zhiwen Yu

ACM Transactions on Sensor Networks, Volume 20, Issue 1

Article No.: 21, Pages 1 - 28

https://doi.org/10.1145/3630098

Published: 07 December 2023

Abstract

With the rapid development of deep learning, research on intelligent and interactive mobile applications (e.g., health monitoring, speech recognition) has attracted extensive attention. These applications call for mobile edge computing, i.e., offloading part of the computation from mobile devices to edge devices to accelerate inference and reduce transmission load. Current practice relies on collaborative DNN partition and offloading to satisfy predefined latency requirements, an approach that struggles to adapt to a deployment context that changes at runtime. AdaMEC, a context-adaptive and dynamically combinable DNN deployment framework for mobile edge computing, is proposed to meet these requirements. It consists of three novel techniques. First, once-for-all DNN pre-partition divides a DNN at the primitive operator level and stores the partitioned modules as executable files, defined as pre-partitioned DNN atoms. Second, context-adaptive DNN atom combination and offloading introduces a graph-based decision algorithm that quickly searches for a suitable combination of atoms and adaptively produces an offloading plan under dynamic deployment contexts. Third, a runtime latency predictor provides timely latency feedback for DNN deployment, taking into account both DNN configurations and dynamic contexts. Extensive experiments demonstrate that AdaMEC outperforms state-of-the-art baselines, reducing latency by up to 62.14% and saving 55.21% of memory on average.
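To make the idea of combining atoms and choosing an offloading plan concrete, the following is a minimal sketch and not the algorithm from the article, whose details the abstract does not give. It assumes a purely linear chain of atoms, two execution sites ("device" and "edge"), and profiled per-atom latencies; the `Atom` class, the `plan_offloading` function, and all numbers are hypothetical. The plan is found as a shortest path through a small layered graph, computed here by dynamic programming.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    """Hypothetical pre-partitioned DNN atom with assumed, pre-profiled costs."""
    name: str
    device_ms: float  # latency when the atom runs on the mobile device
    edge_ms: float    # latency when the atom runs on the edge server
    out_kb: float     # size of the atom's output tensor in kilobytes


def plan_offloading(atoms, input_kb, bandwidth_kb_per_ms):
    """Return (total latency in ms, placement list) for a linear chain of atoms.

    Input data starts on the device and the final result must end up on the
    device. Moving data between sites costs size / bandwidth.
    """
    sites = ("device", "edge")
    # best[site] = (latency so far, placements) with the data currently at site
    best = {"device": (0.0, []), "edge": (float("inf"), [])}
    data_kb = input_kb
    for atom in atoms:
        nxt = {}
        for target in sites:
            compute = atom.device_ms if target == "device" else atom.edge_ms
            candidates = []
            for site in sites:
                cost, placement = best[site]
                transfer = 0.0 if site == target else data_kb / bandwidth_kb_per_ms
                candidates.append((cost + transfer + compute, placement + [target]))
            nxt[target] = min(candidates, key=lambda c: c[0])
        best = nxt
        data_kb = atom.out_kb
    # Account for sending the final output back to the device if needed.
    finals = []
    for site in sites:
        cost, placement = best[site]
        transfer = 0.0 if site == "device" else data_kb / bandwidth_kb_per_ms
        finals.append((cost + transfer, placement))
    return min(finals, key=lambda c: c[0])


# Illustrative atoms with made-up latencies and tensor sizes.
atoms = [
    Atom("conv1", device_ms=10.0, edge_ms=2.0, out_kb=400.0),
    Atom("conv2", device_ms=20.0, edge_ms=3.0, out_kb=100.0),
    Atom("fc", device_ms=5.0, edge_ms=1.0, out_kb=1.0),
]

# Re-plan as the (assumed) bandwidth changes at runtime.
for bandwidth in (1.0, 20.0, 100.0):
    latency, placement = plan_offloading(atoms, 600.0, bandwidth)
    print(f"{bandwidth:6.1f} KB/ms -> {latency:.2f} ms, {placement}")
```

Because the per-atom costs are inputs, re-running the planner with fresh bandwidth or latency estimates yields a new plan without re-partitioning the model, which is the kind of runtime adaptation the abstract describes. A real deployment would also need to handle branching operator graphs and memory limits, which this sketch leaves out.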


Cited By

Yuan Q, Li Z (2025). Distributed Inference Models and Algorithms for Heterogeneous Edge Systems Using Deep Learning. Applied Sciences 15:3 (1097). Online publication date: 22-Jan-2025.
https://doi.org/10.3390/app15031097
Chen T, Bu Y, Zeng Y, Xie L, Lu S (2024). RegionFilter: Region-aware video filtering mechanism on resource-constrained edge nodes. Computer Networks 251 (110624). Online publication date: Sep-2024.
https://doi.org/10.1016/j.comnet.2024.110624

Index Terms

AdaMEC: Towards a Context-adaptive and Dynamically Combinable DNN Deployment Framework for Mobile Edge Computing
1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
      1. Cooperation and coordination
2. Human-centered computing
  1. Ubiquitous and mobile computing
    1. Ubiquitous and mobile computing systems and tools
    2. Ubiquitous and mobile computing theory, concepts and paradigms
      1. Ubiquitous computing

Information & Contributors

Information

Published In

ACM Transactions on Sensor Networks Volume 20, Issue 1

January 2024

717 pages

EISSN:1550-4867

DOI:10.1145/3618078

Editor:
Yunhao Liu
Tsinghua University, China

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 07 December 2023

Online AM: 30 October 2023

Accepted: 19 October 2023

Revised: 12 October 2023

Received: 29 November 2022

Published in TOSN Volume 20, Issue 1

Qualifiers

Research-article

Funding Sources

National Key R&D Program of China
National Science Fund for Distinguished Young Scholars
National Natural Science Foundation of China
