research-article

AdaMEC: Towards a Context-adaptive and Dynamically Combinable DNN Deployment Framework for Mobile Edge Computing

Published: 07 December 2023

    Abstract

    With the rapid development of deep learning, research on intelligent and interactive mobile applications (e.g., health monitoring, speech recognition) has attracted extensive attention. These applications call for mobile edge computing, i.e., offloading part of the computation from mobile devices to edge devices to accelerate inference and reduce transmission load. Current practice relies on collaborative DNN partitioning and offloading to satisfy predefined latency requirements, which makes it hard to adapt to a deployment context that changes at runtime. AdaMEC, a context-adaptive and dynamically combinable DNN deployment framework for mobile edge computing, is proposed to meet these requirements. It consists of three novel techniques. First, once-for-all DNN pre-partition divides the DNN at the primitive operator level and stores the partitioned modules as executable files, defined as pre-partitioned DNN atoms. Second, context-adaptive DNN atom combination and offloading introduces a graph-based decision algorithm that quickly searches for a suitable combination of atoms and adaptively makes the offloading plan under dynamic deployment contexts. Third, a runtime latency predictor provides timely latency feedback for DNN deployment, considering both DNN configurations and dynamic contexts. Extensive experiments demonstrate that AdaMEC outperforms state-of-the-art baselines, reducing latency by up to 62.14% and saving 55.21% of memory on average.
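
    To make the idea of combining pre-partitioned atoms into an offloading plan more concrete, the following is a minimal sketch and not the algorithm from the article. It assumes the atoms form a simple chain, that each atom has a known latency on the device and on the edge (in AdaMEC these would come from the runtime latency predictor), and that moving a tensor between the two sides costs its size divided by the current bandwidth. The `Atom` class, the `plan_offloading` function, and all numbers are hypothetical. The search is a shortest-path computation over a small layered graph whose nodes are (atom, placement) pairs.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    """Hypothetical pre-partitioned DNN atom with assumed latency figures."""
    name: str
    device_ms: float  # assumed compute latency on the mobile device
    edge_ms: float    # assumed compute latency on the edge device
    out_kb: float     # size of the tensor handed to the next atom


def plan_offloading(atoms, input_kb, bandwidth_kb_per_ms):
    """Return (total_ms, placements) for the cheapest device/edge assignment.

    Assumptions of this sketch: the input starts on the device, the final
    output must return to the device, and atoms execute strictly in order.
    """
    places = ("device", "edge")
    # best[p] = (latency so far, placement list) with the current tensor at p.
    best = {"device": (0.0, []), "edge": (float("inf"), [])}
    tensor_kb = input_kb
    for atom in atoms:
        transfer_ms = tensor_kb / bandwidth_kb_per_ms
        compute_ms = {"device": atom.device_ms, "edge": atom.edge_ms}
        new_best = {}
        for p in places:
            candidates = []
            for q in places:
                prev_ms, prev_plan = best[q]
                hop_ms = 0.0 if q == p else transfer_ms  # pay only when switching sides
                candidates.append((prev_ms + hop_ms + compute_ms[p], prev_plan + [p]))
            new_best[p] = min(candidates, key=lambda c: c[0])
        best = new_best
        tensor_kb = atom.out_kb
    # Bring the final output back to the device if the last atom ran on the edge.
    back_ms = tensor_kb / bandwidth_kb_per_ms
    finals = [best["device"], (best["edge"][0] + back_ms, best["edge"][1])]
    return min(finals, key=lambda c: c[0])


# Hypothetical three-atom model; the plan changes as the bandwidth changes.
atoms = [Atom("conv1", 10, 2, 100), Atom("conv2", 20, 3, 50), Atom("fc", 5, 1, 1)]
for bw in (1, 10, 100):
    total_ms, plan = plan_offloading(atoms, input_kb=200, bandwidth_kb_per_ms=bw)
    print(bw, round(total_ms, 2), plan)
```

    Re-running the planner whenever the measured bandwidth or the predicted latencies change is what makes such a plan context-adaptive in this sketch. The article's framework additionally chooses among combinations of atoms, which this chain model does not cover.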

    Cited By

    • (2024) RegionFilter: Region-aware video filtering mechanism on resource-constrained edge nodes. Computer Networks 251 (110624). DOI: 10.1016/j.comnet.2024.110624. Online publication date: Oct-2024

    Published In

    ACM Transactions on Sensor Networks, Volume 20, Issue 1
    January 2024
    717 pages
    ISSN: 1550-4859
    EISSN: 1550-4867
    DOI: 10.1145/3618078

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 07 December 2023
    Online AM: 30 October 2023
    Accepted: 19 October 2023
    Revised: 12 October 2023
    Received: 29 November 2022
    Published in TOSN Volume 20, Issue 1

    Author Tags

    1. Context-adaptive
    2. DNN combination and offloading
    3. DNN partition
    4. Edge intelligence

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Program of China
    • National Science Fund for Distinguished Young Scholars
    • National Natural Science Foundation of China

    Article Metrics

    • Downloads (last 12 months): 307
    • Downloads (last 6 weeks): 21
    Reflects downloads up to 26 Jul 2024
