research-article

Context-aware Adaptive Surgery: A Fast and Effective Framework for Adaptative Model Partition

Authors:

Bin Guo,

Zhiwen YuAuthors Info & Claims

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 5, Issue 3

Article No.: 131, Pages 1 - 22

https://doi.org/10.1145/3478073

Published: 14 September 2021 Publication History

Get Access

Abstract

Deep Neural Networks (DNNs) have made massive progress in many fields and deploying DNNs on end devices has become an emerging trend to make intelligence closer to users. However, it is challenging to deploy large-scale and computation-intensive DNNs on resource-constrained end devices due to their small size and lightweight. To this end, model partition, which aims to partition DNNs into multiple parts to realize the collaborative computing of multiple devices, has received extensive research attention. To find the optimal partition, most existing approaches need to run from scratch under given resource constraints. However, they ignore that resources of devices (e.g., storage, battery power), and performance requirements (e.g., inference latency), are often continuously changing, making the optimal partition solution change constantly during processing. Therefore, it is very important to reduce the tuning latency of model partition to realize the real-time adaption under the changing processing context. To address these problems, we propose the Context-aware Adaptive Surgery (CAS) framework to actively perceive the changing processing context, and adaptively find the appropriate partition solution in real-time. Specifically, we construct the partition state graph to comprehensively model different partition solutions of DNNs by import context resources. Then "the neighbor effect" is proposed, which provides the heuristic rule for the search process. When the processing context changes, CAS adopts the runtime search algorithm, Graph-based Adaptive DNN Surgery (GADS), to quickly find the appropriate partition that satisfies resource constraints under the guidance of the neighbor effect. The experimental results show that CAS realizes adaptively rapid tuning of the model partition solutions in 10ms scale even for large DNNs (2.25x to 221.7x search time improvement than the state-of-the-art researches), and the total inference latency still keeps the same level with baselines.

References

[1]

Bandar Almaslukh, Jalal AlMuhtadi, and Abdelmonim Artoli. 2017. An effective deep autoencoder approach for online smartphone-based human activity recognition. Int. J. Comput. Sci. Netw. Secur 17, 4 (2017), 160--165.

Abstract

References

Cited By

Index Terms

Recommendations

AdaMEC: Towards a Context-adaptive and Dynamically Combinable DNN Deployment Framework for Mobile Edge Computing

Context-aware aspects

Context modeling and inference system for heterogeneous context aware service

Comments

Information

Published In

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Get Access

Login options

Full Access

View options

PDF

eReader

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations