Pruning-Aware Merging for Efficient Multitask Inference

He, Xiaoxi; Gao, Dawei; Zhou, Zimu; Tong, Yongxin; Thiele, Lothar

arXiv:1905.09676 (cs)

[Submitted on 23 May 2019 (v1), last revised 28 May 2021 (this version, v2)]

Abstract: Many mobile applications demand selective execution of multiple correlated deep learning inference tasks on resource-constrained platforms. Given a set of deep neural networks, each pre-trained for a single task, it is desired that executing arbitrary combinations of tasks yields minimal computation cost. Pruning each network separately yields suboptimal computation cost due to task relatedness. A promising remedy is to merge the networks into a multitask network to eliminate redundancy across tasks before network pruning. However, pruning a multitask network combined by existing network merging schemes cannot minimise the computation cost of every task combination because they do not consider such a future pruning. To this end, we theoretically identify the conditions such that pruning a multitask network minimises the computation of all task combinations. On this basis, we propose Pruning-Aware Merging (PAM), a heuristic network merging scheme to construct a multitask network that approximates these conditions. The merged network is then ready to be further pruned by existing network pruning methods. Evaluations with different pruning schemes, datasets, and network architectures show that PAM achieves up to 4.87x less computation against the baseline without network merging, and up to 2.01x less computation against the baseline with a state-of-the-art network merging scheme.
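
The cost argument in the abstract can be illustrated with a toy accounting model. The sketch below is not the PAM algorithm and uses no numbers from the paper. It assumes that each network's cost is a single scalar, and that a merged network splits into task-specific units plus units shared by a group of tasks, with a shared group executed once whenever any of its tasks is requested. All names and cost values are hypothetical.

```python
from itertools import combinations

# Hypothetical costs of two separately pruned single-task networks.
SEPARATE = {"A": 10, "B": 8}

# Hypothetical merged network: task-specific units plus units shared by a task group.
SPECIFIC = {"A": 4, "B": 3}
SHARED = {frozenset({"A", "B"}): 5}


def cost_separate(combo):
    """Cost of running one independent network per requested task."""
    return sum(SEPARATE[task] for task in combo)


def cost_merged(combo):
    """Cost of running the merged network for the requested tasks.

    A shared group is counted once if at least one of its tasks is requested.
    """
    requested = set(combo)
    specific = sum(SPECIFIC[task] for task in requested)
    shared = sum(c for group, c in SHARED.items() if group & requested)
    return specific + shared


def compare_all(tasks):
    """Map every non-empty task combination to (separate cost, merged cost)."""
    result = {}
    for size in range(1, len(tasks) + 1):
        for combo in combinations(tasks, size):
            result[combo] = (cost_separate(combo), cost_merged(combo))
    return result


if __name__ == "__main__":
    for combo, (sep, mer) in compare_all(["A", "B"]).items():
        print(combo, "separate:", sep, "merged:", mer)
```

In this toy setting the saving comes from running the shared units once for the combination of both tasks. The abstract's point is stronger: how the networks are merged determines whether later pruning can lower the cost of every combination, including single tasks.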

Comments:	Accepted to KDD'21 as research track paper
Subjects:	Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1905.09676 [cs.LG]
	(or arXiv:1905.09676v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1905.09676

Submission history

From: Xiaoxi He
[v1] Thu, 23 May 2019 14:23:46 UTC (313 KB)
[v2] Fri, 28 May 2021 20:35:47 UTC (19,071 KB)
