Modularizing while Training: A New Paradigm for Modularizing DNN Models

Qi, Binhang; Sun, Hailong; Zhang, Hongyu; Zhao, Ruobing; Gao, Xiang

arXiv:2306.09376 (cs)

[Submitted on 15 Jun 2023 (v1), last revised 5 Oct 2023 (this version, v3)]

Abstract: Deep neural network (DNN) models have become increasingly crucial components of intelligent software systems. However, training a DNN model is typically expensive in terms of both time and money. To address this issue, researchers have recently focused on reusing existing DNN models, borrowing the idea of code reuse from software engineering. However, reusing an entire model can incur extra overhead or inherit weaknesses from undesired functionalities. Hence, existing work proposes decomposing an already trained model into modules, i.e., modularizing-after-training, to enable module reuse. Because trained models are not built for modularization, modularizing-after-training incurs huge overhead and a loss of model accuracy. In this paper, we propose a novel approach that incorporates modularization into the model training process, i.e., modularizing-while-training (MwT). We train a model to be structurally modular through two loss functions that optimize intra-module cohesion and inter-module coupling. We have implemented the proposed approach for modularizing Convolutional Neural Network (CNN) models in this work. The evaluation results on representative models demonstrate that MwT outperforms the state-of-the-art approach. Specifically, the accuracy loss caused by MwT is only 1.13 percentage points, which is 1.76 percentage points less than that of the baseline. The kernel retention rate of the modules generated by MwT is only 14.58%, a reduction of 74.31% compared with the state-of-the-art approach. Furthermore, the total time cost required for training and modularizing is only 108 minutes, half that of the baseline.
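The abstract does not spell out the two loss functions, so the following is only a rough sketch of what a cohesion/coupling objective could look like, not the authors' formulation. It assumes each training sample yields a soft mask over convolution kernels (values in [0, 1]), measures mask overlap with a Jaccard-style ratio, and rewards high overlap within a class (cohesion) and low overlap across classes (coupling). The names `soft_jaccard`, `cohesion_and_coupling`, `modular_loss`, and the weights `alpha` and `beta` are all hypothetical, and the code is plain Python for readability; a real training loop would need a differentiable tensor version.

```python
from itertools import combinations


def soft_jaccard(a, b):
    """Overlap of two soft kernel masks (assumed values in [0, 1])."""
    inter = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    return inter / union if union > 0 else 0.0


def cohesion_and_coupling(masks, labels):
    """Mean mask overlap for same-class pairs and for different-class pairs."""
    same, diff = [], []
    for i, j in combinations(range(len(masks)), 2):
        sim = soft_jaccard(masks[i], masks[j])
        (same if labels[i] == labels[j] else diff).append(sim)
    # With no pairs of a given kind, fall back to the value that adds no penalty.
    cohesion = sum(same) / len(same) if same else 1.0
    coupling = sum(diff) / len(diff) if diff else 0.0
    return cohesion, coupling


def modular_loss(task_loss, masks, labels, alpha=1.0, beta=1.0):
    """Hypothetical combined objective: task loss plus penalties for
    low intra-class cohesion and high inter-class coupling."""
    cohesion, coupling = cohesion_and_coupling(masks, labels)
    return task_loss + alpha * (1.0 - cohesion) + beta * coupling


# Two class-0 samples share kernels 0-1; the class-1 sample uses kernels 2-3.
masks = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
labels = [0, 0, 1]
print(cohesion_and_coupling(masks, labels))
print(modular_loss(0.5, masks, labels))
```

In this toy batch the masks are perfectly separated by class, so neither penalty term contributes and the combined loss equals the task loss; overlapping masks across classes would raise it.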

Comments:	Accepted at ICSE'24
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
Cite as:	arXiv:2306.09376 [cs.LG]
	(or arXiv:2306.09376v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.09376

Submission history

From: Binhang Qi
[v1] Thu, 15 Jun 2023 07:45:43 UTC (1,476 KB)
[v2] Sat, 23 Sep 2023 09:08:12 UTC (2,165 KB)
[v3] Thu, 5 Oct 2023 10:44:36 UTC (2,165 KB)
