Hessian Aware Low-Rank Perturbation for Order-Robust Continual Learning

Li, Jiaqi; Lai, Yuanhao; Wang, Rui; Shui, Changjian; Sahoo, Sabyasachi; Ling, Charles X.; Yang, Shichun; Wang, Boyu; Gagné, Christian; Zhou, Fan

doi:10.1109/TKDE.2024.3419449

arXiv:2311.15161 (cs)

[Submitted on 26 Nov 2023 (v1), last revised 21 Sep 2024 (this version, v5)]

Abstract: Continual learning aims to learn a series of tasks sequentially without forgetting the knowledge acquired from the previous ones. In this work, we propose the Hessian Aware Low-Rank Perturbation algorithm for continual learning. By modeling the parameter transitions along the sequential tasks with the weight matrix transformation, we propose to apply the low-rank approximation on the task-adaptive parameters in each layer of the neural networks. Specifically, we theoretically demonstrate the quantitative relationship between the Hessian and the proposed low-rank approximation. The approximation ranks are then globally determined according to the marginal increment of the empirical loss estimated by the layer-specific gradient and low-rank approximation error. Furthermore, we control the model capacity by pruning less important parameters to diminish the parameter growth. We conduct extensive experiments on various benchmarks, including a dataset with large-scale tasks, and compare our method against some recent state-of-the-art methods to demonstrate the effectiveness and scalability of our proposed method. Empirical results show that our method performs better on different benchmarks, especially in achieving task order robustness and handling the forgetting issue. The source code is at this https URL.
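
The rank-selection idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `low_rank_approx` and `allocate_ranks`, the `budget` parameter, and the scoring rule (layer gradient norm times singular value, used here as a stand-in for the estimated loss increment) are all assumptions made for illustration. The sketch only shows how a truncated SVD gives a low-rank approximation of a per-layer weight perturbation, and how ranks could be shared out across layers by a single global ranking.

```python
import numpy as np


def low_rank_approx(delta, rank):
    """Best rank-`rank` approximation of `delta` in Frobenius norm (truncated SVD)."""
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep only the leading `rank` singular triplets.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def allocate_ranks(deltas, grad_norms, budget):
    """Greedy global rank allocation across layers.

    Hypothetical proxy (an assumption, not the paper's criterion): each
    singular direction is scored by grad_norm * singular_value, and the
    `budget` highest-scoring directions over all layers are kept.
    """
    candidates = []
    for layer, (delta, g) in enumerate(zip(deltas, grad_norms)):
        singular_values = np.linalg.svd(delta, compute_uv=False)
        candidates.extend((g * sv, layer) for sv in singular_values)

    # Rank all directions from all layers together, highest score first.
    candidates.sort(key=lambda c: c[0], reverse=True)

    ranks = [0] * len(deltas)
    for _, layer in candidates[:budget]:
        ranks[layer] += 1
    return ranks


if __name__ == "__main__":
    # Two toy "layers" with known singular values.
    deltas = [np.diag([3.0, 2.0, 1.0]), np.diag([5.0, 0.5, 0.1])]
    grad_norms = [1.0, 2.0]
    ranks = allocate_ranks(deltas, grad_norms, budget=3)
    approximations = [low_rank_approx(d, r) for d, r in zip(deltas, ranks)]
    print(ranks)
```

Because singular values are sorted within each layer and the gradient norm is constant per layer, the global ranking always selects a leading block of directions in every layer, so the per-layer count is a valid truncation rank.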

Comments:	Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2311.15161 [cs.LG]
	(or arXiv:2311.15161v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2311.15161
Related DOI:	https://doi.org/10.1109/TKDE.2024.3419449

Submission history

From: Jiaqi Li
[v1] Sun, 26 Nov 2023 01:44:01 UTC (961 KB)
[v2] Thu, 4 Apr 2024 14:12:11 UTC (592 KB)
[v3] Mon, 24 Jun 2024 22:07:55 UTC (594 KB)
[v4] Sun, 7 Jul 2024 21:11:23 UTC (592 KB)
[v5] Sat, 21 Sep 2024 01:23:30 UTC (592 KB)
