Hyperparameters in Continual Learning: A Reality Check

Cha, Sungmin; Cho, Kyunghyun

arXiv:2403.09066 (cs)

[Submitted on 14 Mar 2024 (v1), last revised 11 Oct 2024 (this version, v3)]

Abstract: Continual learning (CL) aims to train a model on a sequence of tasks (i.e., a CL scenario) while balancing the trade-off between plasticity (learning new tasks) and stability (retaining prior knowledge). The conventional evaluation protocol for CL algorithms, which is also the most widely adopted one, selects the best hyperparameters within a given scenario and then evaluates the algorithms with those hyperparameters in the same scenario. This protocol has two significant shortcomings: it overestimates the CL capacity of algorithms, and it relies on unrealistic hyperparameter tuning that is not feasible in real-world applications. Starting from the fundamental principles of evaluation in machine learning, we argue that the evaluation of CL algorithms should assess how well their CL capacity generalizes to unseen scenarios. On this basis, we propose a revised two-phase evaluation protocol consisting of a hyperparameter tuning phase and an evaluation phase. The scenarios used in the two phases share the same configuration (e.g., number of tasks) but are generated from different datasets. Hyperparameters of CL algorithms are tuned in the first phase and applied in the second phase to evaluate the algorithms. We apply this protocol to class-incremental learning, both with and without pretrained models. Across more than 8,000 experiments, our results show that most state-of-the-art algorithms fail to replicate their reported performance, which indicates that their CL capacity has been significantly overestimated under the conventional evaluation protocol.
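The difference between the two protocols described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the helpers `make_scenario`, `grid`, `conventional_eval`, and `two_phase_eval`, the dataset names `dataset_A` and `dataset_B`, and the toy scoring function `toy_run`, which stands in for actually training a CL algorithm and returns a made-up score.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Scenario = List[str]                          # hypothetical: ordered task identifiers
Hparams = Dict[str, float]
RunFn = Callable[[Scenario, Hparams], float]  # trains on a scenario, returns a score


def make_scenario(dataset: str, num_tasks: int) -> Scenario:
    """Build a scenario with a fixed configuration (here only num_tasks)."""
    return [f"{dataset}/task{i}" for i in range(num_tasks)]


def grid(space: Dict[str, Sequence[float]]) -> List[Hparams]:
    """Enumerate every hyperparameter combination in the search space."""
    keys = sorted(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


def conventional_eval(run: RunFn, space: Dict[str, Sequence[float]],
                      dataset: str, num_tasks: int) -> Tuple[Hparams, float]:
    """Tune and report on the same scenario (the protocol the abstract criticizes)."""
    scenario = make_scenario(dataset, num_tasks)
    best = max(grid(space), key=lambda hp: run(scenario, hp))
    return best, run(scenario, best)


def two_phase_eval(run: RunFn, space: Dict[str, Sequence[float]],
                   tuning_dataset: str, eval_dataset: str,
                   num_tasks: int) -> Tuple[Hparams, float]:
    """Tune on one dataset, then evaluate the chosen hyperparameters on another."""
    if tuning_dataset == eval_dataset:
        raise ValueError("the two phases must be generated from different datasets")
    # Phase 1: hyperparameter tuning on a scenario built from the tuning dataset.
    tuning_scenario = make_scenario(tuning_dataset, num_tasks)
    best = max(grid(space), key=lambda hp: run(tuning_scenario, hp))
    # Phase 2: same scenario configuration, different dataset, no further tuning.
    eval_scenario = make_scenario(eval_dataset, num_tasks)
    return best, run(eval_scenario, best)


def toy_run(scenario: Scenario, hp: Hparams) -> float:
    """Toy stand-in for training: the ideal learning rate differs per dataset."""
    optimum = 0.1 if scenario[0].startswith("dataset_A") else 0.01
    return 1.0 - abs(hp["lr"] - optimum)


if __name__ == "__main__":
    space = {"lr": [0.01, 0.1]}
    print("conventional:", conventional_eval(toy_run, space, "dataset_B", 5))
    print("two-phase:   ", two_phase_eval(toy_run, space, "dataset_A", "dataset_B", 5))
```

In this toy setup the conventional protocol reports the score of the learning rate that happens to be ideal for the evaluation scenario itself, while the two-phase protocol reports a lower score because the learning rate was chosen on a different dataset. The gap is built into the toy scoring function; it only illustrates the kind of overestimation the abstract describes and says nothing about its actual size.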

Comments:	Preprint
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.09066 [cs.LG]
	(or arXiv:2403.09066v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2403.09066

Submission history

From: Sungmin Cha
[v1] Thu, 14 Mar 2024 03:13:01 UTC (551 KB)
[v2] Thu, 15 Aug 2024 21:07:45 UTC (3,467 KB)
[v3] Fri, 11 Oct 2024 22:44:23 UTC (3,326 KB)
