Abstract
Continual learning enables a learning system to adapt to evolving data distributions by sequentially acquiring knowledge from a series of tasks. Unsupervised lifelong learning refers to the ability to learn over time and retain previously acquired patterns without supervision. However, most prior methods in this field rely on supervised or reinforcement learning, which requires annotated data and limits their scalability to real-world applications where data is often biased and unannotated. To overcome these challenges, this work introduces Lifelong Learning gets better with MixUp and Unsupervised Continual Representation (LL-UCR), an approach that learns feature representations from unlabeled tasks and thereby eliminates the need for annotated data. Within the LL-UCR framework, two techniques are introduced: LL-MixUp, which mitigates catastrophic forgetting by interpolating samples from the current task with samples from previous tasks, and Dark Experience Replay (DER) (Buzzega et al., Adv Neural Inf Process Syst, 33, 15920–15930, 2020), adapted for unsupervised continual representation learning to align network logits across tasks. To overcome the buffer-size limitations of replay-based methods, the Retrospective Adversarial Replay (RAR) framework is incorporated to generate diverse replay samples. Through systematic analysis, we show that unsupervised visual representations are remarkably resilient to catastrophic forgetting and consistently outperform supervised methods in both accuracy and generalization to out-of-distribution tasks. Our qualitative analysis further reveals that LL-UCR induces a smoother loss landscape and learns meaningful feature representations. Extensive experiments on diverse datasets show that LL-UCR outperforms state-of-the-art supervised continual learning methods and the unsupervised LUMP method (Madaan et al., International conference on learning representations, 2022), effectively mitigating catastrophic forgetting.
References
Buzzega P, Boschini M, Porrello A, Abati D, Calderara S (2020) Dark experience for general continual learning: a strong, simple baseline. Adv Neural Inf Process Syst 33:15920–15930
Madaan D, Yoon J, Li Y, Liu Y, Hwang SJ (2022) Representational continuity for unsupervised continual learning. In: International conference on learning representations
Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Sun P, Zhang R, Jiang Y, Kong T, Xu C, Zhan W, Tomizuka M, Li L, Yuan Z, Wang C et al (2021) Sparse r-cnn: End-to-end object detection with learnable proposals. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14454–14463
Zheng S, Lu J, Zhao H, Zhu X, Luo Z, Wang Y, Fu Y, Feng J, Xiang T, Torr PH et al (2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6881–6890
Thrun S (1995) A lifelong learning perspective for mobile robot control. In: Intelligent robots and systems, Elsevier, pp 201–214
McCloskey M, Cohen NJ (1989) Catastrophic interference in connectionist networks: The sequential learning problem. Psychol Learn Motiv 24:109–165
Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2017) Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci 114(13):3521–3526
Zenke F, Poole B, Ganguli S (2017) Continual learning through synaptic intelligence. In: International conference on machine learning, PMLR, pp 3987–3995
Yoon J, Yang E, Lee J, Hwang SJ (2018) Lifelong learning with dynamically expandable networks. In: International conference on learning representations
He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9729–9738
Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, PMLR, pp 1597–1607
Chen X, He K (2021) Exploring simple siamese representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 15750–15758
Zbontar J, Jing L, Misra I, LeCun Y, Deny S (2021) Barlow twins: Self-supervised learning via redundancy reduction. In: International conference on machine learning, PMLR, pp 12310–12320
Kumari L, Wang S, Zhou T, Bilmes J (2022) Retrospective adversarial replay for continual learning. In: Advances in neural information processing systems
Rolnick D, Ahuja A, Schwarz J, Lillicrap T, Wayne G (2019) Experience replay for continual learning. In: Advances in neural information processing systems 32
Li Z, Hoiem D (2017) Learning without forgetting. IEEE Trans Pattern Anal Mach Intell 40(12):2935–2947
Schwarz J, Czarnecki W, Luketina J, Grabska-Barwinska A, Teh YW, Pascanu R, Hadsell R (2018) Progress & compress: A scalable framework for continual learning. In: International Conference on Machine Learning, PMLR, pp 4528–4537
Ahn H, Cha S, Lee D, Moon T (2019) Uncertainty-based continual learning with adaptive regularization. In: Advances in neural information processing systems 32
Huszár F (2018) Note on the quadratic penalties in elastic weight consolidation. Proc Natl Acad Sci 115(11):2496–2497
Rebuffi S-A, Kolesnikov A, Sperl G, Lampert CH (2017) iCaRL: Incremental classifier and representation learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2001–2010
Riemer M, Cases I, Ajemian R, Liu M, Rish I, Tu Y, Tesauro G (2019) Learning to learn without forgetting by maximizing transfer and minimizing interference. In: International conference on learning representations
Wang L, Zhang X, Yang K, Yu L, Li C, Hong L, Zhang S, Li Z, Zhong Y, Zhu J (2022) Memory replay with data compression for continual learning. In: International conference on learning representations
Aljundi R, Lin M, Goujaud B, Bengio Y (2019) Gradient based sample selection for online continual learning. In: Advances in neural information processing systems 32
Chaudhry A, Ranzato M, Rohrbach M, Elhoseiny M (2019) Efficient lifelong learning with a-gem. In: International conference on learning representations
Chaudhry A, Gordo A, Dokania P, Torr P, Lopez-Paz D (2021) Using hindsight to anchor past knowledge in continual learning. Proceedings of the AAAI conference on artificial intelligence 35:6993–7001
Rusu AA, Rabinowitz NC, Desjardins G, Soyer H, Kirkpatrick J, Kavukcuoglu K, Pascanu R, Hadsell R (2016) Progressive neural networks. arXiv preprint arXiv:1606.04671
Liu Y, Wu X, Bo Y, Zheng Z, Yin M (2023) Incremental learning without looking back: a neural connection relocation approach. Neural Comput Appl 35(19):14093–14107
Xu J, Zhu Z (2018) Reinforced continual learning. In: Advances in neural information processing systems 31
Grill J-B, Strub F, Altché F, Tallec C, Richemond P, Buchatskaya E, Doersch C, Avila Pires B, Guo Z, Gheshlaghi Azar M et al (2020) Bootstrap your own latent-a new approach to self-supervised learning. Advances in neural information processing systems 33:21271–21284
Lin Z, Wang Y, Lin H (2022) Continual contrastive learning for image classification. In: 2022 IEEE International conference on multimedia and expo (ICME), IEEE, pp 1–6
Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R (1993) Signature verification using a "siamese" time delay neural network. In: Advances in neural information processing systems 6
Pfeifer B, Holzinger A, Schimek MG (2022) Robust random forest-based all-relevant feature ranks for trustworthy ai. Stud Health Technol Inform 294:137–138
Huo J, Zyl TL (2023) Incremental class learning using variational autoencoders with similarity learning. Neural Comput Appl 1–16
Rao D, Visin F, Rusu A, Pascanu R, Teh YW, Hadsell R (2019) Continual unsupervised representation learning. In: Advances in neural information processing systems 32
Fini E, Da Costa VGT, Alameda-Pineda X, Ricci E, Alahari K, Mairal J (2022) Self-supervised models are continual learners. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9621–9630
Yu X, Rosing T, Guo Y (2024) Evolve: Enhancing unsupervised continual learning with multiple experts. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 2366–2377
Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2018) mixup: Beyond empirical risk minimization. In: International conference on learning representations
Zhang L, Deng Z, Kawaguchi K, Ghorbani A, Zou J (2021) How does mixup help with robustness and generalization? In: International conference on learning representations
Hinton G, Vinyals O, Dean J (2014) Dark knowledge. Presented as the keynote in BayLearn 2(2)
Krizhevsky A, Hinton G et al (2009) Learning multiple layers of features from tiny images
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition, IEEE, pp 248–255
Wu Z, Xiong Y, Yu SX, Lin D (2018) Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3733–3742
De Lange M, Aljundi R, Masana M, Parisot S, Jia X, Leonardis A, Slabaugh G, Tuytelaars T (2021) A continual learning survey: Defying forgetting in classification tasks. IEEE Trans Pattern Anal Mach Intell 44(7):3366–3385
Lopez-Paz D, Ranzato M (2017) Gradient episodic memory for continual learning. In: Advances in neural information processing systems 30
Yin H, Molchanov P, Alvarez JM, Li Z, Mallya A, Hoiem D, Jha NK, Kautz J (2020) Dreaming to distill: Data-free knowledge transfer via DeepInversion. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8715–8724
Kornblith S, Norouzi M, Lee H, Hinton G (2019) Similarity of neural network representations revisited. In: International conference on machine learning, PMLR, pp 3519–3529
Author information
Contributions
Prashant Kumar: Conceptualization, Methodology, Validation, Writing - original draft. Durga Toshniwal: Writing - review & editing.
Ethics declarations
Competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical and informed consent for data used
The datasets used in this work are named explicitly in the paper and are publicly available; no ethical approval or informed consent was required for their use.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kumar, P., Toshniwal, D. Lifelong learning gets better with MixUp and unsupervised continual representation. Appl Intell 54, 5235–5252 (2024). https://doi.org/10.1007/s10489-024-05434-w
DOI: https://doi.org/10.1007/s10489-024-05434-w