
Lifelong learning gets better with MixUp and unsupervised continual representation

Published in: Applied Intelligence

Abstract

Continual learning enables learning systems to adapt to evolving data distributions by sequentially acquiring knowledge from a series of tasks. Unsupervised lifelong learning refers to the ability to learn over time while retaining previously acquired patterns without supervision. However, prior methods in this field rely heavily on supervised or reinforcement learning, which requires annotated data and thereby limits their scalability in real-world applications where data is often biased and unannotated. To overcome these challenges, this work introduces a novel approach called Lifelong Learning gets better with MixUp and Unsupervised Continual Representation (LL-UCR). LL-UCR learns feature representations from unlabeled tasks, eliminating the need for annotated data. Within the LL-UCR framework, two techniques are introduced: LL-MixUp, which mitigates catastrophic forgetting by interpolating samples between current and previous tasks, and Dark Experience Replay (DER) (Buzzega et al., Adv Neural Inf Process Syst 33:15920–15930, 2020) adapted for UCR, which aligns network logits across tasks. To overcome buffer-size limitations in replay-based methods, the Retrospective Adversarial Replay (RAR) framework is incorporated, facilitating diverse replay sample generation. Through systematic analysis, we demonstrate that unsupervised visual representations exhibit remarkable resilience to catastrophic forgetting, consistently outperforming supervised methods in performance and generalization on out-of-distribution tasks. Furthermore, our qualitative analysis reveals that LL-UCR fosters a smoother loss landscape and acquires meaningful feature representations. Extensive experiments on diverse datasets validate the superior performance of LL-UCR compared to state-of-the-art supervised continual learning methods and the unsupervised LUMP method (Madaan et al., International Conference on Learning Representations, 2022), effectively mitigating catastrophic forgetting.
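To make the two mechanisms named in the abstract concrete, the following is a minimal PyTorch-style sketch of how a mixup interpolation between a current-task batch and replayed samples could be combined with a DER-style logit-alignment penalty in one training step. The function names, the Beta parameter `alpha`, the loss weight `beta`, and the `buffer.sample` interface are illustrative assumptions and not the paper's released implementation.

```python
# Minimal sketch (assumed PyTorch); not the authors' code.
import numpy as np
import torch
import torch.nn.functional as F

def ll_mixup_batch(x_cur, x_buf, alpha=0.4):
    """Interpolate a current-task batch with an equally sized replayed batch.

    A mixing coefficient is drawn from Beta(alpha, alpha), as in standard mixup;
    the interpolated batch replaces the raw current batch in the unsupervised
    representation loss.
    """
    lam = float(np.random.beta(alpha, alpha))
    return lam * x_cur + (1.0 - lam) * x_buf

def der_alignment_loss(model, x_buf, logits_buf):
    """DER-style regularizer: keep current outputs on buffered samples close
    to the logits recorded when those samples were stored."""
    return F.mse_loss(model(x_buf), logits_buf)

def training_step(model, ssl_loss, x_cur, buffer, beta=0.5):
    """One illustrative step: unsupervised loss on the mixed batch plus the
    logit-alignment penalty on replayed samples (hypothetical buffer API)."""
    x_buf, logits_buf = buffer.sample(len(x_cur))
    mixed = ll_mixup_batch(x_cur, x_buf)
    return ssl_loss(model, mixed) + beta * der_alignment_loss(model, x_buf, logits_buf)
```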


Data availability and access

We used the publicly available Split CIFAR-10 [42], Split CIFAR-100 [42], and Split Tiny-ImageNet [3] datasets. As these datasets are publicly accessible, no ethical approval or informed consent was required.

References

  1. Buzzega P, Boschini M, Porrello A, Abati D, Calderara S (2020) Dark experience for general continual learning: a strong, simple baseline. Adv Neural Inf Process Syst 33:15920–15930

  2. Madaan D, Yoon J, Li Y, Liu Y, Hwang SJ (2022) Representational continuity for unsupervised continual learning. In: International conference on learning representations

  3. Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90

  4. Sun P, Zhang R, Jiang Y, Kong T, Xu C, Zhan W, Tomizuka M, Li L, Yuan Z, Wang C et al (2021) Sparse r-cnn: End-to-end object detection with learnable proposals. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14454–14463

  5. Zheng S, Lu J, Zhao H, Zhu X, Luo Z, Wang Y, Fu Y, Feng J, Xiang T, Torr PH et al (2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6881–6890

  6. Delange M, Aljundi R, Masana M, Parisot S, Jia X, Leonardis A, Slabaugh G, Tuytelaars T (2021) A continual learning survey: Defying forgetting in classification tasks. IEEE Trans Pattern Anal Mach Intell

  7. Thrun S (1995) A lifelong learning perspective for mobile robot control. In: Intelligent robots and systems, Elsevier, pp 201–214

  8. McCloskey M, Cohen NJ (1989) Catastrophic interference in connectionist networks: The sequential learning problem. Psychology of Learning and Motivation 24:109–165

  9. Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2017) Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci 114(13):3521–3526

  10. Zenke F, Poole B, Ganguli S (2017) Continual learning through synaptic intelligence. In: International conference on machine learning, PMLR, pp 3987–3995

  11. Yoon J, Yang E, Lee J, Hwang SJ (2018) Lifelong learning with dynamically expandable networks. In: International conference on learning representations

  12. He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9729–9738

  13. Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, PMLR, pp 1597–1607

  14. Chen X, He K (2021) Exploring simple siamese representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 15750–15758

  15. Zbontar J, Jing L, Misra I, LeCun Y, Deny S (2021) Barlow twins: Self-supervised learning via redundancy reduction. In: International conference on machine learning, PMLR, pp 12310–12320

  16. Kumari L, Wang S, Zhou T, Bilmes J (2022) Retrospective adversarial replay for continual learning. In: Advances in neural information processing systems

  17. Rolnick D, Ahuja A, Schwarz J, Lillicrap T, Wayne G (2019) Experience replay for continual learning. In: Advances in neural information processing systems 32

  18. Li Z, Hoiem D (2017) Learning without forgetting. IEEE Trans Pattern Anal Machine Intell 40(12):2935–2947

  19. Schwarz J, Czarnecki W, Luketina J, Grabska-Barwinska A, Teh YW, Pascanu R, Hadsell R (2018) Progress & compress: A scalable framework for continual learning. In: International Conference on Machine Learning, PMLR, pp 4528–4537

  20. Ahn H, Cha S, Lee D, Moon T (2019) Uncertainty-based continual learning with adaptive regularization. In: Advances in neural information processing systems 32

  21. Huszár F (2018) Note on the quadratic penalties in elastic weight consolidation. Proc Natl Acad Sci 115(11):2496–2497

  22. Rebuffi S-A, Kolesnikov A, Sperl G, Lampert CH (2017) iCaRL: Incremental classifier and representation learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2001–2010

  23. Riemer M, Cases I, Ajemian R, Liu M, Rish I, Tu Y, Tesauro G (2019) Learning to learn without forgetting by maximizing transfer and minimizing interference. In: International conference on learning representations

  24. Wang L, Zhang X, Yang K, Yu L, Li C, Hong L, Zhang S, Li Z, Zhong Y, Zhu J (2022) Memory replay with data compression for continual learning. In: International conference on learning representations

  25. Aljundi R, Lin M, Goujaud B, Bengio Y (2019) Gradient based sample selection for online continual learning. In: Advances in neural information processing systems 32

  26. Chaudhry A, Ranzato M, Rohrbach M, Elhoseiny M (2019) Efficient lifelong learning with a-gem. In: International conference on learning representations

  27. Chaudhry A, Gordo A, Dokania P, Torr P, Lopez-Paz D (2021) Using hindsight to anchor past knowledge in continual learning. Proceedings of the AAAI conference on artificial intelligence 35:6993–7001

  28. Rusu AA, Rabinowitz NC, Desjardins G, Soyer H, Kirkpatrick J, Kavukcuoglu K, Pascanu R, Hadsell R (2016) Progressive neural networks. arXiv preprint arXiv:1606.04671

  29. Liu Y, Wu X, Bo Y, Zheng Z, Yin M (2023) Incremental learning without looking back: a neural connection relocation approach. Neural Comput Appl 35(19):14093–14107

  30. Xu J, Zhu Z (2018) Reinforced continual learning. In: Advances in neural information processing systems 31

  31. Grill J-B, Strub F, Altché F, Tallec C, Richemond P, Buchatskaya E, Doersch C, Avila Pires B, Guo Z, Gheshlaghi Azar M et al (2020) Bootstrap your own latent: a new approach to self-supervised learning. Adv Neural Inf Process Syst 33:21271–21284

  32. Lin Z, Wang Y, Lin H (2022) Continual contrastive learning for image classification. In: 2022 IEEE International conference on multimedia and expo (ICME), IEEE, pp 1–6

  33. Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R (1993) Signature verification using a "siamese" time delay neural network. In: Advances in neural information processing systems 6

  34. Pfeifer B, Holzinger A, Schimek MG (2022) Robust random forest-based all-relevant feature ranks for trustworthy ai. Stud Health Technol Inform 294:137–138

  35. Huo J, Zyl TL (2023) Incremental class learning using variational autoencoders with similarity learning. Neural Comput Appl 1–16

  36. Rao D, Visin F, Rusu A, Pascanu R, Teh YW, Hadsell R (2019) Continual unsupervised representation learning. In: Advances in neural information processing systems 32

  37. Fini E, Da Costa VGT, Alameda-Pineda X, Ricci E, Alahari K, Mairal J (2022) Self-supervised models are continual learners. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9621–9630

  38. Yu X, Rosing T, Guo Y (2024) Evolve: Enhancing unsupervised continual learning with multiple experts. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 2366–2377

  39. Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2018) mixup: Beyond empirical risk minimization. In: International conference on learning representations

  40. Zhang L, Deng Z, Kawaguchi K, Ghorbani A, Zou J (2021) How does mixup help with robustness and generalization? In: International conference on learning representations

  41. Hinton G, Vinyals O, Dean J (2014) Dark knowledge. Presented as the keynote in BayLearn 2(2)

  42. Krizhevsky A, Hinton G et al (2009) Learning multiple layers of features from tiny images

  43. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition, IEEE, pp 248–255

  44. Wu Z, Xiong Y, Yu SX, Lin D (2018) Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3733–3742

  45. De Lange M, Aljundi R, Masana M, Parisot S, Jia X, Leonardis A, Slabaugh G, Tuytelaars T (2021) A continual learning survey: Defying forgetting in classification tasks. IEEE Trans Pattern Anal Machine Intell 44(7):3366–3385

  46. Lopez-Paz D, Ranzato M (2017) Gradient episodic memory for continual learning. In: Advances in neural information processing systems 30

  47. Yin H, Molchanov P, Alvarez JM, Li Z, Mallya A, Hoiem D, Jha NK, Kautz J (2020) Dreaming to distill: Data-free knowledge transfer via deepinversion. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8715–8724

  48. Kornblith S, Norouzi M, Lee H, Hinton G (2019) Similarity of neural network representations revisited. In: International conference on machine learning, PMLR, pp 3519–3529

Author information

Authors and Affiliations

Authors

Contributions

Prashant Kumar: Conceptualization, Methodology, Validation, Writing - original draft. Durga Toshniwal: Writing - review & editing.

Corresponding author

Correspondence to Durga Toshniwal.

Ethics declarations

Competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical and informed consent for data used

The datasets used in this paper are named explicitly and are publicly available; no ethical approval or informed consent was required for their use.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumar, P., Toshniwal, D. Lifelong learning gets better with MixUp and unsupervised continual representation. Appl Intell 54, 5235–5252 (2024). https://doi.org/10.1007/s10489-024-05434-w
