Distilled Meta-learning for Multi-Class Incremental Learning

Published: 15 March 2023

Abstract

Meta-learning approaches have recently achieved promising performance in multi-class incremental learning. However, meta-learners still suffer from catastrophic forgetting: they tend to forget knowledge learned from old tasks while rapidly adapting to the new classes of the current task. To solve this problem, we propose a novel distilled meta-learning (DML) framework for multi-class incremental learning that seamlessly integrates meta-learning with knowledge distillation at each incremental stage. Specifically, during inner-loop training, knowledge distillation is incorporated into DML to overcome catastrophic forgetting. During outer-loop training, a meta-update rule is designed so that the meta-learner learns across tasks and quickly adapts to new ones. By virtue of this bilevel optimization, the model is encouraged to strike a balance between retaining old knowledge and learning new knowledge. Experimental results on four benchmark datasets demonstrate the effectiveness of our proposal and show that it significantly outperforms other state-of-the-art incremental learning methods.
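The inner/outer structure described above can be sketched compactly. The toy below is illustrative only, not the authors' DML implementation: the "model" is a bare parameter vector, a quadratic surrogate stands in for the KL-based distillation loss against the frozen old model, and the outer loop uses a first-order (Reptile-style) meta-update. All function names and hyperparameters are assumptions made for the sketch.

```python
def inner_adapt(theta, theta_old, task_grad, steps=5, lr=0.1, lam=0.5):
    """Inner loop: gradient steps on the new-task loss plus a distillation
    term. Here ||phi - theta_old||^2 stands in for the usual KL-based
    distillation loss against the frozen old model's outputs."""
    phi = list(theta)
    for _ in range(steps):
        g_new = task_grad(phi)  # gradients of the new-task loss
        phi = [p - lr * (g + lam * 2.0 * (p - po))
               for p, g, po in zip(phi, g_new, theta_old)]
    return phi

def meta_update(theta, phi, eps=0.3):
    """Outer loop: first-order (Reptile-style) meta-update moving the
    meta-parameters toward the task-adapted parameters."""
    return [t + eps * (p - t) for t, p in zip(theta, phi)]

# One incremental stage: the old model sits at 0.0, the new task pulls
# toward 1.0; with distillation active, adaptation settles in between.
theta_old = [0.0]
theta = [0.0]
new_task_grad = lambda phi: [2.0 * (p - 1.0) for p in phi]  # d/dp of (p - 1)^2
phi = inner_adapt(theta, theta_old, new_task_grad)
theta = meta_update(theta, phi)
```

With distillation weight `lam > 0`, the inner loop is drawn toward `1 / (1 + lam)` rather than the new-task optimum at 1.0, which is exactly the retention/plasticity trade-off the abstract describes; the outer loop then moves the meta-parameters only partway toward the adapted solution, accumulating knowledge across tasks.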



Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 4
July 2023, 263 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3582888
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 March 2023
Online AM: 17 January 2023
Accepted: 07 December 2022
Revised: 17 October 2022
Received: 08 June 2022
Published in TOMM Volume 19, Issue 4


Author Tags

  1. Incremental learning
  2. meta-learning
  3. knowledge distillation
  4. catastrophic forgetting
  5. stability–plasticity dilemma

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Natural Science Foundation of Jiangsu Province

Article Metrics

  • Downloads (last 12 months): 272
  • Downloads (last 6 weeks): 17
Reflects downloads up to 21 Sep 2024

Cited By

  • (2024) A Low-Density Parity-Check Coding Scheme for LoRa Networking. ACM Transactions on Sensor Networks 20, 4, 1–29. DOI: 10.1145/3665928. Online: 8 Jul 2024
  • (2024) Spatiotemporal Inconsistency Learning and Interactive Fusion for Deepfake Video Detection. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3664654. Online: 13 May 2024
  • (2024) Blind Quality Assessment of Dense 3D Point Clouds with Structure Guided Resampling. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 8, 1–21. DOI: 10.1145/3664199. Online: 13 Jun 2024
  • (2024) Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 8, 1–19. DOI: 10.1145/3663570. Online: 13 Jun 2024
  • (2024) Progressive Adapting and Pruning: Domain-Incremental Learning for Saliency Prediction. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 8, 1–243. DOI: 10.1145/3661312. Online: 13 Jun 2024
  • (2024) Dual Dynamic Threshold Adjustment Strategy. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 7, 1–18. DOI: 10.1145/3656047. Online: 15 May 2024
  • (2024) Backdoor Two-Stream Video Models on Federated Learning. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3651307. Online: 7 Mar 2024
  • (2024) Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 6, 1–18. DOI: 10.1145/3648368. Online: 26 Mar 2024
  • (2024) Head3D: Complete 3D Head Generation via Tri-plane Feature Distillation. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 6, 1–20. DOI: 10.1145/3635717. Online: 8 Mar 2024
  • (2024) Beyond the Prior Forgery Knowledge: Mining Critical Clues for General Face Forgery Detection. IEEE Transactions on Information Forensics and Security 19, 1168–1182. DOI: 10.1109/TIFS.2023.3332218. Online: 1 Jan 2024
