Distilled Meta-learning for Multi-Class Incremental Learning

Published: 15 March 2023

Abstract

Meta-learning approaches have recently achieved promising performance in multi-class incremental learning. However, meta-learners still suffer from catastrophic forgetting: they tend to forget knowledge learned from old tasks while rapidly adapting to the new classes of the current task. To solve this problem, we propose a novel distilled meta-learning (DML) framework for multi-class incremental learning that seamlessly integrates meta-learning with knowledge distillation at each incremental stage. Specifically, during inner-loop training, knowledge distillation is incorporated into DML to overcome catastrophic forgetting. During outer-loop training, a meta-update rule is designed so that the meta-learner learns across tasks and quickly adapts to new ones. By virtue of this bilevel optimization, the model is encouraged to strike a balance between retaining old knowledge and learning new knowledge. Experimental results on four benchmark datasets demonstrate the effectiveness of our proposal and show that it significantly outperforms other state-of-the-art incremental learning methods.
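The inner/outer structure described above can be sketched compactly. The toy below is illustrative only, not the authors' DML implementation: the "model" is a bare parameter vector, a quadratic surrogate stands in for the KL-based distillation loss against the frozen old model, and the outer loop uses a first-order (Reptile-style) meta-update. All function names and hyperparameters are assumptions made for the sketch.

```python
def inner_adapt(theta, theta_old, task_grad, steps=5, lr=0.1, lam=0.5):
    """Inner loop: gradient steps on the new-task loss plus a distillation
    term. Here ||phi - theta_old||^2 stands in for the usual KL-based
    distillation loss against the frozen old model's outputs."""
    phi = list(theta)
    for _ in range(steps):
        g_new = task_grad(phi)  # gradients of the new-task loss
        phi = [p - lr * (g + lam * 2.0 * (p - po))
               for p, g, po in zip(phi, g_new, theta_old)]
    return phi

def meta_update(theta, phi, eps=0.3):
    """Outer loop: first-order (Reptile-style) meta-update moving the
    meta-parameters toward the task-adapted parameters."""
    return [t + eps * (p - t) for t, p in zip(theta, phi)]

# One incremental stage: the old model sits at 0.0, the new task pulls
# toward 1.0; with distillation active, adaptation settles in between.
theta_old = [0.0]
theta = [0.0]
new_task_grad = lambda phi: [2.0 * (p - 1.0) for p in phi]  # d/dp of (p - 1)^2
phi = inner_adapt(theta, theta_old, new_task_grad)
theta = meta_update(theta, phi)
```

With distillation weight `lam > 0`, the inner loop is drawn toward `1 / (1 + lam)` rather than the new-task optimum at 1.0, which is exactly the retention/plasticity trade-off the abstract describes; the outer loop then moves the meta-parameters only partway toward the adapted solution, accumulating knowledge across tasks.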



Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 4
July 2023, 263 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3582888
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 March 2023
Online AM: 17 January 2023
Accepted: 07 December 2022
Revised: 17 October 2022
Received: 08 June 2022
Published in TOMM Volume 19, Issue 4


Author Tags

  1. Incremental learning
  2. meta-learning
  3. knowledge distillation
  4. catastrophic forgetting
  5. stability–plasticity dilemma

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Natural Science Foundation of Jiangsu Province

Article Metrics

  • Downloads (last 12 months): 272
  • Downloads (last 6 weeks): 17
Reflects downloads up to 21 Sep 2024

Cited By

  • (2024) A Low-Density Parity-Check Coding Scheme for LoRa Networking. ACM Transactions on Sensor Networks 20, 4, 1–29. DOI: 10.1145/3665928. Online: 8 Jul 2024
  • (2024) Spatiotemporal Inconsistency Learning and Interactive Fusion for Deepfake Video Detection. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3664654. Online: 13 May 2024
  • (2024) Blind Quality Assessment of Dense 3D Point Clouds with Structure Guided Resampling. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 8, 1–21. DOI: 10.1145/3664199. Online: 13 Jun 2024
  • (2024) Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 8, 1–19. DOI: 10.1145/3663570. Online: 13 Jun 2024
  • (2024) Progressive Adapting and Pruning: Domain-Incremental Learning for Saliency Prediction. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 8, 1–243. DOI: 10.1145/3661312. Online: 13 Jun 2024
  • (2024) Dual Dynamic Threshold Adjustment Strategy. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 7, 1–18. DOI: 10.1145/3656047. Online: 15 May 2024
  • (2024) Backdoor Two-Stream Video Models on Federated Learning. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3651307. Online: 7 Mar 2024
  • (2024) Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 6, 1–18. DOI: 10.1145/3648368. Online: 26 Mar 2024
  • (2024) Head3D: Complete 3D Head Generation via Tri-plane Feature Distillation. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 6, 1–20. DOI: 10.1145/3635717. Online: 8 Mar 2024
  • (2024) Beyond the Prior Forgery Knowledge: Mining Critical Clues for General Face Forgery Detection. IEEE Transactions on Information Forensics and Security 19, 1168–1182. DOI: 10.1109/TIFS.2023.3332218. Online: 1 Jan 2024
