Gradient-Semantic Compensation for Incremental Semantic Segmentation

Published: 24 November 2023

Abstract

Incremental semantic segmentation focuses on continually learning to segment newly arriving classes without access to the training data of previously seen classes. However, most current methods fail to tackle catastrophic forgetting and background shift because they 1) treat all previous classes equally, ignoring the different forgetting paces caused by imbalanced gradient back-propagation, and 2) lack strong semantic guidance between classes. In this paper, to address these challenges, we propose a Gradient-Semantic Compensation (GSC) model, which tackles incremental semantic segmentation from both the gradient and the semantic perspective. Specifically, to handle catastrophic forgetting from the gradient aspect, we develop a step-aware gradient compensation that balances the forgetting paces of previously seen classes by re-weighting gradient back-propagation. Meanwhile, we propose a soft-sharp semantic relation distillation that distills consistent inter-class semantic relations via soft labels, alleviating catastrophic forgetting from the semantic aspect. In addition, we design a prototypical pseudo re-labeling strategy that provides strong semantic guidance to mitigate background shift: it produces high-quality pseudo labels for background pixels belonging to previous classes by assessing the distances of pixels to class-wise prototypes. Experiments on three public segmentation datasets provide strong evidence for the effectiveness of the proposed GSC model.
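
The following is a minimal PyTorch-style sketch of the prototypical pseudo re-labeling idea described in the abstract, not the authors' implementation. It assumes class-wise prototypes (mean features of previously seen classes, extracted by the old model) are already available; the function name and the distance threshold `dist_thresh` are illustrative.

    import torch

    def prototypical_pseudo_relabel(features, labels, prototypes,
                                    bg_index=0, dist_thresh=1.0):
        """Assign pseudo labels to background pixels that lie close to an
        old-class prototype (illustrative sketch).

        features:   (C, H, W) pixel embeddings from the frozen old model.
        labels:     (H, W) current-step ground truth; pixels of previously
                    seen classes are collapsed into `bg_index`.
        prototypes: (K, C) mean embeddings of the K previously seen classes;
                    row k is assumed to correspond to class id k + 1.
        """
        C, H, W = features.shape
        flat = features.permute(1, 2, 0).reshape(-1, C)   # (H*W, C) pixel features
        dists = torch.cdist(flat, prototypes)             # (H*W, K) pixel-to-prototype distances
        min_dist, nearest = dists.min(dim=1)               # nearest prototype per pixel

        pseudo = labels.clone().reshape(-1)
        is_bg = pseudo == bg_index
        confident = min_dist < dist_thresh                  # keep only confident assignments
        pseudo[is_bg & confident] = nearest[is_bg & confident] + 1
        return pseudo.reshape(H, W)

The fixed threshold here is only a stand-in for whatever confidence criterion the full method uses; the sketch just illustrates re-labeling background pixels by their nearest class-wise prototype.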


Information

Published In

IEEE Transactions on Multimedia, Volume 26, 2024

Publisher

IEEE Press

Publication History

Published: 24 November 2023

Qualifiers

  • Research-article
