Gradient-Semantic Compensation for Incremental Semantic Segmentation

Published: 24 November 2023

Abstract

Incremental semantic segmentation focuses on continually learning to segment newly arriving classes without access to the training data of previously seen classes. However, most current methods fail to tackle catastrophic forgetting and background shift because they 1) treat all previous classes equally, ignoring the different forgetting paces caused by imbalanced gradient back-propagation, and 2) lack strong semantic guidance between classes. In this paper, to address these challenges, we propose a Gradient-Semantic Compensation (GSC) model, which tackles incremental semantic segmentation from both the gradient and the semantic perspective. Specifically, to handle catastrophic forgetting from the gradient aspect, we develop a step-aware gradient compensation that balances the forgetting paces of previously seen classes by re-weighting gradient back-propagation. Meanwhile, we propose a soft-sharp semantic relation distillation that distills consistent inter-class semantic relations via soft labels, alleviating catastrophic forgetting from the semantic aspect. In addition, we design a prototypical pseudo re-labeling strategy that provides strong semantic guidance to mitigate background shift: it produces high-quality pseudo labels for background pixels belonging to previous classes by assessing the distances of pixels to class-wise prototypes. Experiments on three public segmentation datasets provide strong evidence for the effectiveness of the proposed GSC model.
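
The following is a minimal PyTorch-style sketch of the prototypical pseudo re-labeling idea described in the abstract, not the authors' implementation. It assumes class-wise prototypes (mean features of previously seen classes, extracted by the old model) are already available; the function name and the distance threshold `dist_thresh` are illustrative.

    import torch

    def prototypical_pseudo_relabel(features, labels, prototypes,
                                    bg_index=0, dist_thresh=1.0):
        """Assign pseudo labels to background pixels that lie close to an
        old-class prototype (illustrative sketch).

        features:   (C, H, W) pixel embeddings from the frozen old model.
        labels:     (H, W) current-step ground truth; pixels of previously
                    seen classes are collapsed into `bg_index`.
        prototypes: (K, C) mean embeddings of the K previously seen classes;
                    row k is assumed to correspond to class id k + 1.
        """
        C, H, W = features.shape
        flat = features.permute(1, 2, 0).reshape(-1, C)   # (H*W, C) pixel features
        dists = torch.cdist(flat, prototypes)             # (H*W, K) pixel-to-prototype distances
        min_dist, nearest = dists.min(dim=1)               # nearest prototype per pixel

        pseudo = labels.clone().reshape(-1)
        is_bg = pseudo == bg_index
        confident = min_dist < dist_thresh                  # keep only confident assignments
        pseudo[is_bg & confident] = nearest[is_bg & confident] + 1
        return pseudo.reshape(H, W)

The fixed threshold here is only a stand-in for whatever confidence criterion the full method uses; the sketch just illustrates re-labeling background pixels by their nearest class-wise prototype.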


Information

Published In

IEEE Transactions on Multimedia, Volume 26, 2024

Publisher

IEEE Press

Publication History

Published: 24 November 2023

Qualifiers

  • Research-article
