Where and How to Transfer: Knowledge Aggregation-Induced Transferability Perception for Unsupervised Domain Adaptation

Published: 01 March 2024

Abstract

Unsupervised domain adaptation, which requires no expensive annotation of target data, has achieved remarkable success in semantic segmentation. However, most existing state-of-the-art methods cannot determine whether semantic representations across domains are transferable, which may result in negative transfer caused by irrelevant knowledge. To tackle this challenge, in this paper, we develop a novel Knowledge Aggregation-induced Transferability Perception (KATP) module for unsupervised domain adaptation, which is a pioneering attempt to distinguish transferable from untransferable knowledge across domains. Specifically, the KATP module is designed to quantify which semantic knowledge across domains is transferable by incorporating transferability information propagated from constructed global category-wise prototypes. Based on KATP, we design a novel KATP Adaptation Network (KATPAN) to determine where and how to transfer. KATPAN contains a transferable appearance translation module $\mathcal{T}_A(\cdot)$ and a transferable representation augmentation module $\mathcal{T}_R(\cdot)$, where both modules construct a virtuous circle of performance promotion. $\mathcal{T}_A(\cdot)$ develops a transferability-aware information bottleneck to highlight where to adapt transferable visual characterizations and modality information; $\mathcal{T}_R(\cdot)$ explores how to augment transferable representations while abandoning untransferable information, and in return promotes the translation performance of $\mathcal{T}_A(\cdot)$.
Comprehensive experiments on several representative benchmark datasets and a medical dataset demonstrate the state-of-the-art performance of our model.
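
To make the prototype mechanism concrete, below is a minimal, hypothetical sketch (in PyTorch; not the authors' released code) of how global category-wise prototypes could yield per-class transferability weights. The cosine-similarity scoring, the temperature tau, and the use of pseudo-labels on the target domain are illustrative assumptions rather than details taken from the paper.

import torch
import torch.nn.functional as F

def category_prototypes(features, labels, num_classes):
    # Mean feature per class: a simple stand-in for the paper's
    # global category-wise prototypes.
    # features: (N, C, H, W) feature maps; labels: (N, H, W) class indices.
    n, c, h, w = features.shape
    feats = features.permute(0, 2, 3, 1).reshape(-1, c)    # (N*H*W, C)
    labs = labels.reshape(-1)                              # (N*H*W,)
    protos = torch.zeros(num_classes, c, device=features.device)
    for k in range(num_classes):
        mask = labs == k
        if mask.any():
            protos[k] = feats[mask].mean(dim=0)
    return protos

def transferability_scores(src_protos, tgt_protos, tau=0.1):
    # Per-class transferability: cosine similarity between source and target
    # prototypes, squashed into (0, 1). Classes whose prototypes disagree
    # across domains get low weight (candidate "untransferable" knowledge).
    sim = F.cosine_similarity(src_protos, tgt_protos, dim=1)  # (K,)
    return torch.sigmoid(sim / tau)

# Demo with random tensors; in practice the target labels would be
# pseudo-labels predicted by the segmentation network.
K = 19  # e.g., the Cityscapes label set
src_feat = torch.randn(2, 64, 32, 64); src_lab = torch.randint(0, K, (2, 32, 64))
tgt_feat = torch.randn(2, 64, 32, 64); tgt_lab = torch.randint(0, K, (2, 32, 64))
w = transferability_scores(category_prototypes(src_feat, src_lab, K),
                           category_prototypes(tgt_feat, tgt_lab, K))
print(w.shape)  # torch.Size([19]) -- one weight per class

Such weights could, for example, rescale a per-class adaptation loss so that classes with inconsistent prototypes contribute less, which is one plausible reading of augmenting transferable representations while abandoning untransferable information.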

          Information & Contributors

          Information

          Published In

          cover image IEEE Transactions on Pattern Analysis and Machine Intelligence
          IEEE Transactions on Pattern Analysis and Machine Intelligence  Volume 46, Issue 3
          March 2024
          579 pages

          Publisher

          IEEE Computer Society

          United States

          Publication History

          Published: 01 March 2024

