Triple Feature Disentanglement for One-Stage Adaptive Object Detection

Authors

  • Haoan Wang, East China Normal University
  • Shilong Jia, East China Normal University
  • Tieyong Zeng, The Chinese University of Hong Kong
  • Guixu Zhang, East China Normal University
  • Zhi Li, East China Normal University

DOI:

https://doi.org/10.1609/aaai.v38i6.28348

Keywords:

CV: Object Detection & Categorization; ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

Unsupervised domain adaptation has proven instrumental in recent advances in Domain Adaptive Object Detection (DAOD). These methods improve detection in unlabeled target domains by mitigating the distribution differences between source and target domains. A subset of DAOD methods employs disentangled learning to separate Domain-Specific Representations (DSR) from Domain-Invariant Representations (DIR), with final predictions relying on the latter. Current disentanglement practices, however, often leave residual domain-specific information in the DIR. To address this, we introduce the Multi-level Disentanglement Module (MDM), which progressively disentangles DIR to make the disentanglement more comprehensive. Additionally, our proposed Cyclic Disentanglement Module (CDM) facilitates DSR separation. To refine the process further, we employ the Categorical Features Disentanglement Module (CFDM) to isolate DIR and DSR, coupled with category alignment across scales for improved source-target domain alignment. Given its practical suitability, our model is built on the Single Shot MultiBox Detector (SSD), a one-stage object detection framework. Experiments demonstrate the effectiveness of our method, which achieves state-of-the-art performance on three benchmark datasets.

Published

2024-03-24

How to Cite

Wang, H., Jia, S., Zeng, T., Zhang, G., & Li, Z. (2024). Triple Feature Disentanglement for One-Stage Adaptive Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 5401-5409. https://doi.org/10.1609/aaai.v38i6.28348

Issue

Vol. 38 No. 6 (2024)

Section

AAAI Technical Track on Computer Vision V