
Hierarchical contrastive adaptation for cross-domain object detection

  • Original Paper
  • Journal: Machine Vision and Applications

Abstract

Object detection based on deep learning has advanced enormously in recent years. However, applying detectors trained on a label-rich domain to an unseen domain causes a performance drop due to domain shift. To address this problem, we propose a novel unsupervised domain adaptation method that adapts from a labeled source domain to an unlabeled target domain. Recent approaches based on adversarial learning align the feature distributions of different domains to some extent, but when training only with source labels and aligning the entire feature distribution, the decision boundary becomes strongly source-biased for the complex detection task. In this paper, we utilize image translation to generate translated versions of the source and target domains, filling in the large domain gap and enabling a paired adaptation. We propose a hierarchical contrastive adaptation method between the original and translated domains that encourages the detector to learn domain-invariant yet discriminative features. To emphasize foreground instances and suppress the noise in translated images, we further propose foreground attention reweighting for instance-aware adaptation. Experiments on three cross-domain detection scenarios show that our method achieves state-of-the-art results compared with other approaches, demonstrating its effectiveness.
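The contrastive adaptation described above pairs each original image with its translated counterpart and pulls their features together while pushing mismatched pairings apart. A minimal sketch of such a paired contrastive objective, assuming an InfoNCE-style formulation over batch features; the function name is hypothetical, and the paper's exact hierarchical loss and foreground reweighting are not reproduced here:

```python
import numpy as np

def info_nce_loss(feats_a, feats_b, temperature=0.1):
    """InfoNCE-style contrastive loss between paired feature sets.

    feats_a[i] and feats_b[i] form a positive pair (e.g. features of an
    original image and its translated counterpart); all other cross
    pairings in the batch serve as negatives. Shapes: (N, D).
    """
    # L2-normalize so the dot product is cosine similarity
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature               # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives lie on the diagonal; minimize their negative log-likelihood
    return -np.mean(np.diag(log_prob))
```

With correctly matched pairs the diagonal similarities dominate and the loss is small; shuffling one side destroys the pairing and the loss rises, which is the signal that drives the features of the two domains together.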



Author information


Corresponding author

Correspondence to Ziwei Deng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Deng, Z., Kong, Q., Akira, N. et al. Hierarchical contrastive adaptation for cross-domain object detection. Machine Vision and Applications 33, 62 (2022). https://doi.org/10.1007/s00138-022-01317-7


  • DOI: https://doi.org/10.1007/s00138-022-01317-7
