
DoubleAUG: Single-domain Generalized Object Detector in Urban via Color Perturbation and Dual-style Memory

Published: 11 January 2024

    Abstract

    Object detection in urban scenarios is crucial for autonomous driving in intelligent traffic systems. However, unlike conventional object detection tasks, urban-scene images vary greatly in style. For example, images taken on sunny days differ significantly from those taken on rainy days. Therefore, models trained on sunny-day images may not generalize well to rainy-day images. In this article, we aim to solve the single-domain generalizable object detection task in urban scenarios, meaning that a model trained on images from one weather condition should be able to perform well on images from any other weather condition. To address this challenge, we propose a novel Double AUGmentation (DoubleAUG) method that includes image- and feature-level augmentation schemes. In the image-level augmentation, we consider the variation in color information across different weather conditions and propose a Color Perturbation (CP) method that randomly exchanges the RGB channels to generate various images. In the feature-level augmentation, we propose to utilize a Dual-Style Memory (DSM) to explore the diverse style information across the entire dataset, further enhancing the model’s generalization capability. Extensive experiments demonstrate that our proposed method outperforms state-of-the-art methods. Furthermore, ablation studies confirm the effectiveness of each module in our proposed method. Moreover, our method is plug-and-play and can be integrated into existing methods to further improve model performance.
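
    The channel-exchange idea behind Color Perturbation can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `color_perturbation`, the H x W x 3 array layout, and the choice to sample a permutation uniformly (which sometimes leaves the image unchanged) are assumptions made for illustration, since the abstract states only that the RGB channels are randomly exchanged.

```python
import numpy as np


def color_perturbation(image, rng=None):
    """Return `image` with its three color channels in a random order.

    Hypothetical helper for illustration only. `image` is assumed to be
    an H x W x 3 array. The sampled channel order is returned as well so
    that the perturbation can be inspected or reproduced.
    """
    if rng is None:
        rng = np.random.default_rng()
    if image.ndim != 3 or image.shape[-1] != 3:
        raise ValueError("expected an H x W x 3 image")
    # Assumption: every ordering of (R, G, B) is equally likely,
    # including the identity ordering.
    order = rng.permutation(3)
    # Fancy indexing on the last axis reorders channels and copies data.
    return image[..., order], order


if __name__ == "__main__":
    demo = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)
    perturbed, order = color_perturbation(demo, np.random.default_rng(0))
    print("channel order:", order)
    print("shape preserved:", perturbed.shape == demo.shape)
```

    Because a channel permutation changes only pixel colors and not pixel positions, bounding-box annotations can be reused for the perturbed image without adjustment.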

      Published In

      ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 20, Issue 5
      May 2024
      650 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/3613634
      Editor: Abdulmotaleb El Saddik

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 11 January 2024
      Online AM: 18 December 2023
      Accepted: 21 November 2023
      Revised: 14 October 2023
      Received: 16 March 2023
      Published in TOMM Volume 20, Issue 5

      Author Tags

      1. Single-domain generalization
      2. Object detection
      3. DoubleAUG

      Qualifiers

      • Research-article

      Funding Sources

      • NSFC Program
      • Jiangsu Natural Science Foundation Project
