Scene Semantic Guidance for Object Detection

Liu, Zhuo; Xie, Xuemei; Li, Xuyang

doi:10.1007/978-3-030-88004-0_29

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13019)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

In the real world, objects often have ambiguous or indistinct appearances, but they tend to co-vary with other objects and with particular scenes. In this paper, we exploit scene context to provide global semantic guidance for object detection. We present a simple but effective Scene Semantic Guidance (SSG) framework, which can be applied as a plug-and-play component to improve the classification ability of detectors. Specifically, to explicitly model scene semantic context, we propose a scene semantic embedding module that leverages an auxiliary multi-label classification task to learn object-level scene concepts. Further, to adaptively incorporate the scene semantic context into the object features, we propose a semantic consistency guidance module that strengthens the discriminative power of the object features. Comprehensive experiments on the MS-COCO benchmark demonstrate that the proposed SSG framework is effective and generalizable, yielding consistent improvements over typical detectors, including Faster R-CNN, RetinaNet, and FCOS.
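The two modules named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration. The image-level multi-label target is assumed to be the set of categories that have at least one annotated box. The auxiliary loss is assumed to be plain binary cross-entropy. The function `guide_object_feature` is a hypothetical stand-in for the semantic consistency guidance module, using a simple channel-wise sigmoid gate with a residual connection. All function names are invented here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multi_hot_target(box_labels, num_classes):
    """Assumed auxiliary target: 1.0 for every class with at least one box."""
    target = [0.0] * num_classes
    for label in box_labels:
        target[label] = 1.0
    return target


def bce_with_logits(logits, target):
    """Mean binary cross-entropy over classes, computed from raw logits."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, target):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(logits)


def guide_object_feature(obj_feat, scene_emb):
    """Hypothetical fusion: gate each channel by the scene embedding,
    then add the gated feature back to the original (residual)."""
    return [f + f * sigmoid(s) for f, s in zip(obj_feat, scene_emb)]


# Usage: an image with boxes of classes 2, 0 and 2, out of 4 classes.
target = multi_hot_target([2, 0, 2], num_classes=4)
aux_loss = bce_with_logits([0.0, 0.0, 0.0, 0.0], target)
guided = guide_object_feature([1.0, -2.0], [0.0, 0.0])
```

In a real detector the auxiliary loss would presumably be added to the detection loss with some weight, and the gate would be learned rather than fixed. Neither detail is specified in the abstract.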

Author information

Authors and Affiliations

School of Artificial Intelligence, Xidian University, Xi’an, 710071, China
Zhuo Liu, Xuemei Xie & Xuyang Li

Corresponding author

Correspondence to Xuemei Xie.

Editor information

Editors and Affiliations

University of Science and Technology Beijing, Beijing, China
Huimin Ma
Chinese Academy of Sciences, Beijing, China
Liang Wang
Tsinghua University, Beijing, China
Changshui Zhang
Zhejiang University, Hangzhou, China
Fei Wu
Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Hunan University, Changsha, China
Yaonan Wang
Sun Yat-Sen University, Guangzhou, Guangdong, China
Jianhuang Lai
Beijing Jiaotong University, Beijing, China
Yao Zhao

About this paper

Cite this paper

Liu, Z., Xie, X., Li, X. (2021). Scene Semantic Guidance for Object Detection. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol 13019. Springer, Cham. https://doi.org/10.1007/978-3-030-88004-0_29

DOI: https://doi.org/10.1007/978-3-030-88004-0_29
Published: 22 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88003-3
Online ISBN: 978-3-030-88004-0
eBook Packages: Computer Science, Computer Science (R0)
