Abstract
In the real world, objects often have ambiguous or indistinct appearances, but they tend to co-vary with other objects and with particular scenes. In this paper, we exploit scene context to provide global semantic guidance for object detection. We present a simple but effective Scene Semantic Guidance (SSG) framework, which can be applied as a plug-and-play component to improve the classification ability of detectors. Specifically, to explicitly model scene semantic context, we propose a scene semantic embedding module that leverages an auxiliary multi-label classification task to learn object-level scene concepts. Further, to adaptively incorporate the scene semantic context into the object features, we propose a semantic consistency guidance module that strengthens the discriminability of the object features. Comprehensive experiments on the MS-COCO benchmark demonstrate that the proposed SSG framework is effective and generalizable, leading to consistent improvements over typical detectors, including Faster R-CNN, RetinaNet, and FCOS.
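As a rough illustration of the auxiliary multi-label classification task mentioned above, the sketch below builds an image-level multi-hot target from the category ids of the annotated objects and scores a set of class logits with a sigmoid binary cross-entropy. The abstract does not specify the loss or the target construction, so both are assumptions here, and the function names `multi_hot_target` and `multi_label_bce` are hypothetical.

```python
import math

def multi_hot_target(category_ids, num_classes):
    """Image-level target: 1.0 for every class with at least one annotated object."""
    target = [0.0] * num_classes
    for cid in category_ids:
        target[cid] = 1.0
    return target

def multi_label_bce(logits, target):
    """Mean sigmoid binary cross-entropy over classes (assumed loss, not from the paper).

    Uses the numerically stable form max(z, 0) - z * t + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, t in zip(logits, target):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Hypothetical image with three annotated objects: two of class 0, one of class 3.
target = multi_hot_target([0, 0, 3], num_classes=5)

# Logits that agree with the target give a lower loss than logits that contradict it.
loss_good = multi_label_bce([4.0, -4.0, -4.0, 4.0, -4.0], target)
loss_bad = multi_label_bce([-4.0, 4.0, 4.0, -4.0, 4.0], target)
```

Because the target is derived from the detection annotations themselves, a task of this kind would need no extra labels; how the resulting scene embedding is then fused into the object features is not described in the abstract and is left out of the sketch.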
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, Z., Xie, X., Li, X. (2021). Scene Semantic Guidance for Object Detection. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol 13019. Springer, Cham. https://doi.org/10.1007/978-3-030-88004-0_29
DOI: https://doi.org/10.1007/978-3-030-88004-0_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88003-3
Online ISBN: 978-3-030-88004-0
eBook Packages: Computer Science, Computer Science (R0)