Scene Semantic Guidance for Object Detection

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13019)

Abstract

In the real world, objects often have ambiguous or indistinct appearances, but they tend to co-vary with other objects and with particular scenes. In this paper, we exploit scene context to provide global semantic guidance for object detection. We present a simple but effective Scene Semantic Guidance (SSG) framework, which can be applied as a plug-and-play component to improve the classification ability of detectors. Specifically, to explicitly model scene semantic context, we propose a scene semantic embedding module that leverages an auxiliary multi-label classification task to learn object-level scene concepts. Further, to adaptively incorporate the scene semantic context into the object features, we propose a semantic consistency guidance module that strengthens the discriminability of the object features. Comprehensive experiments on the MS-COCO benchmark demonstrate that the proposed SSG framework is effective and generalizable, leading to consistent improvements over typical detectors, including Faster R-CNN, RetinaNet, and FCOS.
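The two modules named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify layer shapes, the fusion rule, or the loss weighting, so everything below is an assumption made for illustration: the function names, the global-average-pooled scene vector, the binary cross-entropy auxiliary loss, and the channel-wise gating are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def multi_hot(labels, num_classes):
    """Image-level target: 1 for every object category present in the image."""
    target = np.zeros(num_classes, dtype=np.float64)
    target[np.asarray(labels, dtype=np.int64)] = 1.0
    return target


def scene_embedding(feature_map, w_embed):
    """Assumed embedding: global-average-pool a (C, H, W) map, then project to (D,)."""
    pooled = feature_map.mean(axis=(1, 2))
    return np.tanh(w_embed @ pooled)


def multilabel_bce(scene_vec, w_cls, target):
    """Auxiliary multi-label loss: one independent sigmoid per category."""
    probs = np.clip(sigmoid(w_cls @ scene_vec), 1e-7, 1.0 - 1e-7)
    return float(-(target * np.log(probs) + (1.0 - target) * np.log(1.0 - probs)).mean())


def guide_object_features(obj_feats, scene_vec, w_gate):
    """Assumed fusion: a channel-wise gate from the scene vector, applied residually."""
    gate = sigmoid(w_gate @ scene_vec)   # (C,), each value in (0, 1)
    return obj_feats * (1.0 + gate)      # (N, C), same shape as the input


# Toy sizes (hypothetical): C channels, D scene dimensions, K categories, N objects.
rng = np.random.default_rng(0)
C, D, K, N = 8, 4, 5, 3
feature_map = rng.normal(size=(C, 6, 6))
obj_feats = rng.normal(size=(N, C))
w_embed = rng.normal(size=(D, C))
w_cls = rng.normal(size=(K, D))
w_gate = rng.normal(size=(C, D))

scene_vec = scene_embedding(feature_map, w_embed)
aux_loss = multilabel_bce(scene_vec, w_cls, multi_hot([1, 1, 3], K))
guided = guide_object_features(obj_feats, scene_vec, w_gate)
```

In a real detector the auxiliary loss would be added to the detection loss during training, and the guided features would feed the classification head; how SSG actually does either is described in the full paper, not in this abstract.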

Corresponding author

Correspondence to Xuemei Xie.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, Z., Xie, X., Li, X. (2021). Scene Semantic Guidance for Object Detection. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol 13019. Springer, Cham. https://doi.org/10.1007/978-3-030-88004-0_29

  • DOI: https://doi.org/10.1007/978-3-030-88004-0_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88003-3

  • Online ISBN: 978-3-030-88004-0

  • eBook Packages: Computer Science, Computer Science (R0)
