Self-attention recurrent network for saliency detection

Sun, Fengdong; Li, Wenhui; Guan, Yuanyuan

doi:10.1007/s11042-018-6591-3

Published: 17 September 2018

Multimedia Tools and Applications, volume 78, pages 30793–30807 (2019)

Abstract

Feature maps at different depths of a deep neural network generally carry different semantics. Existing methods often overlook these differences, which may lead to sub-optimal results. In this paper, we propose a novel end-to-end deep saliency network that uses multi-scale feature maps according to their characteristics. Shallow layers generally contain more local information, while deep layers are stronger in global semantics. Our network therefore generates detailed saliency maps by exploiting the different semantics of feature maps in different layers. On the one hand, the local information of shallow layers is enhanced by a recurrent structure that shares convolution kernels across time steps. On the other hand, the global information of deep layers is exploited by a self-attention module, which generates attention weights for salient objects and backgrounds and thus achieves better performance. Experimental results on four widely used datasets demonstrate that our method has performance advantages over existing algorithms.
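
The two mechanisms named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' network, whose details are not given on this page. It assumes a recurrent convolution of the form state = ReLU(conv(input) + conv(previous state)), with the same two kernels reused at every time step, and a dot-product self-attention over spatial positions with a residual connection. All function names, shapes, the step count, and the `gamma` scale are hypothetical choices made for illustration.

```python
import numpy as np


def conv2d_same(x, w):
    """Stride-1, zero-padded cross-correlation.

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) with odd k.
    """
    c_out, _, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))  # zero padding on H and W
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Contribution of kernel tap (i, j) to every output position.
            patch = xp[:, i:i + h, j:j + wd]
            out += np.einsum('oc,chw->ohw', w[:, :, i, j], patch)
    return out


def recurrent_conv(u, w_ff, w_rec, steps=3):
    """Assumed recurrent structure: w_ff and w_rec are shared by all steps."""
    ff = conv2d_same(u, w_ff)          # feed-forward term, computed once
    state = np.maximum(ff, 0.0)        # state before any recurrent step
    for _ in range(steps):
        state = np.maximum(ff + conv2d_same(state, w_rec), 0.0)
    return state


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v, gamma=1.0):
    """Assumed dot-product self-attention over the H*W positions.

    x: (C, H, W); w_q, w_k: (D, C); w_v: (C, C), acting as 1x1 projections.
    Returns the attended map and the (N, N) attention weights, N = H*W.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)
    q, k, v = w_q @ flat, w_k @ flat, w_v @ flat
    attn = softmax(q.T @ k, axis=-1)   # row i: weights of position i over all j
    out = v @ attn.T                   # out[:, i] = sum_j attn[i, j] * v[:, j]
    return (gamma * out + flat).reshape(c, h, w), attn


rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 8, 8))       # stand-in shallow feature map
w_ff = rng.standard_normal((6, 4, 3, 3)) * 0.1
w_rec = rng.standard_normal((6, 6, 3, 3)) * 0.1
local = recurrent_conv(shallow, w_ff, w_rec)   # shape (6, 8, 8)

deep = rng.standard_normal((6, 4, 4))          # stand-in deep feature map
w_q = rng.standard_normal((2, 6)) * 0.1
w_k = rng.standard_normal((2, 6)) * 0.1
w_v = rng.standard_normal((6, 6)) * 0.1
attended, attn = self_attention(deep, w_q, w_k, w_v)  # (6, 4, 4), (16, 16)
```

Each recurrent step widens the effective receptive field of `state` without adding parameters, which is one plausible reading of how kernel sharing strengthens local detail. Every row of `attn` weighs all spatial positions at once, which is the sense in which attention draws on global context.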

Acknowledgments

This work was supported by the Science and Technology Development Plan of Jilin Province under Grant 20170204020GX and by the National Science Foundation of China under Grant U1564211.

Author information

Authors and Affiliations

College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Fengdong Sun, Wenhui Li & Yuanyuan Guan

Corresponding author

Correspondence to Wenhui Li.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sun, F., Li, W. & Guan, Y. Self-attention recurrent network for saliency detection. Multimed Tools Appl 78, 30793–30807 (2019). https://doi.org/10.1007/s11042-018-6591-3

Received: 05 August 2018
Revised: 18 August 2018
Accepted: 22 August 2018
Published: 17 September 2018
Issue Date: November 2019
DOI: https://doi.org/10.1007/s11042-018-6591-3
