Abstract
Although two-stage methods have substantially improved the accuracy and robustness of saliency detection, obtaining a saliency map with clear foreground boundaries and fine structure remains challenging. In this article, we propose a novel and effective two-stage method for salient object detection, comprising coarse saliency map construction and fine saliency map generation. First, we develop a prior distribution learning (PDL) algorithm to explore the mapping relationship between an input superpixel and its corresponding superpixels in various prior maps. PDL computes weights according to the contribution of each prior to each region of the image, and can therefore provide more reliable pseudo-labels for training subsequent learning models. Second, by learning the implicit representation between reliable samples and multiple priors, the learning model can accurately predict the saliency values of regions whose saliency is difficult to judge, yielding an instructive coarse saliency map. Third, to refine the details of the coarse saliency map, we propose a framework called saliency consistency optimization, which produces clear foreground boundaries and effectively suppresses background noise. We compare the proposed algorithm with state-of-the-art methods on four datasets. Experimental results demonstrate the effectiveness of our approach over the comparison methods, especially other two-stage methods.
Data Availability Statement
The datasets analyzed during the current study are available from the corresponding author on reasonable request.
Change history
13 November 2022
A Correction to this paper has been published: https://doi.org/10.1007/s00371-022-02724-7
References
Dou, P., Shen, H., Li, Z., et al.: Time series remote sensing image classification framework using combination of deep learning and multiple classifiers system. Int. J. Appl. Earth Obs. Geoinf. 103(8), 102477 (2021)
Luo, L., Wang, X., Hu, S., Hu, X., Zhang, H., Liu, Y., Zhang, J.: A unified framework for interactive image segmentation via Fisher rules. Vis. Comput. 35(12), 1869–1882 (2019)
Lv, G., Dong, L., Zhang, W., Xu, W.: Region-based adaptive association learning for robust image scene recognition. Vis. Comput. 1–21 (2022)
Chen, X., Wang, T., Zhu, Y., Jin, L., Luo, C.: Adaptive embedding gate for attention-based scene text recognition. Neurocomputing 381, 261–271 (2020)
Wang, Q., Huang, Y., Jia, W., He, X., Blumenstein, M., Lyu, S., Lu, Y.: FACLSTM: ConvLSTM with focused attention for scene text recognition. Sci. China Inf. Sci. 63(2), 1–14 (2020)
Xie, J., Ge, Y., Zhang, J., Huang, S., Chen, F., Wang, H.: Low-resolution assisted three-stream network for person re-identification. Vis. Comput. 38, 1–11 (2021)
Zhang, P., Wang, D., Lu, H., Wang, H., Ruan, X.: Amulet: aggregating multi-level convolutional features for salient object detection. In Proceedings of the IEEE International Conference on Computer Vision, pp. 202–211 (2017)
He, S., Lau, R.W., Liu, W., Huang, Z., Yang, Q.: Supercnn: a superpixelwise convolutional neural network for salient object detection. Int. J. Comput. Vis. 115(3), 330–344 (2015)
Liu, N., Han, J., Yang, M.H.: Picanet: pixel-wise contextual attention learning for accurate saliency detection. IEEE Trans. Image Process. 29, 6438–6451 (2020)
Qin, X., Fan, D.P., Huang, C., Diagne, C., Zhang, Z.: Boundary-aware segmentation network for mobile and web applications. arXiv:2101.04704 (2021)
Yang, C., Zhang, L., Lu, H., Ruan, X., Yang, M.H.: Saliency detection via graph-based manifold ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3166–3173 (2013)
Zhang, M., Pang, Y., Wu, Y., Du, Y., Sun, H., Zhang, K.: Saliency detection via local structure propagation. J. Vis. Commun. Image Represent. 52, 131–142 (2018)
Wu, Y., Jia, T., Pang, Y., Sun, J., Xue, D.: Salient object detection via a boundary-guided graph structure. J. Vis. Commun. Image Represent. 75, 103048 (2021)
Jian, M., Wang, J., Yu, H., Wang, G.G.: Integrating object proposal with attention networks for video saliency detection. Inf. Sci. 576, 819–830 (2021)
Zhuge, M., Lu, X., Guo, Y., Cai, Z., Chen, S.: CubeNet: X-shape connection for camouflaged object detection. Pattern Recogn. 127, 108644 (2022)
Zhuge, M., Fan, D.P., Liu, N., Zhang, D., Xu, D.: Salient object detection via integrity learning. arXiv:2101.07663 (2021)
Islam, M.A., Kalash, M., Bruce, N.D.B.: Revisiting salient object detection: Simultaneous detection, ranking, and subitizing of multiple salient objects. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7142–7150 (2018)
Cheng, M.M., Mitra, N.J., Huang, X., Torr, P.H., Hu, S.M.: Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 569–582 (2014)
Achanta, R., Hemami, S., Estrada, F., Susstrunk, S.: Frequency-tuned salient region detection. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1597–1604 (2009)
Li, H., Lu, H., Lin, Z., Shen, X., Price, B.: Inner and inter label propagation: salient object detection in the wild. IEEE Trans. Image Process. 24(10), 3176–3186 (2015)
Zhang, M., Wu, Y., Du, Y., Fang, L., Pang, Y.: Saliency detection integrating global and local information. J. Vis. Commun. Image Represent. 53, 215–223 (2018)
Jian, M., Wang, R., Xu, H., Yu, H., Dong, J., Li, G., Yin, Y., Lam, K.M.: Robust seed selection of foreground and background priors based on directional blocks for saliency-detection system. Multimed. Tools Appl. 1–25 (2022)
Jian, M., Wang, J., Yu, H., Wang, G., Meng, X., Yang, L., Dong, J., Yin, Y.: Visual saliency detection by integrating spatial position prior of object with background cues. Expert Syst. Appl. 168, 114219 (2021)
Qin, Y., Feng, M., Lu, H., Cottrell, G.W.: Hierarchical cellular automata for visual saliency. Int. J. Comput. Vis. 126(7), 751–770 (2018)
Chen, S., Zheng, L., Hu, X., Zhou, P.: Discriminative saliency propagation with sink points. Pattern Recogn. 60, 2–12 (2016)
Pang, Y., Yu, X., Wu, Y., Wu, C.: FSP: a feedback-based saliency propagation method for saliency detection. J. Electr. Imaging 29(1), 013011 (2020)
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2274–2282 (2012)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014)
Geng, X.: Label distribution learning. IEEE Trans. Knowl. Data Eng. 28(7), 1734–1748 (2016)
Yang, C., Zhang, L., Lu, H.: Graph-regularized saliency detection with convex-hull-based center prior. IEEE Signal Process. Lett. 20(7), 637–640 (2013)
Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979)
Hwang, I., Lee, S.H., Park, J.S., Cho, N.I.: Saliency detection based on seed propagation in a multilayer graph. Multimed. Tools Appl. 76(2), 2111–2129 (2017)
Gong, C., Tao, D., Liu, W., Maybank, S.J., Fang, M., Fu, K., Yang, J.: Saliency propagation from simple to difficult. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2531–2539 (2015)
Qin, Y., Lu, H., Xu, Y., Wang, H.: Saliency detection via cellular automata. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 110–119 (2015)
Fang, S., Li, J., Tian, Y., Huang, T., Chen, X.: Learning discriminative subspaces on random contrasts for image saliency analysis. IEEE Trans. Neural Netw. Learn. Syst. 28(5), 1095–1108 (2016)
Tu, W.C., He, S., Yang, Q., Chien, S.Y.: Real-time salient object detection with a minimum spanning tree. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2334–2342 (2016)
Sun, J., Lu, H., Liu, X.: Saliency region detection based on Markov absorption probabilities. IEEE Trans. Image Process. 24(5), 1639–1649 (2015)
Peng, H., Li, B., Ling, H., Hu, W., Xiong, W., Maybank, S.J.: Salient object detection via structured matrix decomposition. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 818–832 (2016)
Wang, L., Lu, H., Ruan, X., Yang, M.H.: Deep networks for saliency detection via local estimation and global search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3183–3192 (2015)
Liu, G.H., Yang, J.Y.: Exploiting color volume and color difference for salient region detection. IEEE Trans. Image Process. 28(1), 6–16 (2018)
Zhang, L., Ai, J., Jiang, B., Lu, H., Li, X.: Saliency detection via absorbing Markov chain with learnt transition probability. IEEE Trans. Image Process. 27(2), 987–998 (2017)
Wu, Y., Jia, T., Li, W., Chen, D.: RSF: a novel saliency fusion framework for image saliency detection. In: 2020 International Conference on Culture-oriented Science and Technology (ICCST), pp. 45–49 (2020)
Zhang, L., Sun, J., Wang, T., Min, Y., Lu, H.: Visual saliency detection via kernelized subspace ranking with active learning. IEEE Trans. Image Process. 29, 2258–2270 (2019)
Yan, Q., Xu, L., Shi, J., Jia, J.: Hierarchical saliency detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1155–1162 (2013)
Li, Y., Hou, X., Koch, C., Rehg, J.M., Yuille, A.L.: The secrets of salient object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 280–287 (2014)
Movahedi, V., Elder, J.H.: Design and perceptual validation of performance measures for salient object segmentation. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, pp. 49–56 (2010)
Tong, N., Lu, H., Ruan, X., Yang, M.H.: Salient object detection via bootstrap learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1884–1892 (2015)
Tong, N., Lu, H., Zhang, Y., Ruan, X.: Salient object detection via global and local cues. Pattern Recogn. 48(10), 3258–3267 (2015)
Borji, A., Cheng, M.M., Jiang, H., Li, J.: Salient object detection: a benchmark. IEEE Trans. Image Process. 24(12), 5706–5722 (2015)
Alexe, B., Deselaers, T., Ferrari, V.: Measuring the objectness of image windows. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2189–2202 (2012)
Acknowledgements
Thanks are due to Dr. Yu Pang for inspiring our work. This work was supported in part by the National Natural Science Foundation of China under Grant Nos. U1613214 and 62173083, the Major Program of the National Natural Science Foundation of China under Grant No. 71790614, the 111 Project under Grant B16009, the National Key Research and Development Project under Grant No. 2018YFB1404101, and the Fundamental Research Funds for the Central Universities of China under Grants N170402008 and N2026004.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial or non-financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: there were errors in the affiliations and in Equations 10 and 11.
Appendix
In the multi-prior distribution learning model, five kinds of prior knowledge are utilized to extract saliency cues: the background prior, objectness prior, color distribution prior, global contrast prior, and center prior. Here, we give more details about these five priors.
Background prior is universal prior knowledge in saliency detection. It usually takes the image-boundary superpixels as initial background seeds; the saliency value of each superpixel is then determined by its feature contrast with those seeds. Thus, we construct a background-based map \({\text{BG}} = \left[ {{\text{bg}}_{1} ,{\text{bg}}_{2} , \ldots ,{\text{bg}}_{N} } \right]^{T}\), where \({\text{bg}}_{i}\) is computed from the contrast between \(d_{i}\), the deep feature of the \(i\)-th superpixel, and \(d_{j}^{b}\), the deep feature of the \(j\)-th boundary superpixel; \(n\) is the total number of boundary superpixels, and the parameter \(\sigma\) is set to 0.1. A higher \({\text{bg}}_{i}\) indicates that superpixel \(m_{i}\) has higher contrast to the background seeds and is more likely to belong to the salient object; otherwise, superpixel \(m_{i}\) tends to be a background region.
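One plausible form of the background-prior computation, consistent with the mean feature contrast to the \(n\) boundary seeds and the bandwidth \(\sigma = 0.1\) described above (a hedged reconstruction, not necessarily the paper's exact equation), is:

```latex
\mathrm{bg}_i \;=\; 1 - \exp\!\left( -\frac{1}{n\,\sigma^{2}} \sum_{j=1}^{n} \bigl\lVert d_i - d_j^{\,b} \bigr\rVert_2^{2} \right)
```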
Objectness prior was proposed in [50]. It first constructs a large number of windows, each covering part of the image, and then applies several rules to compute the likelihood that each window contains a salient object. The objectness prior is widely applied in saliency detection methods due to its effectiveness. We define the objectness map as \({\text{OB}} = \left[ {{\text{ob}}_{1} ,{\text{ob}}_{2} , \ldots ,{\text{ob}}_{N} } \right]^{T}\).
Color distribution prior was proposed in our previous work [12]. It first divides the image pixels into eight regions according to color cues; regions that both have a compact spatial structure and lie near the image's weighted center are considered more likely to be salient objects. Following [12] without any modification, we obtain the color distribution map \({\text{CD}} = \left[ {{\text{cd}}_{1} ,{\text{cd}}_{2} , \ldots ,{\text{cd}}_{N} } \right]^{T}\).
Global contrast prior is based on the observation that the human eye tends to notice regions with higher contrast to the rest of the image. Thus, we construct the global contrast-based map \({\text{GC}} = \left[ {{\text{gc}}_{1} ,{\text{gc}}_{2} , \ldots ,{\text{gc}}_{N} } \right]^{T}\), where the global contrast value of each superpixel is determined by the mean contrast between it and all other superpixels; \(d_{i}\) and \(d_{j}\) are the deep features of superpixels \(m_{i}\) and \(m_{j}\), \(N\) is the total number of superpixels, and the parameter \(\sigma\) is set to 0.1.
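A global-contrast formula consistent with the description above, i.e., the mean contrast of superpixel \(m_i\) to the other \(N-1\) superpixels passed through an exponential with bandwidth \(\sigma\) (again a hedged reconstruction rather than the paper's exact equation), is:

```latex
\mathrm{gc}_i \;=\; 1 - \exp\!\left( -\frac{1}{(N-1)\,\sigma^{2}} \sum_{j \neq i} \bigl\lVert d_i - d_j \bigr\rVert_2^{2} \right)
```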
Center prior is also effective prior knowledge: it assumes that regions near the image center are more likely to be salient objects. To this end, we construct the center-based map \({\text{CB}} = \left[ {{\text{cb}}_{1} ,{\text{cb}}_{2} , \ldots ,{\text{cb}}_{N} } \right]^{T}\), where \({\mathbf{p}}_{i}\) is the position coordinate of superpixel \(m_{i}\), \({\mathbf{p}}^{{\mathbf{c}}}\) is the image center position, and the parameter \(\sigma\) is set to 0.1.
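The standard form of such a center prior is a Gaussian falloff from the image center; assuming normalized position coordinates and the stated \(\sigma = 0.1\), a hedged sketch of the computation is:

```latex
\mathrm{cb}_i \;=\; \exp\!\left( -\frac{\bigl\lVert \mathbf{p}_i - \mathbf{p}^{c} \bigr\rVert_2^{2}}{\sigma^{2}} \right)
```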
In multi-prior distribution learning algorithm, five prior maps are represented as \( {\text{pm}}_{1}\),\({\text{pm}}_{2}\),…, \({\text{pm}}_{5}\), i.e., \({\text{pm}}_{1} \) for \({\text{ BG}}\),\( {\text{pm}}_{2} \) for \({\text{ OB}}\),\( {\text{pm}}_{3} \) for \({\text{ SD}}\),\( {\text{pm}}_{4} \) for \({\text{ GC}}\) and \({\text{pm}}_{5} \) for \({\text{ CB}}\).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, Y., Chang, X., Chen, D. et al. Two-stage salient object detection based on prior distribution learning and saliency consistency optimization. Vis Comput 39, 5729–5745 (2023). https://doi.org/10.1007/s00371-022-02692-y