Abstract
Co-saliency detection within a single image is a common vision problem that has not yet been well addressed. Existing methods often use a bottom-up strategy to infer co-saliency in an image: salient regions are first detected using visual primitives such as color and shape, and are then grouped and merged into a co-saliency map. However, human vision perceives co-saliency in a more complex way that combines bottom-up and top-down strategies. To address this problem, this study proposes a novel end-to-end trainable network comprising a backbone net and two branch nets. The backbone net uses ground-truth masks as top-down guidance for saliency prediction, whereas the two branch nets construct triplet proposals for regional feature mapping and clustering, which makes the network sensitive to co-salient regions in a bottom-up manner. To evaluate the proposed method, we construct a new dataset of 2019 natural images, each exhibiting co-saliency. Experimental results show that the proposed method achieves state-of-the-art accuracy with a running speed of 28 fps.
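The abstract does not state which loss is applied to the triplet proposals. As a rough illustration only, a standard triplet margin loss over regional feature vectors might look like the sketch below. The function names, the squared Euclidean distance, and the margin value of 0.2 are assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_margin_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.2,  # hypothetical value chosen for illustration
) -> float:
    """Hinge-style triplet loss.

    The loss is zero once the anchor is closer to the positive region
    (assumed co-salient with the anchor) than to the negative region
    by at least `margin`; otherwise it grows with the violation.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

Minimizing a loss of this kind pulls features of regions in the same group together and pushes other regions away, which is one common way to make an embedding suitable for clustering.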
Acknowledgements
This work was supported by the Key Research and Development Program of Hubei Province (Grant No. 2020BAB018), the National Natural Science Foundation of China (Grant No. 62171324), and the National Key R&D Program of China (Grant No. 2022YFF0901902).
Cite this article
Yue, Y., Zou, Q., Yu, H. et al. An end-to-end network for co-saliency detection in one single image. Sci. China Inf. Sci. 66, 210101 (2023). https://doi.org/10.1007/s11432-022-3686-1
DOI: https://doi.org/10.1007/s11432-022-3686-1