Abstract
Camouflaged object detection (COD) is a challenging task that aims to detect objects visually similar to their surrounding environment. In this paper, we propose a transformer-induced progressive refinement network (TPRNet) for this task. Specifically, our network consists of a transformer-induced progressive refinement module (TPRM) and a semantic-spatial interaction enhancement module (SIEM). In the TPRM, high-level features with rich semantic information are integrated through transformers to form a prior guidance, which is then fed into the refinement concurrency unit (RCU); an accurately localized feature region is obtained through a progressive refinement strategy. In the SIEM, the accurately localized semantic features interact with low-level features to capture rich fine-grained cues and strengthen the representation of object boundaries. Extensive experiments on four widely used benchmark datasets (i.e., CAMO, CHAMELEON, COD10K, and NC4K) demonstrate that our TPRNet is an effective COD model and outperforms state-of-the-art methods. The code is available at https://github.com/zhangqiao970914/TPRNet.
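The sketch below illustrates, in PyTorch, the two-stage idea the abstract describes: a transformer block aggregates the highest-level backbone features into a coarse semantic prior, an RCU-style unit progressively refines mid-level features under that prior, and a SIEM-style unit fuses the refined semantics with low-level features for boundary detail. All module names, channel sizes, and fusion choices are illustrative assumptions for exposition, not the authors' released implementation.

```python
# Minimal sketch of the TPRNet-style pipeline described in the abstract.
# Channel widths, the number of transformer layers, and the residual fusion
# below are assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticPrior(nn.Module):
    """Aggregate high-level features with self-attention to form a prior map."""

    def __init__(self, channels: int = 64, heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=heads, dim_feedforward=2 * channels,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.predict = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, high: torch.Tensor) -> torch.Tensor:
        b, c, h, w = high.shape
        tokens = high.flatten(2).transpose(1, 2)          # (B, H*W, C)
        tokens = self.encoder(tokens)
        feat = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.predict(feat)                         # coarse location prior


class RefinementUnit(nn.Module):
    """RCU-like stage: refine one feature level under the upsampled prior."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(channels + 1, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.predict = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
        prior = F.interpolate(prior, size=feat.shape[2:], mode="bilinear",
                              align_corners=False)
        refined = self.fuse(torch.cat([feat, prior], dim=1))
        # Residual prediction: each stage sharpens the previous estimate.
        return self.predict(refined) + prior


class InteractionEnhancement(nn.Module):
    """SIEM-like stage: fuse refined semantics with low-level features."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(channels + 1, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1))

    def forward(self, low: torch.Tensor, semantic: torch.Tensor) -> torch.Tensor:
        semantic = F.interpolate(semantic, size=low.shape[2:], mode="bilinear",
                                 align_corners=False)
        return self.fuse(torch.cat([low, semantic], dim=1))


if __name__ == "__main__":
    # Toy multi-level features standing in for backbone outputs (coarse to fine).
    low = torch.randn(1, 64, 88, 88)
    mids = [torch.randn(1, 64, 44, 44), torch.randn(1, 64, 22, 22)]
    high = torch.randn(1, 64, 11, 11)

    prior = SemanticPrior()(high)
    rcu = RefinementUnit()
    for feat in reversed(mids):          # progressive coarse-to-fine refinement
        prior = rcu(feat, prior)
    mask = InteractionEnhancement()(low, prior)
    print(mask.shape)                    # torch.Size([1, 1, 88, 88])
```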
Acknowledgements
This work was supported by the Anhui Province Key Laboratory of Infrared and Low-Temperature Plasma under Grant No. IRKL2022KF07.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zhang, Q., Ge, Y., Zhang, C. et al. TPRNet: camouflaged object detection via transformer-induced progressive refinement network. Vis Comput 39, 4593–4607 (2023). https://doi.org/10.1007/s00371-022-02611-1