Enhanced salient object detection in remote sensing images via dual-stream semantic interactive network

  • Research
  • Published in The Visual Computer

Abstract

Salient object detection in remote sensing images (RSI-SOD) aims to identify the most prominent regions within complex RSI scenes. Current convolutional neural network (CNN)-based approaches struggle to capture long-distance dependencies, limiting their performance. To address this, we propose a novel dual-stream semantic interactive network (DSINet). Specifically, the model combines the advantages of Transformer and CNN to simultaneously model both global relationships and local details via the dual-stream architecture. It comprises three key modules: a multi-scale feature enhancement module to enhance feature representations across scales, a cross-attention complementary mining module to explore complementary cues between Transformer and CNN features, and a cross-layer feature interaction module to mitigate inconsistencies between adjacent layers. Extensive experiments on benchmark datasets demonstrate that DSINet achieves superior performance compared to state-of-the-art methods, effectively identifying salient objects in challenging RSI scenes. The code and results of our method are available at https://github.com/dqxfj99/DSINet.
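The idea of mining complementary cues between the two streams can be illustrated with a small sketch. The following is not the authors' implementation (their code is at the repository linked above); it is a minimal, dependency-free illustration of generic scaled dot-product cross-attention, assuming both streams have already been flattened into the same number of tokens with the same channel width. The function names `cross_attention` and `complementary_fusion`, the identity query/key/value projections, and the residual-sum fusion are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention where each query token attends over
    the tokens of the *other* stream. Learned projections are omitted
    (identity projections), which is an assumption made for brevity."""
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended_tokens = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        attended = [
            sum(w * v[d] for w, v in zip(weights, keys_values))
            for d in range(dim)
        ]
        attended_tokens.append(attended)
    return attended_tokens


def complementary_fusion(cnn_tokens, transformer_tokens):
    """Hypothetical fusion step: each stream queries the other, and the
    attended features are added to both original streams token by token."""
    cnn_enhanced = cross_attention(cnn_tokens, transformer_tokens)
    trans_enhanced = cross_attention(transformer_tokens, cnn_tokens)
    fused = []
    for c, ce, t, te in zip(cnn_tokens, cnn_enhanced,
                            transformer_tokens, trans_enhanced):
        fused.append([ci + cei + ti + tei
                      for ci, cei, ti, tei in zip(c, ce, t, te)])
    return fused


# Toy inputs: 3 tokens with 2 channels per stream (made-up values).
cnn = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
trans = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
fused = complementary_fusion(cnn, trans)
print(len(fused), len(fused[0]))
```

In a real model the tokens would come from backbone feature maps and the projections would be learned; the sketch only shows how features from one stream can be re-weighted by their similarity to the other.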

Data availability

The datasets are available at https://github.com/rmcong/EORSSD-dataset, https://li-chongyi.github.io/proj_optical_saliency.html, and https://github.com/wchao1213/ORSI-SOD.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 62471124), the Heilongjiang Province Natural Science Foundation (No. LH2022F005), and the Young Top Talents Fund in the School of Electrical Information Engineering of Northeast Petroleum University (No. DYDQQB202204).

Author information

Contributions

Yanliang Ge provided software and contributed to validation and writing—original draft. Taichuan Liang was involved in methodology, validation, and writing—original draft, and provided software. Junchao Ren contributed to visualization, writing—review, and validation. Jiaxue Chen was involved in data curation, investigation, and validation. Hongbo Bi contributed to methodology and writing—review.

Corresponding author

Correspondence to Hongbo Bi.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ge, Y., Liang, T., Ren, J. et al. Enhanced salient object detection in remote sensing images via dual-stream semantic interactive network. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03713-8

  • DOI: https://doi.org/10.1007/s00371-024-03713-8

Keywords