Weamba: Weather-Degraded Remote Sensing Image Restoration with Multi-Router State Space Model
Abstract
1. Introduction
- We propose Weamba, an effective local-enhanced visual state space model for RS image restoration under adverse weather conditions, which explores the complementary nature of local and global dependencies.
- We develop a simple yet effective multi-router scanning strategy for spatially varying feature extraction, allowing for comprehensive modeling of information flow through different pathways in high-resolution RS image data.
- We quantitatively and qualitatively evaluate the proposed method on haze removal and raindrop removal tasks. The results show that our method obtains a favorable trade-off between performance and model complexity.
2. Related Work
2.1. RS Image Restoration
2.2. Visual State Space Models
3. Proposed Method
3.1. Preliminaries
3.2. Overall Architecture
3.3. Local-Enhanced State Space Module (LSSM)
- Global State Space Block (GSSB). Inspired by the basic block designs of the Transformer and Mamba, we develop the GSSB to facilitate image restoration. Specifically, the input features are processed sequentially through a linear layer, a depth-wise convolution, and a SiLU activation function, followed by the multi-router scanning block (MRSB). Compared with previous scanning mechanisms, our proposed MRSB obtains stronger feature representations, as detailed below. Layer normalization is then applied to stabilize the feature statistics. In addition, the normalized features are combined with a parallel branch that bypasses the depth-wise convolution, enhancing the model’s capability to extract global features through differential learning. Finally, the processed features pass through another linear layer to produce the final GSSB output. The entire flow can be formulated as follows:
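Since the original equations did not survive extraction, the formulation below is a reconstruction from the description above (the notation is ours, and the element-wise gate follows the standard Mamba block design):

$$
\begin{aligned}
\mathbf{Z} &= \mathrm{MRSB}\big(\mathrm{SiLU}(\mathrm{DWConv}(\mathrm{Linear}(\mathbf{X})))\big),\\
\mathbf{G} &= \mathrm{SiLU}\big(\mathrm{Linear}(\mathbf{X})\big),\\
\mathbf{Y} &= \mathrm{Linear}\big(\mathrm{LN}(\mathbf{Z})\odot\mathbf{G}\big),
\end{aligned}
$$

where $\mathbf{X}$ and $\mathbf{Y}$ are the input and output features, $\mathrm{LN}(\cdot)$ denotes layer normalization, and $\odot$ denotes element-wise multiplication.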
- Multi-Router Scanning Block (MRSB). Like ViT [12], which divides images into patches and flattens them for input into the model, the SSM also handles flattened image patches as sequences. However, in contrast to ViT, which applies multi-head self-attention to these image patches, the SSM processes these patches sequentially. Thus, exploring effective methods for the sequential scanning of image patches is crucial. Recently, Vim [36] and VMamba [37] demonstrated that utilizing various scanning orders, including both row-wise and column-wise scans in different directions, can effectively enhance model performance.
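The exact Z-, S-, and W-shaped trajectories used by our routers are defined by figures not reproduced here; the sketch below only illustrates how such routes can be implemented as permutations of the flattened token sequence before the selective scan (the three route definitions and the helper names are our assumptions):

```python
import torch

def z_route(h: int, w: int) -> torch.Tensor:
    # Plain raster (Z-shaped) route: row by row, left to right.
    return torch.arange(h * w)

def s_route(h: int, w: int) -> torch.Tensor:
    # Serpentine (S-shaped) route: alternate the direction of every
    # other row so consecutive tokens stay spatially adjacent.
    idx = torch.arange(h * w).view(h, w)
    idx[1::2] = idx[1::2].flip(-1)
    return idx.flatten()

def w_route(h: int, w: int) -> torch.Tensor:
    # Column-serpentine route (one plausible reading of "W-shaped"):
    # walk down and up alternating columns.
    idx = torch.arange(h * w).view(h, w).t().contiguous()
    idx[1::2] = idx[1::2].flip(-1)
    return idx.flatten()

def route_scan(x: torch.Tensor, route):
    # x: (B, C, H, W) feature map -> (B, C, L) sequence ordered by `route`.
    _, _, h, w = x.shape
    idx = route(h, w).to(x.device)
    return x.flatten(2)[:, :, idx], idx

def route_unscan(seq: torch.Tensor, idx: torch.Tensor, h: int, w: int) -> torch.Tensor:
    # Invert the permutation and restore the spatial layout.
    return seq[:, :, idx.argsort()].unflatten(2, (h, w))
```

Each routed sequence is processed by the state space model independently, and the outputs of the different routers can then be unscanned and fused, e.g., by summation.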
- Local Feature Modulation Block (LFMB). The GSSB effectively captures the global information of images, but enhancing the extraction of local features is equally important. The LFMB enhances the model’s ability to focus on and extract local features by using dynamic feature weighting, thereby complementing the global information captured by the GSSB. Specifically, we begin by applying global average pooling to the features after layer normalization. These processed features are then subjected to a series of convolutional operations to extract high-level features. Finally, the sigmoid function is used on these deep features to generate channel weights. The LFMB dynamically adjusts the importance of each channel by performing element-wise multiplication with the normalized input features using these weights. The LFMB can be represented as follows:
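Since the original equation did not survive extraction, we summarize the block as $\mathbf{Y} = \sigma\big(\mathrm{Convs}(\mathrm{GAP}(\mathrm{LN}(\mathbf{X})))\big) \odot \mathrm{LN}(\mathbf{X})$, where $\mathrm{GAP}$ denotes global average pooling and $\sigma$ the sigmoid function. A minimal PyTorch sketch follows; the reduction ratio and the 1 × 1 kernel sizes are our assumptions, as the paper does not specify them here:

```python
import torch
import torch.nn as nn

class LFMB(nn.Module):
    """Local feature modulation via dynamic channel weighting (a sketch)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.weight = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                        # global average pooling
            nn.Conv2d(channels, channels // reduction, 1),  # convolutions on the
            nn.ReLU(inplace=True),                          # pooled descriptor
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                   # channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (B, C, H, W)
        # LayerNorm over the channel dimension, then restore (B, C, H, W).
        y = self.norm(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        return y * self.weight(y)                           # element-wise modulation
```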
3.4. Datasets and Implementation
- Datasets. To comprehensively evaluate the effectiveness of our approach under hazy and rainy conditions in RS images, we carry out extensive experiments on existing synthetic datasets, including SateHaze1k [23], RS-Haze [38], RICE [39], and UAV-Rain1k [3], as well as the real-world dataset RRSD300 [40]. Specifically, the SateHaze1k dataset contains RS images at three haze densities; each density subset includes 320 images for training and 45 for testing. The RS-Haze dataset consists of 51,300 images for training and 2700 for testing. The RICE dataset consists of 425 images for training and 75 for testing. The UAV-Rain1k dataset consists of 800 synthetic raindrop images for training and 220 for testing. The RRSD300 dataset comprises 303 real RS hazy images. To maintain fairness in our comparisons, we follow the protocols of these benchmarks to evaluate our method.
- Implementation details. The initial feature layer comprises 32 channels, while the encoder/decoder utilizes vision Mamba modules with configurations of [2, 3, 3, 4] from level 1 to level 4, respectively. The Adam optimizer [41] is employed during the training phase using its default parameters. To enhance the training dataset, data augmentation techniques, including flipping and rotation, are applied. The patch size is set to be pixels and the batch size is set to be 4. The initial learning rate is established at and is halved at designated milestones. The final learning rate is dynamically adjusted following the cosine annealing schedule [42]. All experiments are executed using the PyTorch framework on NVIDIA RTX 4090 GPUs. The source code will be available to the public.
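For concreteness, the optimization setup can be sketched as follows; the learning-rate value, the schedule horizon, and the `Weamba` constructor are placeholders, since the exact numbers were not recoverable from this version of the text:

```python
import torch

model = Weamba()  # hypothetical constructor for the proposed network
# Adam with its default betas [41]; the initial learning rate is a placeholder.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))
# Cosine annealing of the learning rate [42]; T_max and eta_min are placeholders.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300, eta_min=1e-6)
```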
3.5. Comparisons with the State of the Art
- Evaluations on datasets for image dehazing. We first conduct a comprehensive evaluation of our method against state-of-the-art approaches on several benchmark datasets, including SateHaze1k [23], RS-Haze [38], and RICE [39]. Here, we compare our Weamba with a prior-based algorithm (DCP [6]), CNN-based methods (AOD-Net [43], LD-Net [44], GCANet [45], GridDehazeNet [46], FFA-Net [47], FCTF-Net [20], and M2SCN [22]), and recent Transformer-based approaches (Restormer [9], Dehamer [48], DehazeFormer [38], AIDTransformer [25], and RSDformer [24]). To ensure fair comparisons, we retrain the deep learning-based methods that have not been previously trained on these benchmarks. We employ PSNR and SSIM [49] as evaluation metrics to measure the quality of the restored images. The quantitative results are summarized in Table 1 and Table 2 and clearly indicate that our approach consistently outperforms the other dehazing techniques; notably, Weamba surpasses RSDformer by 1.25 dB on average across the SateHaze1k dataset.
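For reference, both metrics can be computed with scikit-image as follows (a sketch; the benchmarks’ official evaluation scripts may differ, e.g., in color space or border handling):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(restored, gt):
    # restored, gt: uint8 RGB arrays with identical shape (H, W, 3).
    psnr = peak_signal_noise_ratio(gt, restored, data_range=255)
    ssim = structural_similarity(gt, restored, data_range=255, channel_axis=-1)
    return psnr, ssim
```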
- Evaluations on datasets for image deraining. We further assess our proposed Weamba on the recent UAV-Rain1k dataset [3]. Following the protocol established by [3], we compare Weamba against several notable algorithms, including DSC [8], RCDNet [50], SPDNet [51], Restormer [9], IDT [52], and DRSformer [11]. The quantitative results are presented in Table 3. Notably, our approach achieves the highest PSNR among the compared algorithms, underscoring its effectiveness on the image deraining task. Figure 6 further provides a visual comparison of the restoration results. Raindrops cause varying levels of occlusion in remote sensing images, which complicates restoration. Whereas other models tend to leave behind rain artifacts of differing severity, our method not only removes unexpected raindrops effectively but also restores the intricate texture details of the background, bringing the results into closer alignment with the ground truth and demonstrating the robustness of our approach.
- Evaluations on real-world datasets. To further validate the generalization performance of the compared approaches in real-world remote sensing scenarios, we evaluate our method on the real-world RRSD300 dataset [40], which is specifically designed for RS image dehazing and covers multiple scenes across a variety of geographic regions and environmental conditions. Figure 7 presents a comparative analysis of the visual results generated by several algorithms. Earlier methods [6,45,46] often introduce noticeable color distortions in real-world scenes, compromising the visual fidelity and authenticity of the restored images.
Table 1. Quantitative dehazing comparisons on the RS-Haze benchmark and the three haze-density subsets of SateHaze1k.

| Category | Method | RS-Haze PSNR | RS-Haze SSIM | Thin Haze PSNR | Thin Haze SSIM | Moderate Haze PSNR | Moderate Haze SSIM | Thick Haze PSNR | Thick Haze SSIM | Average PSNR | Average SSIM |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Prior | DCP [6] | 8.48 | 0.4801 | 13.45 | 0.6977 | 9.78 | 0.5739 | 10.90 | 0.5715 | 11.38 | 0.6144 |
| CNN | AOD-Net [43] | 24.90 | 0.8300 | 18.74 | 0.8584 | 17.69 | 0.7969 | 13.42 | 0.6523 | 16.60 | 0.7692 |
| CNN | LD-Net [44] | 25.84 | 0.8230 | 17.83 | 0.8568 | 19.80 | 0.8980 | 16.60 | 0.7649 | 18.08 | 0.8399 |
| CNN | GCANet [45] | 34.41 | 0.9490 | 22.27 | 0.9030 | 24.89 | 0.9327 | 20.51 | 0.8307 | 22.56 | 0.8888 |
| CNN | GridDehazeNet [46] | 34.19 | 0.9446 | 20.04 | 0.8651 | 20.96 | 0.8988 | 18.67 | 0.7944 | 19.89 | 0.8528 |
| CNN | FFA-Net [47] | 35.68 | 0.9446 | 22.30 | 0.9072 | 25.46 | 0.9372 | 20.84 | 0.8451 | 22.87 | 0.8965 |
| CNN | MSBDN [53] | 35.25 | 0.9357 | 18.02 | 0.7029 | 20.76 | 0.7613 | 16.78 | 0.5389 | 18.52 | 0.6677 |
| CNN | FCTF-Net [20] | 34.29 | 0.9321 | 20.06 | 0.8808 | 23.43 | 0.9265 | 18.68 | 0.8020 | 20.72 | 0.8698 |
| CNN | M2SCN [22] | 37.75 | 0.9497 | 25.21 | 0.9175 | 26.11 | 0.9416 | 21.33 | 0.8289 | 24.22 | 0.8960 |
| Transformer | Restormer [9] | 36.72 | 0.9514 | 24.97 | 0.9248 | 26.77 | 0.9452 | 21.28 | 0.8362 | 24.34 | 0.9021 |
| Transformer | Dehamer [48] | 36.74 | 0.9459 | 20.94 | 0.8717 | 22.89 | 0.8708 | 19.80 | 0.8086 | 21.21 | 0.8504 |
| Transformer | DehazeFormer [38] | 39.62 | 0.9595 | 23.92 | 0.9121 | 25.94 | 0.9445 | 22.03 | 0.8373 | 23.96 | 0.8980 |
| Transformer | AIDTransformer [25] | - | - | 23.12 | 0.9052 | 25.08 | 0.9124 | 20.56 | 0.8325 | 22.92 | 0.8834 |
| Transformer | RSDformer [24] | 37.07 | 0.9575 | 24.06 | 0.9177 | 25.97 | 0.9390 | 22.87 | 0.8646 | 24.30 | 0.9071 |
| Mamba | MambaIR [18] | 38.65 | 0.9569 | 24.67 | 0.9267 | 26.36 | 0.9443 | 22.68 | 0.8570 | 24.57 | 0.9093 |
| Mamba | Weamba (Ours) | 39.86 | 0.9610 | 25.75 | 0.9284 | 27.50 | 0.9468 | 23.39 | 0.8702 | 25.55 | 0.9151 |
Table 2. Quantitative dehazing comparisons on the RICE dataset.

| Methods | DCP [6] | AOD-Net [43] | FFA-Net [47] | MSBDN [53] | LD-Net [44] | DehazeFormer [38] | RSDformer [24] | Ours |
|---|---|---|---|---|---|---|---|---|
| Category | Prior | CNN | CNN | CNN | CNN | Transformer | Transformer | Mamba |
| PSNR | 17.48 | 23.77 | 28.54 | 30.37 | 28.88 | 30.91 | 33.01 | 33.84 |
| SSIM | 0.7841 | 0.8731 | 0.9396 | 0.8584 | 0.9336 | 0.9350 | 0.9525 | 0.9582 |
Table 3. Quantitative deraining comparisons on the UAV-Rain1k dataset; “Input” denotes the unprocessed rainy images.

| Methods | Input | DSC [8] | RCDNet [50] | SPDNet [51] | Restormer [9] | IDT [52] | DRSformer [11] | Ours |
|---|---|---|---|---|---|---|---|---|
| Category | - | Prior | CNN | CNN | Transformer | Transformer | Transformer | Mamba |
| PSNR | 16.80 | 16.68 | 22.48 | 24.78 | 24.78 | 22.47 | 24.93 | 25.25 |
| SSIM | 0.7196 | 0.7142 | 0.8753 | 0.9054 | 0.8594 | 0.9054 | 0.9155 | 0.9080 |
4. Analysis and Discussion
4.1. Ablation Study
- Effectiveness of the MRSB. The core of our method is the design of the MRSB, which models global information. To demonstrate its effectiveness, we first compare it with the prevalent multi-directional scanning technique of [18] (see Figure 8). Table 4 reports the performance of models (a–c), which employ uni-directional, bi-directional, and four-directional scanning along a standard Z-shaped trajectory, respectively. We further report the results of different routing configurations within the MRSB, shown as models (d–f) in Table 4. Notably, for RS image restoration, the gain of model (c) over model (b) from adding scanning directions is relatively modest, which may be attributed to the information redundancy inherent in the multi-directional scanning paradigm. In contrast, our MRSB markedly improves performance by leveraging a variety of routing shapes, demonstrating a superior ability to exploit salient information for global feature modeling during scanning.
- Effectiveness of the LFMB. We introduce the LFMB to enable the model to capture the complex global and local dependencies in the data. To assess its impact, we compare performance with this component removed. Relative to model (f) in Table 4, our full method (g) yields a PSNR gain of 0.12 dB on the UAV-Rain1k dataset. Furthermore, as illustrated in Figure 9, integrating the LFMB helps our approach better preserve fine textures in RS images, underscoring its role in maintaining intricate details and improving the overall fidelity of the reconstructed images.
- Effectiveness of the number of LSSMs. The above experiments demonstrate the effectiveness of the LSSM. To further examine how the number of LSSMs affects the final results, we conduct additional ablation studies on the UAV-Rain1k dataset. As shown in Table 5, we set up four groups of control experiments. As the model downsamples to extract progressively deeper image features, more LSSMs are needed for effective modeling; in the shallow layers, where the amount of useful information is relatively small, an excessive number of LSSMs leads to information redundancy, which in turn limits performance. We therefore increase the number of LSSMs with depth to balance effectiveness and efficiency, as sketched below.
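In code, such a configuration simply sets the number of stacked LSSMs per encoder level (a sketch; `LSSM` is the block described in Section 3.3, and the channel doubling per level is our assumption):

```python
import torch.nn as nn

depths = [2, 3, 3, 4]   # final configuration from Table 5
base_dim = 32           # initial feature channels (Section 3.4)
encoder_levels = nn.ModuleList([
    nn.Sequential(*[LSSM(base_dim * 2 ** i) for _ in range(depth)])
    for i, depth in enumerate(depths)
])
```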
- Discussions on a closely related method. The recent MambaIR [18] presents a straightforward baseline for image restoration using a state space model. Our method differs from MambaIR in two significant ways. First, we devise a multi-router scanning strategy, in contrast to the multi-directional scanning technique employed by MambaIR. By traversing the scene along multiple, strategically chosen routes, it captures the spatially varying degradation in high-resolution RS images more comprehensively and strengthens hierarchical feature modeling within the state space framework. Second, we integrate local feature modulation, which enriches the state space model with critical local information so that global and local cues can be leveraged synergistically. This dual focus yields substantial improvements in reconstruction quality while preserving intricate details and contextual information, thereby achieving superior image fidelity and clarity. Our experimental results substantiate the efficacy of the proposed strategy for image restoration in remote sensing applications.
4.2. Model Complexity
4.3. Application
4.4. Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Rasti, B.; Chang, Y.; Dalsasso, E.; Denis, L.; Ghamisi, P. Image restoration for remote sensing: Overview and toolbox. IEEE Geosci. Remote Sens. Mag. 2021, 10, 201–230.
2. Mehta, A.; Sinha, H.; Mandal, M.; Narang, P. Domain-aware unsupervised hyperspectral reconstruction for aerial image dehazing. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 413–422.
3. Chang, W.; Chen, H.; He, X.; Chen, X.; Shen, L. UAV-Rain1k: A Benchmark for Raindrop Removal from UAV Aerial Imagery. arXiv 2024, arXiv:2402.05773.
4. Chen, X.; Pan, J.; Dong, J.; Tang, J. Towards unified deep image deraining: A survey and a new benchmark. arXiv 2023, arXiv:2310.03535.
5. Gui, J.; Cong, X.; Cao, Y.; Ren, W.; Zhang, J.; Zhang, J.; Cao, J.; Tao, D. A comprehensive survey and taxonomy on single image dehazing based on deep learning. ACM Comput. Surv. 2023, 55, 1–37.
6. He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 2341–2353.
7. Long, J.; Shi, Z.; Tang, W.; Zhang, C. Single remote sensing image dehazing. IEEE Geosci. Remote Sens. Lett. 2013, 11, 59–63.
8. Luo, Y.; Xu, Y.; Ji, H. Removing rain from a single image via discriminative sparse coding. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3397–3405.
9. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739.
10. Wang, Z.; Cun, X.; Bao, J.; Zhou, W.; Liu, J.; Li, H. Uformer: A general U-shaped transformer for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17683–17693.
11. Chen, X.; Li, H.; Li, M.; Pan, J. Learning a sparse transformer network for effective image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 5896–5905.
12. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
13. Chen, H.; Chen, X.; Lu, J.; Li, Y. Rethinking Multi-Scale Representations in Deep Deraining Transformer. Proc. AAAI Conf. Artif. Intell. 2024, 38, 1046–1053.
14. Wu, X.; Chen, H.; Chen, X.; Xu, G. Multi-scale transformer with conditioned prompt for image deraining. Digit. Signal Process. 2025, 156, 104847.
15. Gu, A.; Goel, K.; Ré, C. Efficiently modeling long sequences with structured state spaces. arXiv 2021, arXiv:2111.00396.
16. Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2023, arXiv:2312.00752.
17. Zhou, H.; Wu, X.; Chen, H.; Chen, X.; He, X. RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing. arXiv 2024, arXiv:2405.10030.
18. Guo, H.; Li, J.; Dai, T.; Ouyang, Z.; Ren, X.; Xia, S.T. MambaIR: A simple baseline for image restoration with state-space model. In Proceedings of the European Conference on Computer Vision (ECCV), Milan, Italy, 29 September–4 October 2024; pp. 222–241.
19. Liu, M.; Tang, L.; Fan, L.; Zhong, S.; Luo, H.; Peng, J. Towards Blind-Adaptive Remote Sensing Image Restoration. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4705212.
20. Li, Y.; Chen, X. A coarse-to-fine two-stage attentive network for haze removal of remote sensing images. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1751–1755.
21. Chen, X.; Li, Y.; Dai, L.; Kong, C. Hybrid high-resolution learning for single remote sensing satellite image dehazing. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
22. Li, S.; Zhou, Y.; Xiang, W. M2SCN: Multi-model self-correcting network for satellite remote sensing single-image dehazing. IEEE Geosci. Remote Sens. Lett. 2022, 20, 1–5.
23. Huang, B.; Zhi, L.; Yang, C.; Sun, F.; Song, Y. Single satellite optical imagery dehazing using SAR image prior based on conditional generative adversarial networks. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Snowmass, CO, USA, 1–5 March 2020; pp. 1806–1813.
24. Song, T.; Fan, S.; Li, P.; Jin, J.; Jin, G.; Fan, L. Learning an effective transformer for remote sensing satellite image dehazing. IEEE Geosci. Remote Sens. Lett. 2023, 20, 8002305.
25. Kulkarni, A.; Murala, S. Aerial image dehazing with attentive deformable transformers. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 6305–6314.
26. Smith, J.T.; Warrington, A.; Linderman, S.W. Simplified state space layers for sequence modeling. arXiv 2022, arXiv:2208.04933.
27. Ju, M.; Xie, S.; Li, F. Improving skip connection in U-Net through fusion perspective with Mamba for image dehazing. IEEE Trans. Consum. Electron. 2024, 70, 7505–7514.
28. Zheng, Z.; Wu, C. U-shaped vision Mamba for single image dehazing. arXiv 2024, arXiv:2402.04139.
29. Lei, X.; Zhang, W.; Cao, W. DVMSR: Distillated Vision Mamba for Efficient Super-Resolution. arXiv 2024, arXiv:2405.03008.
30. Xiao, Y.; Yuan, Q.; Jiang, K.; Chen, Y.; Zhang, Q.; Lin, C.W. Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution. arXiv 2024, arXiv:2405.04964.
31. Bai, J.; Yin, Y.; He, Q. Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement. arXiv 2024, arXiv:2405.03349.
32. Li, G.; Zhang, K.; Wang, T.; Li, M.; Zhao, B.; Li, X. Semi-LLIE: Semi-supervised Contrastive Learning with Mamba-based Low-light Image Enhancement. arXiv 2024, arXiv:2409.16604.
33. Weng, J.; Yan, Z.; Tai, Y.; Qian, J.; Yang, J.; Li, J. MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space. arXiv 2024, arXiv:2405.16105.
34. Zou, W.; Gao, H.; Yang, W.; Liu, T. Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement. arXiv 2024, arXiv:2408.01276.
35. Wu, X.; Lu, J.; Wu, J.; Li, Y. Multi-Scale Dilated Convolution Transformer for Single Image Deraining. In Proceedings of the 2023 IEEE 25th International Workshop on Multimedia Signal Processing (MMSP), Poitiers, France, 27–29 September 2023; pp. 1–6.
36. Zhu, L.; Liao, B.; Zhang, Q.; Wang, X.; Liu, W.; Wang, X. Vision Mamba: Efficient visual representation learning with bidirectional state space model. arXiv 2024, arXiv:2401.09417.
37. Liu, Y.; Tian, Y.; Zhao, Y.; Yu, H.; Xie, L.; Wang, Y.; Ye, Q.; Liu, Y. VMamba: Visual state space model. arXiv 2024, arXiv:2401.10166.
38. Song, Y.; He, Z.; Qian, H.; Du, X. Vision transformers for single image dehazing. IEEE Trans. Image Process. 2023, 32, 1927–1941.
39. Lin, D.; Xu, G.; Wang, X.; Wang, Y.; Sun, X.; Fu, K. A remote sensing image dataset for cloud removal. arXiv 2019, arXiv:1901.00600.
40. Wen, Y.; Gao, T.; Zhang, J.; Li, Z.; Chen, T. Encoder-free multi-axis physics-aware fusion network for remote sensing image dehazing. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4705915.
41. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
42. Loshchilov, I.; Hutter, F. SGDR: Stochastic gradient descent with warm restarts. arXiv 2016, arXiv:1608.03983.
43. Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. AOD-Net: All-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4770–4778.
44. Ullah, H.; Muhammad, K.; Irfan, M.; Anwar, S.; Sajjad, M.; Imran, A.S.; de Albuquerque, V.H.C. Light-DehazeNet: A novel lightweight CNN architecture for single image dehazing. IEEE Trans. Image Process. 2021, 30, 8968–8982.
45. Chen, D.; He, M.; Fan, Q.; Liao, J.; Zhang, L.; Hou, D.; Yuan, L.; Hua, G. Gated context aggregation network for image dehazing and deraining. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 7–11 January 2019; pp. 1375–1383.
46. Liu, X.; Ma, Y.; Shi, Z.; Chen, J. GridDehazeNet: Attention-based multi-scale network for image dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7314–7323.
47. Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature fusion attention network for single image dehazing. Proc. AAAI Conf. Artif. Intell. 2020, 34, 11908–11915.
48. Guo, C.L.; Yan, Q.; Anwar, S.; Cong, R.; Ren, W.; Li, C. Image dehazing transformer with transmission-aware 3D position embedding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5812–5820.
49. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
50. Wang, H.; Xie, Q.; Zhao, Q.; Meng, D. A model-driven deep neural network for single image rain removal. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3103–3112.
51. Yi, Q.; Li, J.; Dai, Q.; Fang, F.; Zhang, G.; Zeng, T. Structure-preserving deraining with residue channel prior guidance. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 4238–4247.
52. Xiao, J.; Fu, X.; Liu, A.; Wu, F.; Zha, Z.J. Image de-raining transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 12978–12995.
53. Dong, H.; Pan, J.; Xiang, L.; Hu, Z.; Zhang, X.; Wang, F.; Yang, M.H. Multi-scale boosted dehazing network with dense feature fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2157–2167.
54. Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Van Gool, L.; Timofte, R. SwinIR: Image restoration using Swin Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Montreal, QC, Canada, 11–17 October 2021; pp. 1833–1844.
Table 4. Ablation study of the scanning designs in the GSSB and of the LFMB on the UAV-Rain1k dataset. Uni-/Bi-/Four-Directional denote multi-directional scanning variants [18]; Z-/S-/W-Shape denote the routes of the proposed multi-router scanning.

| Model | Uni-Directional | Bi-Directional | Four-Directional | Z-Shape | S-Shape | W-Shape | LFMB | PSNR | SSIM |
|---|---|---|---|---|---|---|---|---|---|
| (a) | ✔ |  |  |  |  |  |  | 24.81 | 0.8984 |
| (b) |  | ✔ |  |  |  |  |  | 24.90 | 0.8988 |
| (c) |  |  | ✔ |  |  |  |  | 24.94 | 0.8990 |
| (d) |  |  |  | ✔ |  |  |  | 24.89 | 0.8985 |
| (e) |  |  |  | ✔ | ✔ |  |  | 25.02 | 0.8991 |
| (f) |  |  |  | ✔ | ✔ | ✔ |  | 25.13 | 0.9026 |
| (g) |  |  |  | ✔ | ✔ | ✔ | ✔ | 25.25 | 0.9080 |
Table 5. Ablation study on the number of LSSMs per encoder level, evaluated on the UAV-Rain1k dataset.

Configurations | PSNR | SSIM
---|---|---|
[2, 2, 2, 2] | 24.98 | 0.9051 |
[3, 3, 3, 3] | 25.12 | 0.9054 |
[4, 4, 4, 4] | 25.28 | 0.9077 |
[2, 3, 3, 4] | 25.25 | 0.9080 |
Table 6. Model complexity comparisons in terms of network parameters and FLOPs.

Methods | MSBDN [53] | Restormer [9] | Dehamer [48] | Uformer [10] | IDT [52] | DRSformer [11] | Ours
---|---|---|---|---|---|---|---|
Parameters (M) | 31.35 | 26.13 | 132.45 | 20.6 | 16.41 | 33.65 | 15.4 |
FLOPs (G) | 41.5 | 174.7 | 120.6 | 41.0 | 61.9 | 242.9 | 40.2 |
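The counts above can be obtained with a profiler such as the third-party `thop` package (a sketch; the input resolution used for the reported FLOPs is not stated here, so 256 × 256 is an assumption, and `Weamba()` is a hypothetical constructor):

```python
import torch
from thop import profile

model = Weamba()
x = torch.randn(1, 3, 256, 256)   # assumed profiling resolution
flops, params = profile(model, inputs=(x,))
print(f"Parameters: {params / 1e6:.2f} M, FLOPs: {flops / 1e9:.1f} G")
```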