A Data-Centric Solution to NonHomogeneous Dehazing via Vision Transformer
Yangyi Liu1, Huan Liu1, Liangyan Li1, Zijun Wu2 and Jun Chen1
1 McMaster University, Hamilton, Canada
2 China Telecom Research Institute, Shanghai, China
{liu5, lil61, chenjun}@mcmaster.ca, liuh127@outlook.com, wuzj12@chinatelecom.cn
Figure 1. Our results on the NTIRE 2023 dehazing challenge, achieving the best performance in terms of PSNR, SSIM and LPIPS.
handling the image dehazing problem. Specifically, benefiting from powerful network modules and vast training data, end-to-end approaches deliver promising results. However, as the distribution of haze becomes more complicated and non-homogeneous, many of them fail to achieve satisfying results. The reason for such failures is that the thickness of non-homogeneous haze is not determined entirely by the depth of the background scene.

Although researchers have made tremendous efforts collecting data with non-homogeneous haze, e.g., the NH-HAZE datasets [6–8], the quantity is still limited. A common belief is that models are prone to overfitting when a deep neural network is trained from scratch on such small datasets. A naive solution is to combine all the available non-homogeneous haze datasets to form a relatively larger one. However, due to the differences between datasets caused by a variety of factors, such as color distortion, object complexity and camera capability, it has been shown that a direct combination actually compromises the dehazing performance on individual datasets [22]. It remains a serious challenge to find a robust solution for the practical situation where both the quality and quantity of the available data are limited.

To address the above-mentioned problems, we adopt a two-branch framework consisting of state-of-the-art backbone networks, with a novel data-preprocessing transformation applied to the NH-HAZE datasets from previous years. Motivated by the idea of data-centric AI, namely that machine learning has matured to a point where high-performance model architectures are widely available while approaches to engineering datasets have lagged behind [1, 27], we put considerable effort into engineering the data. Inspired by the promising performance of gamma correction [15, 37], we propose a simple yet effective RGB-channel-wise data-preprocessing approach. We demonstrate its suitability for this competition setting, and argue that it is a promising principle for augmenting similar datasets. Details of this data-centric-AI-inspired preprocessing approach are discussed in later sections. Regarding the network architecture, we design our model under the two-branch framework [15, 36, 37]. In the first branch, we adopt the Swin Transformer V2 model [24] pre-trained on the ImageNet dataset [12] as the encoder. The Swin Transformer is credited with superseding previous methods in many transfer-learning contexts, where knowledge gained from a large-scale benchmark is adapted to task-specific datasets [20, 24]. Such pertinent features are of vital importance when dealing with small real-world non-homogeneous datasets [37]. Alongside a refined decoder and skip connections, the first branch extracts multi-level features of the hazy images. The second branch is introduced to complement the knowledge learned from the pre-trained model by working exclusively on the domain of the target data. For simplicity, we follow [37] and build the second branch with an RCAN [40]. Since there are no down-sampling or up-sampling operations in the second branch, we expect it to extract features distinct from those obtained by the first branch. Finally, a fusion tail aggregates the results from both branches and produces dehazed output images.

Overall, our contributions are summarized as follows. Firstly, we put forward a simple but effective data-preprocessing approach inspired by data-centric AI, leveraging extra data to significantly enhance our model. Secondly, we incorporate a state-of-the-art backbone into the two-branch framework. By carefully balancing the two branches, our model demonstrates promising results on limited-size datasets and outperforms other current approaches adopting this pipeline. Finally, we conduct extensive experiments to demonstrate the competitive performance of our proposed method. Through a substantial ablation study on different combinations of models and data, we hope to convince future competition participants to pay equal attention to model design and data engineering.

2. Related Works

In this section, we briefly review the literature on single image dehazing and learning with limited data.

Single Image Dehazing. Approaches proposed for single image dehazing fall into two categories: prior-based methods and learning-based methods. To guarantee performance, prior-based methods require reasonable assumptions and knowledge about hazy images to obtain accurate estimates of the transmission map and the atmospheric light intensity in atmospheric scattering model (ASM) based formulations [26]. Representative works in this category include [9, 14, 18, 34, 44]. Specifically, [34] observed that clear images have higher contrast than their hazy counterparts, and proposed a local contrast maximization method. Based on the assumption that image pixels in haze-free patches have intensity values close to zero in at least one color channel, [18] introduced the Dark Channel Prior (DCP). [44] presented a linear model adopting the color attenuation prior (CAP) to estimate depth from the difference between the brightness and the saturation of hazy images. Prior-based methods left a permanent mark on single image dehazing, but their vulnerability in variable scenes pivoted researchers to another direction: learning-based methods. With the advances in neural networks, [10, 11, 21, 29, 39, 42] have proposed progressively more powerful models that are capable of directly recovering the clean image from the hazy input without estimating the transmission map and depth. The superiority of these methods in removing homogeneous haze is attributed to the availability of large training datasets. When applied to non-homogeneous haze, however, they fail to yield comparable results.
Figure 2. Comparison of RGB-wise distribution of datasets (GT) before and after being processed by our proposed method.
The limited quantity of existing non-homogeneous haze datasets prevents researchers from adopting simple end-to-end training methods.

Learning with Limited Data. Data is indispensable for AI models. Many models demand a huge dataset for training, but large datasets are not always available, which urges researchers to find ways to train with limited data. In terms of dehazing, a seemingly straightforward solution to the issues caused by small non-homogeneous training datasets is to compose a relatively large dataset by combining several small ones. For the NTIRE 2023 challenge [8], this can be done by augmenting the NH-HAZE datasets (the augmented data) [6, 7] with this year's data (the target data). Surprisingly, against the common belief that a larger dataset is always better in deep learning, [22] observed that models perform better when training and testing are conducted on a single dataset (as opposed to the union of all datasets). This observation indicates that the augmented data lies in a different domain than the target data; direct aggregation introduces a domain-shift problem within the dataset. Hence, [22] proposed a test-time training strategy to mitigate the problem, while [15, 31, 37] chose to adjust the domains of the training data before feeding them into the dehazing modules. Interestingly, the idea of focusing on improving the dataset rather than the model was introduced by the Data-Centric AI competition [1]. Data-centric AI is anticipated to deliver a set of approaches for dataset optimization, thereby enabling deep neural networks to be effectively trained using smaller datasets [27]. The proposed techniques range widely from simple ones to complex combinations [38]. Through our experiments and qualitative analysis, we find that an overly simple approach, such as the gamma correction adopted by [15, 37], fails to recover color accurately, whereas a complicated method, such as the domain adaptation of [31], which learns a separate neural network to translate the data, is infeasible due to the scarcity of the available data and its lack of depth information. In the next section, we introduce our solution, which stood out in the NTIRE challenge setting.

3. Proposed Method

In this section, we introduce the details of our methodology following the order of the working pipeline. Firstly, we present the data-preprocessing method inspired by the idea of data-centric AI. Secondly, we describe our model architecture and the function of each component. Finally, we introduce the loss functions applied to train our networks.

3.1. Data-Centric Engineering

Systematically engineering the data is a key requirement for training deep neural networks. The idea of data-centric AI further emphasizes assessing data quality before deployment [38]. Comparing the NH-HAZE20 and NH-HAZE21 datasets [6, 7] to the data provided this year, both numerically and empirically, we notice an obvious color discrepancy. When evaluating on this year's test data, training on a direct combination of all data does not boost the score compared to training on this year's data only (see results in Section 4.3.1). Therefore, our goal is to propose an approach that reduces the color differences and shifts the distribution of the augmented data towards that of the target data. Inspired by the successful application of gamma correction [15, 37] as a simple yet effective data-preprocessing technique, we propose a more systematic solution for data engineering. Instead of following the practice in [15, 37] of applying gray-scale gamma correction, we correct each of the R, G, B channels separately:

O_{R,G,B} = \left(\frac{I_{R,G,B}}{255}\right)^{\frac{1}{\gamma_{R,G,B}}} \qquad (1)

where O and I are the output and input pixel intensities (∈ [0, 255]), respectively, and γ is the gamma factor. The subscripts R, G, B indicate that the values for the different channels are unique.
[Figure 3 diagram. Transfer Learning Branch: patch partition and linear embedding, SwinT Block stages (×2, ×18, ×2) forming the encoder, followed by attention (CA+PA) blocks with up-sampling and an enhancing block as the decoder. Data Fitting Branch: 4 residual channel attention groups (RCAG), each containing 10 residual channel attention blocks (RCAB) with 3×3 convolutions. Fusion tail: reflection padding, 7×7 convolution and Tanh.]
Figure 3. An overview of our network. The model consists of two branches. The transfer learning branch is composed of a Swin Transformer based model. The data fitting branch consists of residual channel attention groups.
As for implementation, we first calculate the average pixel intensity of each channel for the three datasets; then, for each channel of the NH-HAZE20 or NH-HAZE21 dataset, we apply the transformation with a unique gamma value to all the pixels, resulting in mean and variance values similar to those of the corresponding channel of the NH-HAZE23 dataset. In Figure 2, we present the histogram changes with the corresponding γ values. Visually, our method adjusts the color of the NH-HAZE20 and 21 data to become much more similar to the NH-HAZE23 data. Numerically, the average pixel intensity of the 2023 data is 107.46 (R), 114.48 (G), 101.92 (B). After applying our method, the adjusted average pixel intensity of the NH-HAZE20 data is 107.77 (R), 114.33 (G), 102.08 (B), and that of the NH-HAZE21 data is 107.43 (R), 115.01 (G), 102.13 (B). Note that we apply this preprocessing not only to the clean ground truth images but also to the hazy images (as opposed to [15, 37], which manipulate only the ground truth images).
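To make the fitting procedure concrete, below is a minimal NumPy sketch of the per-channel gamma estimation described above. It is an illustrative reconstruction rather than our released code: the bisection search and the helper names (fit_channel_gamma, apply_rgb_gamma) are assumptions, and Eq. (1) is rescaled by 255 so that outputs remain in [0, 255].

```python
import numpy as np

def apply_rgb_gamma(img, gammas):
    """Eq. (1) per channel, rescaled to [0, 255]: O = 255 * (I/255)^(1/gamma).

    img: uint8 array of shape (H, W, 3); gammas: (gamma_R, gamma_G, gamma_B).
    """
    out = np.empty(img.shape, dtype=np.float64)
    for c in range(3):
        out[..., c] = 255.0 * (img[..., c] / 255.0) ** (1.0 / gammas[c])
    return out.round().clip(0, 255).astype(np.uint8)

def fit_channel_gamma(src_pixels, target_mean, lo=0.2, hi=5.0, iters=40):
    """Bisection for the gamma that matches the channel's mean intensity to
    target_mean. For intensities in [0, 1], the corrected mean increases
    monotonically with gamma, so a simple bisection converges."""
    x = src_pixels.astype(np.float64) / 255.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (255.0 * x ** (1.0 / mid)).mean() < target_mean:
            lo = mid  # corrected mean still too dark: increase gamma
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In our pipeline, such a fit would be run once per color channel over all NH-HAZE20 (or NH-HAZE21) pixels, with target_mean taken from the corresponding NH-HAZE23 channel, and the resulting γ applied to both the hazy and the ground truth images.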
With this novel data-preprocessing method, the distributions of all three color channels of the NH-HAZE20 and 21 data are shifted closer to those of the NH-HAZE23 dataset. Benefiting from more in-distribution data, the models gain substantial improvements. Being able to work with a small but good dataset, rather than a larger but internally divergent one, helped us stand out in the competition, which aligns with the idea of data-centric AI [27, 38]. For future competition participants, we recommend this approach as a good starting point for data engineering.

3.2. Network Architecture

As shown in Figure 3, the pre-processed data is fed into a two-branch model architecture. This two-branch framework has been successfully employed in various computer vision tasks [19], and has helped several works [15, 36, 37] win awards in past NTIRE challenges. In our implementation, the first branch, the Transfer Learning Branch, aims to extract pertinent features of the inputs with pre-trained weight initialization. The second branch, the Data Fitting Branch, is responsible for complementing the knowledge learned by the first branch and for working on the domain of the target data. The fusion tail aggregates the outputs from both branches and produces dehazed images.

Swin Transformer based Transfer Learning. To leverage the power of transfer learning [33], we use the ImageNet [12] pre-trained Swin Transformer [24] as the backbone of our encoder. The Swin Transformer achieves state-of-the-art performance in many vision tasks. It is exceptionally efficient and more accurate than its predecessor, the Vision Transformer (ViT) [13], which struggles with high-resolution images because its complexity is quadratic in the input size. The working pipeline of the Swin Transformer is summarized as follows. First, it splits an input image into non-overlapping patches with a patch splitting module. Each patch is treated as a "token" whose feature is the concatenation of its raw pixel RGB values; a linear embedding layer then projects these features to an arbitrary dimension. The tokens are processed by a cascade of stages, each consisting of a linear embedding layer and several Swin Transformer
Block (SwinT Block) modules. The SwinT Block uses cyclic shifting with multi-head self-attention (MSA) modules to implement efficient batch computation for shifted window partitioning. From one stage to the next, the spatial dimensions of the feature maps are effectively reduced, resulting in hierarchical feature maps. These modules compose the encoder part of the Transfer Learning Branch. For the decoder part, we adopt the ideas from [15, 37]. With skip connections, the attention blocks and up-sampling layers gradually restore the hierarchical feature maps and produce an output with the same spatial dimensions as the input.
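As a rough illustration of the patch-partition and linear-embedding step summarized above, below is a simplified PyTorch sketch. It is not the actual Swin Transformer V2 implementation: the 4 × 4 patch size and 96-dimensional embedding mirror the common Swin-T configuration and are assumptions here.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into non-overlapping patches and linearly embed them.

    Each 4x4 RGB patch (48 raw values) becomes one token projected to
    embed_dim channels; a strided convolution is equivalent to
    "partition, flatten, then apply a shared linear layer".
    """
    def __init__(self, patch_size=4, in_chans=3, embed_dim=96):
        super().__init__()
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                    # x: (B, 3, H, W)
        x = self.proj(x)                     # (B, embed_dim, H/4, W/4)
        return x.flatten(2).transpose(1, 2)  # (B, num_tokens, embed_dim)

tokens = PatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 3136, 96])
```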
Rest of the Model. We adopt the Data Fitting Branch from [40], which is based on the residual channel attention block [40]. Trained from scratch, this second branch complements the first one by working exclusively on the domain of the target data. With no down-sampling or up-sampling operations, this branch operates in full-resolution mode and thus extracts features distinct from those obtained by the first branch. A simple yet effective fusion tail, consisting of a reflection padding layer, a 7 × 7 convolutional layer and a Tanh activation [37], combines the features from the two branches and produces dehazed images.
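For concreteness, here is a minimal PyTorch sketch of one residual channel attention block in the spirit of RCAN [40] and of the fusion tail described above. The reflection padding, 7 × 7 convolution and Tanh follow the text; the channel count and reduction ratio are illustrative assumptions, not the exact training configuration.

```python
import torch
import torch.nn as nn

class RCAB(nn.Module):
    """Residual channel attention block: two 3x3 convs, then channel-wise
    re-weighting from globally pooled statistics, with a skip connection."""
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid())

    def forward(self, x):
        res = self.body(x)
        return x + res * self.attn(res)

class FusionTail(nn.Module):
    """Reflection padding + 7x7 conv + Tanh over the concatenated branches."""
    def __init__(self, in_chans=128, out_chans=3):
        super().__init__()
        self.tail = nn.Sequential(
            nn.ReflectionPad2d(3),
            nn.Conv2d(in_chans, out_chans, kernel_size=7),
            nn.Tanh())

    def forward(self, feat_transfer, feat_fitting):
        return self.tail(torch.cat([feat_transfer, feat_fitting], dim=1))
```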
3.3. Loss Functions

SSIM Loss. To supervise the structural fidelity of the dehazed output O against the ground truth G, we adopt the SSIM loss; the calculation for the i-th pixel follows:

\text{SSIM}(i) = \frac{2\mu_O\mu_G + C_1}{\mu_O^2 + \mu_G^2 + C_1} \cdot \frac{2\sigma_{OG} + C_2}{\sigma_O^2 + \sigma_G^2 + C_2} \qquad (4)

where μ and σ denote local mean and (co)variance statistics, and C1 and C2 help stabilize the division.
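A compact sketch of a loss consistent with Eq. (4): local statistics are computed over an 11 × 11 neighborhood via average pooling. The uniform window and the constants C1 = 0.01², C2 = 0.03² are common SSIM defaults and assumptions here (a Gaussian window is another frequent choice).

```python
import torch.nn.functional as F

def ssim_loss(output, gt, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """Return 1 - mean SSIM, with Eq. (4) evaluated at every pixel.

    output, gt: (B, C, H, W) tensors in [0, 1]. Border windows include
    zero padding, so edge statistics are approximate.
    """
    pad = window // 2
    mu_o = F.avg_pool2d(output, window, 1, pad)  # local means
    mu_g = F.avg_pool2d(gt, window, 1, pad)
    var_o = F.avg_pool2d(output * output, window, 1, pad) - mu_o ** 2
    var_g = F.avg_pool2d(gt * gt, window, 1, pad) - mu_g ** 2
    cov = F.avg_pool2d(output * gt, window, 1, pad) - mu_o * mu_g
    luminance = (2 * mu_o * mu_g + c1) / (mu_o ** 2 + mu_g ** 2 + c1)
    structure = (2 * cov + c2) / (var_o + var_g + c2)
    return 1.0 - (luminance * structure).mean()
```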
Perceptual Loss. Besides pixel-scale supervision, to account for perceptual quality we adopt an ImageNet [12] pre-trained VGG16 [32] to measure perceptual similarity, which helps reconstruct finer details [43]. Denoting x and y as the hazy input and the ground truth image respectively, the loss is defined as:

L_{perc} = \frac{1}{N}\sum_{j}\frac{1}{C_j H_j W_j}\left\|\phi_j(f_\theta(x)) - \phi_j(y)\right\|_2^2 \qquad (5)

where f_θ(x) is the dehazed image and ϕ_j(·) denotes the j-th feature map, whose distance we measure with the L2 loss. N denotes the number of feature maps.
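Below is a sketch of Eq. (5) built on torchvision's VGG16. The tapped layer indices (3, 8, 15, i.e., relu1_2, relu2_2, relu3_3) are a common choice and an assumption here, not necessarily the exact layers used in our experiments.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class PerceptualLoss(nn.Module):
    """Eq. (5): mean squared distance between VGG16 feature maps of the
    dehazed output and the ground truth, averaged over N tapped layers."""
    def __init__(self, layer_ids=(3, 8, 15)):
        super().__init__()
        self.features = vgg16(pretrained=True).features.eval()
        for p in self.features.parameters():
            p.requires_grad = False  # frozen feature extractor
        self.layer_ids = set(layer_ids)

    def forward(self, dehazed, gt):
        loss, x, y = 0.0, dehazed, gt
        for i, layer in enumerate(self.features):
            x, y = layer(x), layer(y)
            if i in self.layer_ids:
                # mean over C_j * H_j * W_j realizes the 1/(C_j H_j W_j) factor
                loss = loss + torch.mean((x - y) ** 2)
            if i >= max(self.layer_ids):
                break
        return loss / len(self.layer_ids)
```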
Adversarial Loss. To compensate for the risk that pixel-wise loss functions fail to provide sufficient supervision when training on a small dataset, we additionally employ an adversarial loss [43].
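As one standard instantiation of the adversarial term (an assumption here, since several GAN variants are common in dehazing work [15, 37, 43]), the generator-side loss can be sketched as a non-saturating objective that pushes the discriminator to label dehazed outputs as real:

```python
import torch
import torch.nn.functional as F

def adversarial_loss_g(discriminator, dehazed):
    """Generator-side non-saturating GAN loss; `discriminator` is assumed
    to return raw logits for each input image (or patch)."""
    logits = discriminator(dehazed)
    return F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
```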
Table 1. Ablation study of architectures and data-preprocessing techniques (GC: gray-scale gamma correction; RGB: our RGB-channel-wise correction). The scores are evaluated using the NTIRE2023 online validation server.

Data                  | Res2Net+RCAN  | Ours
                      | PSNR   SSIM   | PSNR   SSIM
NH-HAZE23 only        | 20.68  0.678  | 21.54  0.682
NH-HAZE20+21+23       | 20.86  0.688  | 21.54  0.689
NH-HAZE20+21+23 GC    | 21.08  0.690  | 21.58  0.693
NH-HAZE20+21+23 RGB   | 21.26  0.693  | 21.94  0.697
Figure 5. Comparison of RGB-wise distribution of datasets (hazy) before and after being processed by our proposed method.
Figure 7. Qualitative evaluation on the four representative datasets, i.e., DENSE-HAZE, NH-HAZE20, NH-HAZE21 and NH-HAZE23. For DENSE-HAZE and NH-HAZE20, we follow the official train, val and test splits. For NH-HAZE21 and NH-HAZE23, due to the unavailability of test data, we split the released official training data into our own training and test sets.
Table 2. Quantitative evaluation on the DENSE-HAZE, NH-HAZE20, NH-HAZE21 and NH-HAZE23 datasets. The best results are marked in bold, and the second-best results are underlined.
It is worth noticing that our model substantially outperforms the Res2Net+RCAN model only on O-HAZE and NTIRE2023. We attribute this to the large increase in image resolution on the O-HAZE and NH-HAZE23 datasets. For example, the number of pixels in the NH-HAZE23 data is 6.25 times larger than that of the combined NH-HAZE20 and NH-HAZE21 datasets. Since our Transformer-based model contains more learnable parameters, a larger training dataset can substantially alleviate the overfitting problem. This phenomenon further indicates that, in a limited-data setting, it is more critical to invest effort in a data-centric manner than to simply improve the model's capacity.

5. Conclusion

In this paper, we propose a method targeting non-homogeneous dehazing. It consists of a data-preprocessing strategy inspired by data-centric AI and a Transformer-based two-branch model structure. Combining them, we construct a solution that outperforms state-of-the-art methods, which supports our advocacy of treating the model and the data as equally important. Additionally, extensive experimental results provide strong support for the effectiveness of our method.
References

[1] Data-centric AI competition submission guide, 2021.
[2] Cosmin Ancuti, Codruta O. Ancuti, and Radu Timofte. NTIRE 2018 challenge on image dehazing: Methods and results. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 891–901, 2018.
[3] Codruta O. Ancuti, Cosmin Ancuti, Mateu Sbert, and Radu Timofte. Dense haze: A benchmark for image dehazing with dense-haze and haze-free images. In IEEE International Conference on Image Processing (ICIP), 2019.
[4] C. O. Ancuti, C. Ancuti, R. Timofte, L. Van Gool, L. Zhang, M. Yang, T. Guo, X. Li, V. Cherukuri, V. Monga, H. Jiang, S. Yang, Y. Liu, X. Qu, P. Wan, D. Park, S. Y. Chun, M. Hong, J. Huang, Y. Chen, S. Chen, B. Wang, P. N. Michelini, H. Liu, D. Zhu, J. Liu, S. Santra, R. Mondal, B. Chanda, P. Morales, T. Klinghoffer, L. M. Quan, Y. Kim, X. Liang, R. Li, J. Pan, J. Tang, K. Purohit, M. Suin, A. N. Rajagopalan, R. Schettini, S. Bianco, F. Piccoli, C. Cusano, L. Celona, S. Hwang, Y. S. Ma, H. Byun, S. Murala, A. Dudhane, H. Aulakh, T. Zheng, T. Zhang, W. Qin, R. Zhou, S. Wang, J. Tarel, C. Wang, and J. Wu. NTIRE 2019 image dehazing challenge report. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 2241–2253, 2019.
[5] Codruta O. Ancuti, Cosmin Ancuti, Radu Timofte, and Christophe De Vleeschouwer. O-HAZE: A dehazing benchmark with real hazy and haze-free outdoor images. In IEEE Conference on Computer Vision and Pattern Recognition, NTIRE Workshop, 2018.
[6] Codruta O. Ancuti, Cosmin Ancuti, Florin-Alexandru Vasluianu, and Radu Timofte. NTIRE 2020 challenge on nonhomogeneous dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 490–491, 2020.
[7] Codruta O. Ancuti, Cosmin Ancuti, Florin-Alexandru Vasluianu, and Radu Timofte. NTIRE 2021 nonhomogeneous dehazing challenge report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 627–646, 2021.
[8] Codruta O. Ancuti, Cosmin Ancuti, Florin-Alexandru Vasluianu, and Radu Timofte. NTIRE 2023 challenge on nonhomogeneous dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023.
[9] Dana Berman, Shai Avidan, et al. Non-local image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1674–1682, 2016.
[10] Bolun Cai, Xiangmin Xu, Kui Jia, Chunmei Qing, and Dacheng Tao. DehazeNet: An end-to-end system for single image haze removal. IEEE Transactions on Image Processing, 25(11):5187–5198, 2016.
[11] D. Chen, M. He, Q. Fan, J. Liao, L. Zhang, D. Hou, L. Yuan, and G. Hua. Gated context aggregation network for image dehazing and deraining. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1375–1383, 2019.
[12] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, 2009.
[13] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
[14] Raanan Fattal. Dehazing using color-lines. ACM Transactions on Graphics (TOG), 34(1):1–14, 2014.
[15] Minghan Fu, Huan Liu, Yankun Yu, Jun Chen, and Keyan Wang. DW-GAN: A discrete wavelet transform GAN for nonhomogeneous dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 203–212, 2021.
[16] Shanghua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, and Philip H. S. Torr. Res2Net: A new multi-scale backbone architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[17] Ross Girshick. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015.
[18] Kaiming He, Jian Sun, and Xiaoou Tang. Single image haze removal using dark channel prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(12):2341–2353, 2010.
[19] Robert A. Jacobs, Michael I. Jordan, Steven J. Nowlan, and Geoffrey E. Hinton. Adaptive mixtures of local experts. Neural Computation, 3(1):79–87, 1991.
[20] Simon Kornblith, Jonathon Shlens, and Quoc V. Le. Do better ImageNet models transfer better? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[21] Boyi Li, Xiulian Peng, Zhangyang Wang, Jizheng Xu, and Dan Feng. AOD-Net: All-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision, pages 4770–4778, 2017.
[22] Huan Liu, Zijun Wu, Liangyan Li, Sadaf Salehkalaibar, Jun Chen, and Keyan Wang. Towards multi-domain single image dehazing via test-time training. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5831–5840, 2022.
[23] Jing Liu, Haiyan Wu, Yuan Xie, Yanyun Qu, and Lizhuang Ma. Trident dehazing network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020.
[24] Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, et al. Swin Transformer V2: Scaling up capacity and resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12009–12019, 2022.
[25] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
[26] William Edgar Knowles Middleton. Vision Through the Atmosphere. University of Toronto Press, 1952.
[27] Mohammad Motamedi, Nikolay Sakharnykh, and Tim Kaldewey. A data-centric approach for training deep neural networks with less data, 2021.
[28] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. 2017.
[29] Xu Qin, Zhilin Wang, Yuanchao Bai, Xiaodong Xie, and Huizhu Jia. FFA-Net: Feature fusion attention network for single image dehazing. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07):11908–11915, 2020.
[30] Wenqi Ren, Lin Ma, Jiawei Zhang, Jinshan Pan, Xiaochun Cao, Wei Liu, and Ming-Hsuan Yang. Gated fusion network for single image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3253–3261, 2018.
[31] Yuanjie Shao, Lerenhan Li, Wenqi Ren, Changxin Gao, and Nong Sang. Domain adaptation for image dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2808–2817, 2020.
[32] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[33] Chuanqi Tan, Fuchun Sun, Tao Kong, Wenchang Zhang, Chao Yang, and Chunfang Liu. A survey on deep transfer learning, 2018.
[34] Robby T. Tan. Visibility in bad weather from a single image. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8, 2008.
[35] Haiyan Wu, Jing Liu, Yuan Xie, Yanyun Qu, and Lizhuang Ma. Knowledge transfer dehazing network for nonhomogeneous dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020.
[36] Haiyan Wu, Jing Liu, Yuan Xie, Yanyun Qu, and Lizhuang Ma. Knowledge transfer dehazing network for nonhomogeneous dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 478–479, 2020.
[37] Yankun Yu, Huan Liu, Minghan Fu, Jun Chen, Xiyao Wang, and Keyan Wang. A two-branch neural network for non-homogeneous dehazing via ensemble learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 193–202, 2021.
[38] Daochen Zha, Zaid Pervaiz Bhat, Kwei-Herng Lai, Fan Yang, and Xia Hu. Data-centric AI: Perspectives and challenges, 2023.
[39] He Zhang and Vishal M. Patel. Densely connected pyramid dehazing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3194–3203, 2018.
[40] Yulun Zhang, Kunpeng Li, Kai Li, Lichen Wang, Bineng Zhong, and Yun Fu. Image super-resolution using very deep residual channel attention networks. In Proceedings of the European Conference on Computer Vision (ECCV), pages 286–301, 2018.
[41] Hang Zhao, Orazio Gallo, Iuri Frosio, and Jan Kautz. Loss functions for image restoration with neural networks. IEEE Transactions on Computational Imaging, 3(1):47–57, 2016.
[42] Zhaorun Zhou, Zhenghao Shi, Mingtao Guo, Yaning Feng, and Minghua Zhao. CGGAN: A context guided generative adversarial network for single image dehazing, 2020.
[43] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017.
[44] Qingsong Zhu, Jiaming Mai, and Ling Shao. A fast single image haze removal algorithm using color attenuation prior. IEEE Transactions on Image Processing, 24(11):3522–3533, 2015.