
WaveCNet: Wavelet Integrated CNNs to Suppress Aliasing Effect for Noise-Robust Image Classification

Published: 01 January 2021

Abstract

Though widely used in image classification, convolutional neural networks (CNNs) are vulnerable to noise, i.e., small image perturbations can drastically change the CNN output. To improve noise robustness, we integrate CNNs with wavelets by replacing the common down-sampling operations (max-pooling, strided convolution, and average pooling) with the discrete wavelet transform (DWT). We first propose general DWT and inverse DWT (IDWT) layers applicable to various orthogonal and biorthogonal discrete wavelets, such as Haar, Daubechies, and Cohen wavelets, and then design wavelet-integrated CNNs (WaveCNets) by integrating DWT into commonly used CNNs (VGG, ResNets, and DenseNet). During down-sampling, WaveCNets apply DWT to decompose the feature maps into low-frequency and high-frequency components. The low-frequency component, which contains the main information including the basic object structures, is transmitted to the following layers to generate robust high-level features; the high-frequency components are dropped to remove most of the data noise. Experimental results show that WaveCNets achieve higher accuracy on ImageNet than the corresponding vanilla CNNs. We have also tested WaveCNets on a noisy version of ImageNet, on ImageNet-C, and against six adversarial attacks; the results suggest that the proposed DWT/IDWT layers provide better noise robustness and adversarial robustness. When WaveCNets are used as backbones, the performance of object detectors (Faster R-CNN and RetinaNet) on the COCO detection dataset is consistently improved. We believe that the suppression of the aliasing effect, i.e., the separation of low-frequency and high-frequency information, is the main advantage of our approach. The code for the DWT/IDWT layers and the various WaveCNets is available at https://github.com/CVI-SZU/WaveCNet.
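The DWT-based down-sampling described in the abstract can be illustrated with a minimal single-level 2D Haar DWT in NumPy. This is a sketch for intuition only, not the paper's implementation (the actual DWT/IDWT layers in the linked repository are PyTorch modules supporting many orthogonal and biorthogonal wavelets); the function names here are hypothetical.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT of an even-sized 2D array.

    Returns the low-frequency band LL and the three high-frequency
    bands (LH, HL, HH), each at half resolution. In WaveCNet-style
    down-sampling, only LL is passed to the following layers; the
    high-frequency bands, which carry most of the noise, are dropped.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (low frequency)
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, (lh, hl, hh)

def haar_idwt2(ll, bands):
    """Inverse of haar_dwt2: reconstructs the input exactly."""
    lh, hl, hh = bands
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

# A flat image region: all information ends up in LL and the
# detail bands are exactly zero, so dropping them loses nothing.
flat = np.ones((4, 4))
ll, details = haar_dwt2(flat)
assert np.allclose(ll, 2.0)
assert all(np.allclose(band, 0.0) for band in details)

# With all four bands kept, the transform is perfectly invertible.
rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
ll, bands = haar_dwt2(img)
assert np.allclose(haar_idwt2(ll, bands), img)
```

With the orthonormal Haar normalization used here, the LL coefficient is twice the mean of each 2x2 block, so keeping only LL behaves like average pooling but within a filter bank that cleanly separates the low-frequency content from the high-frequency detail (and noise) before sub-sampling, which is what suppresses aliasing.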


Cited By

  • (2024) A Prior Guided Wavelet-Spatial Dual Attention Transformer Framework for Heavy Rain Image Restoration, IEEE Transactions on Multimedia, vol. 26, pp. 7043-7057, doi: 10.1109/TMM.2024.3359480. Online publication date: 29-Jan-2024.
  • (2024) The effect of noise removal filters on classifying different types of medical images, Digital Signal Processing, vol. 153, doi: 10.1016/j.dsp.2024.104613. Online publication date: 1-Oct-2024.
  • (2023) Illumination Guided Attentive Wavelet Network for Low-Light Image Enhancement, IEEE Transactions on Multimedia, vol. 25, pp. 6258-6271, doi: 10.1109/TMM.2022.3207330. Online publication date: 1-Jan-2023.
  • (2023) Divide-and-Conquer Completion Network for Video Inpainting, IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 6, pp. 2753-2766, doi: 10.1109/TCSVT.2022.3225911. Online publication date: 1-Jun-2023.
  • (2023) Dual Wavelet Attention Networks for Image Classification, IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 4, pp. 1899-1910, doi: 10.1109/TCSVT.2022.3218735. Online publication date: 1-Apr-2023.
  • (2023) Adversarial Attacks and Defenses in Machine Learning-Empowered Communication Systems and Networks: A Contemporary Survey, IEEE Communications Surveys & Tutorials, vol. 25, no. 4, pp. 2245-2298, doi: 10.1109/COMST.2023.3319492. Online publication date: 1-Oct-2023.
  • (2023) A test for the absence of aliasing or white noise in two-dimensional locally stationary wavelet processes, Statistics and Computing, vol. 33, no. 5, doi: 10.1007/s11222-023-10269-5. Online publication date: 24-Jul-2023.
  • (2023) WaveCNNs-AT: Wavelet-based deep CNNs of adaptive threshold for signal recognition, Applied Intelligence, vol. 53, no. 23, pp. 28819-28831, doi: 10.1007/s10489-023-05047-9. Online publication date: 1-Dec-2023.
  • (2023) Multiple frequency–spatial network for RGBT tracking in the presence of motion blur, Neural Computing and Applications, vol. 35, no. 34, pp. 24389-24406, doi: 10.1007/s00521-023-09024-8. Online publication date: 1-Dec-2023.
  • (2022) Adder Wavelet for Better Image Classification under Adder Neural Network, in Proc. 2022 6th International Conference on Computer Science and Artificial Intelligence, pp. 63-68, doi: 10.1145/3577530.3577540. Online publication date: 9-Dec-2022.


Published In

IEEE Transactions on Image Processing, Volume 30, 2021, 5053 pages

Publisher

IEEE Press


Qualifiers

  • Research-article


