
WaveCNet: Wavelet Integrated CNNs to Suppress Aliasing Effect for Noise-Robust Image Classification

Published: 01 January 2021

Abstract

Though widely used in image classification, convolutional neural networks (CNNs) are vulnerable to noise, i.e., small image perturbations can drastically change the CNN output. To improve noise robustness, we integrate CNNs with wavelets by replacing the common down-sampling operations (max-pooling, strided convolution, and average pooling) with the discrete wavelet transform (DWT). We first propose general DWT and inverse DWT (IDWT) layers applicable to various orthogonal and biorthogonal discrete wavelets, such as Haar, Daubechies, and Cohen wavelets, and then design wavelet-integrated CNNs (WaveCNets) by integrating DWT into commonly used CNNs (VGG, ResNets, and DenseNet). During down-sampling, WaveCNets apply DWT to decompose the feature maps into low-frequency and high-frequency components. The low-frequency component, which contains the main information including the basic object structures, is transmitted to the following layers to generate robust high-level features; the high-frequency components are dropped to remove most of the data noise. Experimental results show that WaveCNets achieve higher accuracy on ImageNet than the corresponding vanilla CNNs. We have also tested WaveCNets on a noisy version of ImageNet, on ImageNet-C, and against six adversarial attacks; the results suggest that the proposed DWT/IDWT layers provide better noise robustness and adversarial robustness. When WaveCNets are used as backbones, the performance of object detectors (Faster R-CNN and RetinaNet) on the COCO detection dataset is consistently improved. We believe that the suppression of the aliasing effect, i.e., the separation of low-frequency and high-frequency information, is the main advantage of our approach. The code for the DWT/IDWT layers and the various WaveCNets is available at https://github.com/CVI-SZU/WaveCNet.
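The DWT-based down-sampling described in the abstract can be illustrated with a minimal single-level 2D Haar DWT in NumPy. This is a sketch for intuition only, not the paper's implementation (the actual DWT/IDWT layers in the linked repository are PyTorch modules supporting many orthogonal and biorthogonal wavelets); the function names here are hypothetical.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT of an even-sized 2D array.

    Returns the low-frequency band LL and the three high-frequency
    bands (LH, HL, HH), each at half resolution. In WaveCNet-style
    down-sampling, only LL is passed to the following layers; the
    high-frequency bands, which carry most of the noise, are dropped.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (low frequency)
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, (lh, hl, hh)

def haar_idwt2(ll, bands):
    """Inverse of haar_dwt2: reconstructs the input exactly."""
    lh, hl, hh = bands
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

# A flat image region: all information ends up in LL and the
# detail bands are exactly zero, so dropping them loses nothing.
flat = np.ones((4, 4))
ll, details = haar_dwt2(flat)
assert np.allclose(ll, 2.0)
assert all(np.allclose(band, 0.0) for band in details)

# With all four bands kept, the transform is perfectly invertible.
rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
ll, bands = haar_dwt2(img)
assert np.allclose(haar_idwt2(ll, bands), img)
```

With the orthonormal Haar normalization used here, the LL coefficient is twice the mean of each 2x2 block, so keeping only LL behaves like average pooling but within a filter bank that cleanly separates the low-frequency content from the high-frequency detail (and noise) before sub-sampling, which is what suppresses aliasing.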


Cited By

  • (2024) A Prior Guided Wavelet-Spatial Dual Attention Transformer Framework for Heavy Rain Image Restoration, IEEE Transactions on Multimedia, vol. 26, pp. 7043-7057, doi: 10.1109/TMM.2024.3359480. Online publication date: 29-Jan-2024.
  • (2024) The effect of noise removal filters on classifying different types of medical images, Digital Signal Processing, vol. 153, doi: 10.1016/j.dsp.2024.104613. Online publication date: 1-Oct-2024.
  • (2023) Illumination Guided Attentive Wavelet Network for Low-Light Image Enhancement, IEEE Transactions on Multimedia, vol. 25, pp. 6258-6271, doi: 10.1109/TMM.2022.3207330. Online publication date: 1-Jan-2023.
  • (2023) Divide-and-Conquer Completion Network for Video Inpainting, IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 6, pp. 2753-2766, doi: 10.1109/TCSVT.2022.3225911. Online publication date: 1-Jun-2023.
  • (2023) Dual Wavelet Attention Networks for Image Classification, IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 4, pp. 1899-1910, doi: 10.1109/TCSVT.2022.3218735. Online publication date: 1-Apr-2023.
  • (2023) Adversarial Attacks and Defenses in Machine Learning-Empowered Communication Systems and Networks: A Contemporary Survey, IEEE Communications Surveys & Tutorials, vol. 25, no. 4, pp. 2245-2298, doi: 10.1109/COMST.2023.3319492. Online publication date: 1-Oct-2023.
  • (2023) A test for the absence of aliasing or white noise in two-dimensional locally stationary wavelet processes, Statistics and Computing, vol. 33, no. 5, doi: 10.1007/s11222-023-10269-5. Online publication date: 24-Jul-2023.
  • (2023) WaveCNNs-AT: Wavelet-based deep CNNs of adaptive threshold for signal recognition, Applied Intelligence, vol. 53, no. 23, pp. 28819-28831, doi: 10.1007/s10489-023-05047-9. Online publication date: 1-Dec-2023.
  • (2023) Multiple frequency–spatial network for RGBT tracking in the presence of motion blur, Neural Computing and Applications, vol. 35, no. 34, pp. 24389-24406, doi: 10.1007/s00521-023-09024-8. Online publication date: 1-Dec-2023.
  • (2022) Adder Wavelet for Better Image Classification under Adder Neural Network, in Proc. 2022 6th International Conference on Computer Science and Artificial Intelligence, pp. 63-68, doi: 10.1145/3577530.3577540. Online publication date: 9-Dec-2022.


Published In

IEEE Transactions on Image Processing, Volume 30, 2021, 5053 pages

Publisher

IEEE Press


Qualifiers

  • Research-article


