Abstract
There are two main views on why Convolutional Neural Networks (CNNs) perform so well at image classification: shape bias and texture bias. Which view holds is critical to the causality and reliability of CNN models in real applications. In this work, we explore the behavior of CNNs and attempt to reconcile the two contradictory hypotheses through a multi-view image representation. First, we assume an image is generated from an object shape representation, an object texture representation, and background information. Second, we segment and recombine the object shape, texture, and image background via two losses: an image reconstruction loss and a feature discrepancy loss. Finally, the classification loss combines the shape, texture, and background contributions, weighted by the multi-view features. Comprehensive experiments on real-world datasets show that, first, CNNs have no inherent texture or shape bias; the bias changes with the internal bias of the data. Second, CNNs learn knowledge in a lazy way, i.e., high-level knowledge is learned only if low-level knowledge does not satisfy the task requirements. Our findings may benefit the interpretability of CNNs and provide insight for more robust designs.
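The abstract describes an objective built from three parts: an image reconstruction loss, a feature discrepancy loss between the view branches, and a classification loss weighted per view. A minimal sketch of how such a combined objective could be assembled is shown below; all names (`reconstruction_loss`, `w_shape`, etc.) and the concrete formulas (MSE, cosine similarity) are illustrative assumptions, not the paper's actual definitions.

```python
import numpy as np

def reconstruction_loss(image, recombined):
    # Pixel-wise MSE between the input image and the image
    # re-assembled from the shape, texture and background parts.
    return float(np.mean((image - recombined) ** 2))

def feature_discrepancy_loss(f_shape, f_texture):
    # Penalize similarity between branch features so that shape and
    # texture encode different information (cosine similarity here).
    cos = np.dot(f_shape, f_texture) / (
        np.linalg.norm(f_shape) * np.linalg.norm(f_texture))
    return float(cos)

def weighted_classification_loss(losses, weights):
    # Per-view classification losses combined with per-view weights,
    # e.g. losses = {"shape": ..., "texture": ..., "background": ...}.
    total_w = sum(weights.values())
    return sum(weights[k] * losses[k] for k in losses) / total_w

# Toy forward pass with random stand-ins for the network outputs.
rng = np.random.default_rng(0)
img = rng.random((8, 8))
rec = img + 0.01  # slightly imperfect reconstruction
total = (reconstruction_loss(img, rec)
         + feature_discrepancy_loss(rng.random(4), rng.random(4))
         + weighted_classification_loss(
             {"shape": 0.5, "texture": 0.7, "background": 0.9},
             {"shape": 0.4, "texture": 0.4, "background": 0.2}))
print(total)
```

In a real training loop the three terms would of course be differentiable tensor operations and the per-view weights could themselves be learned from the multi-view features.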
Acknowledgments
This paper is supported by National Key Research and Development Program of China under grant No. 2018YFB0204403, No. 2017YFB1401202 and No. 2018YFB1003500.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Kong, L., Wang, J., Huang, Z., Xiao, J. (2021). A Competition of Shape and Texture Bias by Multi-view Image Representation. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science(), vol 13022. Springer, Cham. https://doi.org/10.1007/978-3-030-88013-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88012-5
Online ISBN: 978-3-030-88013-2