Abstract
The accurate classification of architectural styles is of great significance to the study of architectural culture and human historical civilization. Models based on convolutional neural networks (CNNs) have achieved highly competitive results in architectural style classification owing to their powerful feature-representation capability. However, most CNN models to date extract only the global features of the building facade or focus on a few regions of the building, failing to capture the spatial features of different components. To improve the accuracy of architectural style classification, we propose a classification method based on a CNN and channel–spatial attention. First, a preprocessing step selects the main-building candidate region in the architectural image, after which a CNN feature extractor performs deep feature extraction. Second, a channel–spatial attention module generates an attention map that not only enhances the texture-feature representation of architectural images but also focuses on the spatial features of different architectural elements. Finally, a softmax classifier predicts the score of each target class. Experimental results on the Architectural Style Dataset and the AHE_Dataset demonstrate satisfactory performance.
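To make the pipeline concrete, the sketch below is a minimal, hypothetical PyTorch rendering of the feature-extraction, attention, and classification stages, assuming a ResNet-50 backbone and a CBAM-style channel–spatial attention module (channel attention followed by spatial attention). The paper's actual backbone, module design, and class count may differ; `StyleClassifier`, `ChannelAttention`, and `SpatialAttention` are illustrative names, not the authors' code, and the main-building candidate-region preprocessing step is omitted here.

```python
# Minimal sketch of a CNN + channel-spatial attention classifier,
# assuming a CBAM-style module; not the paper's exact implementation.
import torch
import torch.nn as nn
import torchvision.models as models


class ChannelAttention(nn.Module):
    """Channel attention: reweight each feature channel by its importance."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w


class SpatialAttention(nn.Module):
    """Spatial attention: reweight each spatial location of the feature map."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)    # pool across channels
        mx = x.amax(dim=1, keepdim=True)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w


class StyleClassifier(nn.Module):
    """CNN backbone + channel-spatial attention + softmax classifier."""

    # 25 classes assumed, matching the Architectural Style Dataset.
    def __init__(self, num_classes: int = 25):
        super().__init__()
        backbone = models.resnet50(weights=None)  # torchvision >= 0.13 API
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.channel_att = ChannelAttention(2048)
        self.spatial_att = SpatialAttention()
        self.head = nn.Linear(2048, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.features(x)                       # (B, 2048, H/32, W/32)
        f = self.spatial_att(self.channel_att(f))  # attention-refined features
        f = f.mean(dim=(2, 3))                     # global average pooling
        return self.head(f)                        # logits; softmax at inference


if __name__ == "__main__":
    model = StyleClassifier()
    logits = model(torch.randn(2, 3, 224, 224))
    probs = logits.softmax(dim=1)  # per-class scores
    print(probs.shape)             # torch.Size([2, 25])
```

Applying channel attention before spatial attention mirrors the CBAM ordering: channels encoding style-relevant textures are emphasized first, and the spatial map then highlights where distinctive architectural elements sit on the facade.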
Acknowledgements
This work was supported by the Natural Science Foundation of Shanxi Province, China (Grant No. 202103021224285) and the Key Scientific and Technological Innovation Team of Shanxi Province for Big Data Analysis and Parallel Computing, China (Grant No. 201805D131007).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Wang, B., Zhang, S., Zhang, J. et al. Architectural style classification based on CNN and channel–spatial attention. SIViP 17, 99–107 (2023). https://doi.org/10.1007/s11760-022-02208-0