Connoisseur: classification of styles of Mexican architectural heritage with deep learning and visual attention prediction

Authors:

Abraham Montoya Obeso,

Mireya S. García Vázquez,

Alejandro A. Ramírez Acosta,

Jenny Benois-Pineau

CBMI '17: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing

Article No.: 16, Pages 1 - 7

https://doi.org/10.1145/3095713.3095730

Published: 19 June 2017

Abstract

Automatic description of multimedia content has been developed mainly for classification tasks, retrieval systems, and large-scale organization of data. Preservation of cultural heritage is an important application area for these methods. The problem addressed here is the classification of architectural styles of buildings in digital photographs of Mexican cultural heritage. Selecting the relevant content in a scene when training classification models makes them more precise in the classification task. We use a saliency-driven approach to predict visual attention in images and use the prediction to train a Convolutional Neural Network (CNN) that identifies the architectural style of Mexican buildings. We also analyze the behavior of models trained on traditionally cropped images and on prominence (saliency) maps. We show that the saliency-based CNNs outperform traditionally trained ones, reaching a classification rate of 97% on the validation dataset. Style identification with this technique could contribute substantially to video description tasks, specifically to the automatic documentation of Mexican cultural heritage.
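The idea of selecting relevant content by predicted visual attention can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a saliency map has already been produced by some attention model, and it simply picks the fixed-size crop with the largest total saliency. The function names `most_salient_crop` and `crop_by_saliency` and the crop-size parameters are hypothetical, introduced only for illustration.

```python
import numpy as np


def most_salient_crop(saliency, crop_h, crop_w):
    """Return (top, left) of the crop_h x crop_w window with the largest saliency sum.

    `saliency` is assumed to be a 2-D array from an external attention model.
    """
    saliency = np.asarray(saliency, dtype=np.float64)
    h, w = saliency.shape
    if not (1 <= crop_h <= h and 1 <= crop_w <= w):
        raise ValueError("crop size must fit inside the saliency map")

    # Integral image with a zero row and column prepended,
    # so every window sum needs only four lookups.
    integral = np.zeros((h + 1, w + 1), dtype=np.float64)
    integral[1:, 1:] = saliency.cumsum(axis=0).cumsum(axis=1)

    # sums[i, j] is the total saliency of the window whose top-left corner is (i, j).
    sums = (
        integral[crop_h:, crop_w:]
        - integral[:-crop_h, crop_w:]
        - integral[crop_h:, :-crop_w]
        + integral[:-crop_h, :-crop_w]
    )
    top, left = np.unravel_index(np.argmax(sums), sums.shape)
    return int(top), int(left)


def crop_by_saliency(image, saliency, crop_h, crop_w):
    """Cut the most salient crop_h x crop_w patch out of an H x W (x C) image."""
    top, left = most_salient_crop(saliency, crop_h, crop_w)
    return image[top:top + crop_h, left:left + crop_w]


# Toy example: a single 2 x 2 salient blob in an otherwise empty map.
toy_saliency = np.zeros((8, 10))
toy_saliency[3:5, 6:8] = 1.0
toy_image = np.arange(8 * 10 * 3).reshape(8, 10, 3)
toy_patch = crop_by_saliency(toy_image, toy_saliency, 2, 2)
```

Patches chosen this way could then be fed to a CNN in place of a center crop. Whether this matches the crop-selection strategy evaluated in the paper is an assumption, since the abstract does not describe it in detail.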


Index Terms

1. Applied computing
  1. Arts and humanities
2. Computing methodologies
  1. Artificial intelligence
  2. Machine learning
    1. Machine learning algorithms

Published In

CBMI '17: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing

June 2017

237 pages

ISBN:9781450353335

DOI:10.1145/3095713

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

CBMI '17

CBMI '17: International Workshop on Content-Based Multimedia Indexing

June 19 - 21, 2017

Florence, Italy
