
How to Tell Ancient Signs Apart? Recognizing and Visualizing Maya Glyphs with CNNs

Published: 05 December 2018

Abstract

Thanks to the digital preservation of cultural heritage materials, multimedia tools (e.g., based on automatic visual processing) considerably ease the work of scholars in the humanities and help them perform quantitative analyses of their data. In this context, this article assesses three different Convolutional Neural Network (CNN) architectures along with three learning approaches to train them for hieroglyph classification, a very challenging task due to the limited availability of segmented ancient Maya glyphs. More precisely, the first approach, the baseline, relies on pretrained networks as feature extractors. The second investigates transfer learning by fine-tuning a pretrained network for our glyph classification task. The third trains networks from scratch directly on our glyph data. The merits of three different network architectures are compared: a generic sequential model (i.e., LeNet), a sketch-specific sequential network (i.e., Sketch-a-Net), and the recent Residual Networks. The sketch-specific model trained from scratch outperforms the other models and training strategies. Even on a challenging 150-class classification task, this model achieves 70.3% average accuracy, which is promising given the small amount of cultural heritage shape data. Furthermore, we visualize the discriminative parts of glyphs with the recent Grad-CAM method and demonstrate that the discriminative parts learned by the model generally agree with the expert annotation of glyph specificity (diagnostic features). Finally, as a step toward systematic evaluation of these visualizations, we conduct a perceptual crowdsourcing study. Specifically, we analyze the interpretability of the representations from Sketch-a-Net and ResNet-50. Overall, our article takes two important steps toward providing tools to scholars in the digital humanities: increased performance for automation and improved interpretability of algorithms.
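The Grad-CAM visualization mentioned in the abstract reduces to a short computation once a convolutional layer's activations and the class-score gradients are in hand: each channel gets a weight from its globally averaged gradient, and the heatmap is the ReLU of the weighted sum of feature maps. The sketch below is an illustrative NumPy rendering of that core step; the function name and array shapes are assumptions for illustration, not code from the article.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one conv layer.

    activations: (K, H, W) feature maps A^k of the chosen layer
    gradients:   (K, H, W) gradients dY^c/dA^k of the class score
    returns:     (H, W) heatmap L^c = ReLU(sum_k alpha_k * A^k)
    """
    # alpha_k: global-average-pool the gradients -> one weight per channel
    alpha = gradients.mean(axis=(1, 2))             # shape (K,)
    # Weighted sum over channels, contracting the K axis
    cam = np.tensordot(alpha, activations, axes=1)  # shape (H, W)
    # ReLU keeps only features with a positive influence on the class
    return np.maximum(cam, 0.0)

# Toy usage: 2 channels over a 3x3 spatial grid
acts = np.ones((2, 3, 3))
acts[1] *= 2.0
grads = np.ones((2, 3, 3))
heatmap = grad_cam(acts, grads)  # (3, 3) map of class evidence
```

In a full pipeline the activations and gradients would come from a forward/backward pass through the trained network (e.g., the last conv layer of Sketch-a-Net or ResNet-50), and the heatmap would be upsampled to the glyph image size before overlay.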




Published In

Journal on Computing and Cultural Heritage, Volume 11, Issue 4
December 2018
122 pages
ISSN:1556-4673
EISSN:1556-4711
DOI:10.1145/3293468
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 December 2018
Accepted: 01 May 2018
Revised: 01 May 2018
Received: 01 September 2017
Published in JOCCH Volume 11, Issue 4


Author Tags

  1. Maya glyphs
  2. convolutional neural networks
  3. crowdsourcing
  4. shape recognition
  5. transfer learning

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Hasler Foundation through the DCrowdLens project
  • Swiss National Science Foundation through the MAAYA project

Cited By

  • (2024) The State of Pilot Study Reporting in Crowdsourcing: A Reflection on Best Practices and Guidelines. Proceedings of the ACM on Human-Computer Interaction 8, CSCW1, 1-45. DOI: 10.1145/3641023. Online publication date: 26-Apr-2024.
  • (2024) LanT: Finding experts for digital calligraphy character restoration. Multimedia Tools and Applications 83, 24, 64963-64986. DOI: 10.1007/s11042-023-17844-y. Online publication date: 18-Jan-2024.
  • (2023) Digital Restoration of Cultural Heritage With Data-Driven Computing: A Survey. IEEE Access 11, 53939-53977. DOI: 10.1109/ACCESS.2023.3280639. Online publication date: 2023.
  • (2022) Deep Segmentation of Corrupted Glyphs. Journal on Computing and Cultural Heritage 15, 1, 1-24. DOI: 10.1145/3465629. Online publication date: 22-Jan-2022.
  • (2021) A Few-shot Learning Approach for Historical Ciphered Manuscript Recognition. 2020 25th International Conference on Pattern Recognition (ICPR), 5413-5420. DOI: 10.1109/ICPR48806.2021.9413255. Online publication date: 10-Jan-2021.
  • (2021) A deep neural network based framework for restoring the damaged Persian pottery via digital inpainting. Journal of Computational Science 56, 101486. DOI: 10.1016/j.jocs.2021.101486. Online publication date: Nov-2021.
  • (2021) Decoding the Cauzin Softstrip: A case study in extracting information from old media. Archival Science 21, 3, 281-294. DOI: 10.1007/s10502-021-09358-z. Online publication date: 25-Feb-2021.
  • (2020) Identification of Construction Era for Indian Subcontinent Ancient and Heritage Buildings by Using Deep Learning. Proceedings of Fifth International Congress on Information and Communication Technology, 631-640. DOI: 10.1007/978-981-15-5856-6_64. Online publication date: 22-Oct-2020.
  • (2019) Improved Hieroglyph Representation for Image Retrieval. Journal on Computing and Cultural Heritage 12, 2, 1-15. DOI: 10.1145/3284388. Online publication date: 30-Apr-2019.
  • (2019) Convolutional neural networks for archaeological site detection – Finding “princely” tombs. Journal of Archaeological Science 110, 104998. DOI: 10.1016/j.jas.2019.104998. Online publication date: Oct-2019.
