
How to Tell Ancient Signs Apart? Recognizing and Visualizing Maya Glyphs with CNNs

Published: 05 December 2018

Abstract

Thanks to the digital preservation of cultural heritage materials, multimedia tools (e.g., based on automatic visual processing) considerably ease the work of scholars in the humanities and help them perform quantitative analyses of their data. In this context, this article assesses three different Convolutional Neural Network (CNN) architectures along with three learning approaches to train them for hieroglyph classification, a very challenging task due to the limited availability of segmented ancient Maya glyphs. More precisely, the first approach, the baseline, relies on pretrained networks as feature extractors. The second investigates transfer learning by fine-tuning a pretrained network for our glyph classification task. The third trains networks from scratch directly on our glyph data. The merits of three different network architectures are compared: a generic sequential model (i.e., LeNet), a sketch-specific sequential network (i.e., Sketch-a-Net), and the recent Residual Networks. The sketch-specific model trained from scratch outperforms the other models and training strategies. Even on a challenging 150-class classification task, this model achieves 70.3% average accuracy, which is promising given the small amount of cultural heritage shape data. Furthermore, we visualize the discriminative parts of glyphs with the recent Grad-CAM method and demonstrate that the discriminative parts learned by the model generally agree with the expert annotation of glyph specificity (diagnostic features). Finally, as a step toward systematic evaluation of these visualizations, we conduct a perceptual crowdsourcing study. Specifically, we analyze the interpretability of the representations from Sketch-a-Net and ResNet-50. Overall, our article takes two important steps toward providing tools to scholars in the digital humanities: increased performance for automation and improved interpretability of algorithms.
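The Grad-CAM visualization mentioned in the abstract reduces to a short computation once a convolutional layer's activations and the class-score gradients are in hand: each channel gets a weight from its globally averaged gradient, and the heatmap is the ReLU of the weighted sum of feature maps. The sketch below is an illustrative NumPy rendering of that core step; the function name and array shapes are assumptions for illustration, not code from the article.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one conv layer.

    activations: (K, H, W) feature maps A^k of the chosen layer
    gradients:   (K, H, W) gradients dY^c/dA^k of the class score
    returns:     (H, W) heatmap L^c = ReLU(sum_k alpha_k * A^k)
    """
    # alpha_k: global-average-pool the gradients -> one weight per channel
    alpha = gradients.mean(axis=(1, 2))             # shape (K,)
    # Weighted sum over channels, contracting the K axis
    cam = np.tensordot(alpha, activations, axes=1)  # shape (H, W)
    # ReLU keeps only features with a positive influence on the class
    return np.maximum(cam, 0.0)

# Toy usage: 2 channels over a 3x3 spatial grid
acts = np.ones((2, 3, 3))
acts[1] *= 2.0
grads = np.ones((2, 3, 3))
heatmap = grad_cam(acts, grads)  # (3, 3) map of class evidence
```

In a full pipeline the activations and gradients would come from a forward/backward pass through the trained network (e.g., the last conv layer of Sketch-a-Net or ResNet-50), and the heatmap would be upsampled to the glyph image size before overlay.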




Published In

Journal on Computing and Cultural Heritage, Volume 11, Issue 4
December 2018
122 pages
ISSN:1556-4673
EISSN:1556-4711
DOI:10.1145/3293468
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 December 2018
Accepted: 01 May 2018
Revised: 01 May 2018
Received: 01 September 2017
Published in JOCCH Volume 11, Issue 4


Author Tags

  1. Maya glyphs
  2. convolutional neural networks
  3. crowdsourcing
  4. shape recognition
  5. transfer learning

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Hasler Foundation through the DCrowdLens project
  • Swiss National Science Foundation through the MAAYA project

Cited By

  • (2024) The State of Pilot Study Reporting in Crowdsourcing: A Reflection on Best Practices and Guidelines. Proceedings of the ACM on Human-Computer Interaction 8, CSCW1, 1-45. DOI: 10.1145/3641023. Online publication date: 26-Apr-2024.
  • (2024) LanT: Finding experts for digital calligraphy character restoration. Multimedia Tools and Applications 83, 24, 64963-64986. DOI: 10.1007/s11042-023-17844-y. Online publication date: 18-Jan-2024.
  • (2023) Digital Restoration of Cultural Heritage With Data-Driven Computing: A Survey. IEEE Access 11, 53939-53977. DOI: 10.1109/ACCESS.2023.3280639. Online publication date: 2023.
  • (2022) Deep Segmentation of Corrupted Glyphs. Journal on Computing and Cultural Heritage 15, 1, 1-24. DOI: 10.1145/3465629. Online publication date: 22-Jan-2022.
  • (2021) A Few-shot Learning Approach for Historical Ciphered Manuscript Recognition. 2020 25th International Conference on Pattern Recognition (ICPR), 5413-5420. DOI: 10.1109/ICPR48806.2021.9413255. Online publication date: 10-Jan-2021.
  • (2021) A deep neural network based framework for restoring the damaged Persian pottery via digital inpainting. Journal of Computational Science 56, 101486. DOI: 10.1016/j.jocs.2021.101486. Online publication date: Nov-2021.
  • (2021) Decoding the Cauzin Softstrip: A case study in extracting information from old media. Archival Science 21, 3, 281-294. DOI: 10.1007/s10502-021-09358-z. Online publication date: 25-Feb-2021.
  • (2020) Identification of Construction Era for Indian Subcontinent Ancient and Heritage Buildings by Using Deep Learning. Proceedings of Fifth International Congress on Information and Communication Technology, 631-640. DOI: 10.1007/978-981-15-5856-6_64. Online publication date: 22-Oct-2020.
  • (2019) Improved Hieroglyph Representation for Image Retrieval. Journal on Computing and Cultural Heritage 12, 2, 1-15. DOI: 10.1145/3284388. Online publication date: 30-Apr-2019.
  • (2019) Convolutional neural networks for archaeological site detection – Finding “princely” tombs. Journal of Archaeological Science 110, 104998. DOI: 10.1016/j.jas.2019.104998. Online publication date: Oct-2019.
