DeepFont: Identify Your Font from An Image

Authors: Zhangyang Wang, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang

MM '15: Proceedings of the 23rd ACM international conference on Multimedia

Pages 451 - 459

https://doi.org/10.1145/2733373.2806219

Published: 13 October 2015

Abstract

Font is one of the core elements of design, and automatic font identification and similar-font suggestion from an image or photo have long been on the wish list of many designers. We study the Visual Font Recognition (VFR) problem [4] and advance the state of the art remarkably by developing the DeepFont system. First, we build the first available large-scale VFR dataset, named AdobeVFR, consisting of both labeled synthetic data and partially labeled real-world data. Next, to combat the domain mismatch between the available training and testing data, we introduce a Convolutional Neural Network (CNN) decomposition approach, using a domain adaptation technique based on a Stacked Convolutional Auto-Encoder (SCAE) that exploits a large corpus of unlabeled real-world text images combined with synthetic data preprocessed in a specific way. Moreover, we study a novel learning-based model compression approach that reduces the DeepFont model size without sacrificing its performance. The DeepFont system achieves a top-5 accuracy higher than 80% on our collected dataset and also produces a good font similarity measure for font selection and suggestion. We also compress the model by around 6 times without any visible loss of recognition accuracy.
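The abstract reports its headline result as top-5 accuracy. As a minimal sketch of how that metric is commonly computed (not the authors' evaluation code), the Python below counts a prediction as correct when the true font class is among the five highest-scoring classes. The function name `top_k_accuracy`, the toy scores, and the six-class setup are assumptions made for illustration only.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered by descending score; only the first k count.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data (hypothetical): 3 text images scored against 6 font classes.
toy_scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.10],  # true class 2 is ranked first
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],  # true class 3 is ranked fourth
    [0.02, 0.03, 0.05, 0.10, 0.30, 0.50],  # true class 0 is ranked last
]
toy_labels = [2, 3, 0]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)  # 1 of 3 samples correct
top5 = top_k_accuracy(toy_scores, toy_labels, k=5)  # 2 of 3 samples correct
```

Top-5 is a natural fit for font suggestion, where a designer can pick from a short list of candidates. On the toy data, the second sample is a miss under top-1 but a hit under top-5.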

References

[1] C. Avilés-Cruz, R. Rangel-Kuoppa, M. Reyes-Ayala, A. Andrade-Gonzalez, and R. Escarela-Perez. High-order statistical texture analysis: font recognition applied. PRL, 26(2):135--145, 2005.

[2] Y. Bengio. Learning deep architectures for AI. Foundations and Trends® in Machine Learning, 2(1):1--127, 2009.

[3] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy layer-wise training of deep networks. NIPS, 19:153, 2007.

[4] G. Chen, J. Yang, H. Jin, J. Brandt, E. Shechtman, A. Agarwala, and T. X. Han. Large-scale visual font recognition. In CVPR, pages 3598--3605. IEEE, 2014.

[5] M. Denil, B. Shakibi, L. Dinh, N. de Freitas, et al. Predicting parameters in deep learning. In NIPS, pages 2148--2156, 2013.

[6] E. L. Denton, W. Zaremba, J. Bruna, Y. LeCun, and R. Fergus. Exploiting linear structure within convolutional networks for efficient evaluation. In NIPS, pages 1269--1277, 2014.

[7] X. Glorot, A. Bordes, and Y. Bengio. Domain adaptation for large-scale sentiment classification: A deep learning approach. In ICML, 2011.

[8] Y. Gong, L. Liu, M. Yang, and L. Bourdev. Compressing deep convolutional networks using vector quantization. arXiv preprint arXiv:1412.6115, 2014.

[9] M.-C. Jung, Y.-C. Shin, and S. N. Srihari. Multifont classification using typographical attributes. In ICDAR, pages 353--356. IEEE, 1999.

[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1097--1105, 2012.

[11] Z. Lin, M. Chen, and Y. Ma. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv preprint arXiv:1009.5055, 2010.

[12] H. Ma and D. Doermann. Gabor filter based multi-class classifier for scanned document images. In ICDAR, volume 2, pages 968--968. IEEE, 2003.

[13] J. Masci, U. Meier, D. Cireşan, and J. Schmidhuber. Stacked convolutional auto-encoders for hierarchical feature extraction. In ICANN, pages 52--59. 2011.

[14] P. O'Donovan, J. Lībeks, A. Agarwala, and A. Hertzmann. Exploratory font selection using crowdsourced attributes. ACM TOG, 33(4):92, 2014.

[15] R. Raina, A. Battle, H. Lee, B. Packer, and A. Y. Ng. Self-taught learning: transfer learning from unlabeled data. In ICML, pages 759--766. ACM, 2007.

[16] R. Ramanathan, K. Soman, L. Thaneshwaran, V. Viknesh, T. Arunkumar, and P. Yuvaraj. A novel technique for English font recognition using support vector machines. In ARTCom, pages 766--769, 2009.

[17] H.-M. Sun. Multi-linguistic optical font recognition using stroke templates. In ICPR, volume 2, pages 889--892. IEEE, 2006.

[18] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol. Extracting and composing robust features with denoising autoencoders. In ICML, pages 1096--1103. ACM, 2008.

[19] T. Wang, D. J. Wu, A. Coates, and A. Y. Ng. End-to-end text recognition with convolutional neural networks. In ICPR, pages 3304--3308. IEEE, 2012.

[20] Y. Zhu, T. Tan, and Y. Wang. Font recognition based on global texture analysis. IEEE TPAMI, 2001.

Cited By

Yuan J, Chen S, Mo B, Ma Y, Zheng W, Zhang C (2024). R-GNN: recurrent graph neural networks for font classification of oracle bone inscriptions. Heritage Science, 12:1. Online publication date: 29-Jan-2024.
https://doi.org/10.1186/s40494-024-01133-4
Bernal E, Sharma R, Yenneti S, Mackey I, Malave J, Walvoord D, Brower B (2024). Task-oriented synthetic-to-real image translation for data-efficient learning. Synthetic Data for Artificial Intelligence and Machine Learning: Tools, Techniques, and Applications II, (32). Online publication date: 7-Jun-2024.
https://doi.org/10.1117/12.3013814
Tatsukawa Y, Shen I, Qi A, Koyama Y, Igarashi T, Shamir A (2024). FontCLIP: A Semantic Typography Visual‐Language Model for Multilingual Font Applications. Computer Graphics Forum, 43:2. Online publication date: 30-Apr-2024.
https://doi.org/10.1111/cgf.15043

Index Terms

DeepFont: Identify Your Font from An Image
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations

Published In

MM '15: Proceedings of the 23rd ACM international conference on Multimedia

October 2015

1402 pages

ISBN: 9781450334594

DOI: 10.1145/2733373

General Chairs: Xiaofang Zhou (The University of Queensland, Australia), Alan F. Smeaton (Dublin City University, Ireland), Qi Tian (The University of Texas at San Antonio, USA)

Program Chairs: Dick C.A. Bulterman (FXPAL, USA), Heng Tao Shen (The University of Queensland, Australia), Ketan Mayer-Patel (The University of North Carolina, USA), Shuicheng Yan (National University of Singapore, Singapore)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MM '15

Sponsor:

SIGMM

MM '15: ACM Multimedia Conference

October 26 - 30, 2015

Brisbane, Australia

Acceptance Rates

MM '15 Paper Acceptance Rate: 56 of 252 submissions, 22%

Overall Acceptance Rate: 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 83

Total Downloads: 1,254

Downloads (Last 12 months): 84

Downloads (Last 6 weeks): 13

Reflects downloads up to 25 Dec 2024

Citations

Cited By

Zhu C, Lu G, Chen H, Feng D, Wang S, Zhao Y, Xie R, Song L (2024). A Character Position-Aware Compression Framework for Screen Text Image. IEEE Transactions on Circuits and Systems for Video Technology, 34:9 (8821-8835). Online publication date: Sep-2024.
https://doi.org/10.1109/TCSVT.2024.3379675
Tonmoy M, Rakib A, Rahman R, Adnan M, Mridha M, Huang J, Shin J (2024). A Lightweight Visual Font Style Recognition With Quantized Convolutional Autoencoder. IEEE Open Journal of the Computer Society, 5 (120-130). Online publication date: 2024.
https://doi.org/10.1109/OJCS.2024.3378709
Tonmoy M, Adnan M, Saha A, Mridha M, Dey N (2024). Descriptor: Multilingual Visual Font Recognition Dataset. IEEE Data Descriptions, 1 (8-12). Online publication date: 2024.
https://doi.org/10.1109/IEEEDATA.2024.3460768
Sowmyayani S, Maheswari S, Sangeetha S, Mathivanan S, Balusamy B, Gite S, Deshpande N (2024). Automated Detection and Classification of Motorcycle Number Plate Formats to Improve Road Safety. 2024 International Conference on Electrical Electronics and Computing Technologies (ICEECT), (1-7). Online publication date: 29-Aug-2024.
https://doi.org/10.1109/ICEECT61758.2024.10739182
Murugan V, Sowmyayani S, Kavitha J, Meenakshi S (2024). AI Driven Smart Number Plate Identification for Automatic Identification. 2024 IEEE International Conference on Computing, Power and Communication Technologies (IC2PCT), (1193-1197). Online publication date: 9-Feb-2024.
https://doi.org/10.1109/IC2PCT60090.2024.10486444
Yan F, Zhang H (2024). SMFNet: One-Shot Recognition of Chinese Character Font Based on Siamese Metric Model. IEEE Access, 12 (38473-38489). Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3370574
Lim S, Lee Y, Song Y (2024). Definition and Automatic Extraction Performance Analysis of Stroke Elements in the English Alphabet. IEEE Access, 12 (18931-18938). Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3360482
