Text Extraction, Enhancement and OCR in Digital Video

Li, Huiping; Doermann, David; Kia, Omid

doi:10.1007/3-540-48172-9_29

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1655)

Included in the following conference series:

International Workshop on Document Analysis Systems

Abstract

In this paper we address the problem of text extraction, enhancement and recognition in digital video. Compared with optical character recognition (OCR) from document images, text extraction and recognition in digital video present several new challenges. First, the text in video is often embedded in complex backgrounds, making text extraction and separation difficult. Second, image data contained in video frames is often digitized and/or subsampled at a much lower resolution than is typical for document images. As a result, most commercial OCR software cannot recognize text extracted from video. We have implemented a hybrid wavelet/neural network segmenter to extract text regions and use a two-stage enhancement scheme prior to recognition. First, we use Shannon interpolation to raise the image resolution, and second, we postprocess the block with normal/inverse text classification and adaptive thresholding. Experimental results show that our text extraction scheme can extract both scene text and graphical text robustly, and that reasonable OCR results are achieved after enhancement.
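The two-stage enhancement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the interpolation factor, the rule used for normal/inverse classification, or the thresholding formula. The sketch below assumes NumPy, a separable sinc kernel for the Shannon interpolation, a hypothetical border-brightness test for text polarity, and a local mean-plus-scaled-deviation threshold as one common form of adaptive thresholding. All function names and parameter values (`factor`, `window`, `k`) are illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sinc_matrix(n: int, factor: int) -> np.ndarray:
    """Sinc weights mapping n samples onto n * factor finer positions."""
    t = np.arange(n * factor) / factor  # new positions in units of the old grid
    return np.sinc(t[:, None] - np.arange(n)[None, :])


def shannon_upsample(block: np.ndarray, factor: int = 4) -> np.ndarray:
    """Separable sinc (Shannon) interpolation of a 2-D grayscale block."""
    block = block.astype(float)
    mean = block.mean()
    rows = sinc_matrix(block.shape[0], factor)
    cols = sinc_matrix(block.shape[1], factor)
    # Interpolating the zero-mean block reduces ringing at the block edges.
    return rows @ (block - mean) @ cols.T + mean


def is_inverse_text(block: np.ndarray) -> bool:
    """Hypothetical polarity test: border pixels are assumed to be background.

    A border darker than the block average suggests light text on a dark
    background (inverse text).
    """
    border = np.concatenate([block[0], block[-1], block[:, 0], block[:, -1]])
    return bool(border.mean() < block.mean())


def local_threshold(img: np.ndarray, window: int = 15, k: float = -0.2) -> np.ndarray:
    """Mark pixels darker than (local mean + k * local std) as text."""
    r = window // 2
    padded = np.pad(img.astype(float), r, mode="edge")
    win = sliding_window_view(padded, (2 * r + 1, 2 * r + 1))
    return img < win.mean(axis=(2, 3)) + k * win.std(axis=(2, 3))


def enhance(block: np.ndarray, factor: int = 4, window: int = 15, k: float = -0.2) -> np.ndarray:
    """Interpolate first, then normalize polarity and binarize."""
    up = shannon_upsample(block, factor)
    if is_inverse_text(up):
        up = up.max() + up.min() - up  # flip so text is dark on light
    return local_threshold(up, window, k)


if __name__ == "__main__":
    # Synthetic block: a dark vertical stroke on a light background.
    demo = np.full((8, 12), 200.0)
    demo[2:6, 5:7] = 50.0
    mask = enhance(demo, factor=2)
    print(mask.shape, mask.dtype)
```

With an integer factor, the interpolated block reproduces the original samples at every `factor`-th position, so the upsampling adds resolution without altering the measured pixel values. A mean-and-deviation threshold of this kind tends to mark noise in flat background regions as text, so a practical system would likely need additional cleanup before OCR.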

Author information

Authors and Affiliations

Language and Media Processing Laboratory, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742-3275
Huiping Li & David Doermann
Advanced Network Technologies Division, National Institute of Standards and Technology, Gaithersburg, MD 20899
Omid Kia

Editor information

Editors and Affiliations

Korea University, Center for Artificial Vision Research, Anam-dong, Seongbuk-ku, 136-701, Seoul, Korea
Seong-Whan Lee
Shinshu University, Department of Information Engineering, 500 Wakasato, 380-8553, Nagano, Japan
Yasuaki Nakano

About this paper

Cite this paper

Li, H., Doermann, D., Kia, O. (1999). Text Extraction, Enhancement and OCR in Digital Video. In: Lee, SW., Nakano, Y. (eds) Document Analysis Systems: Theory and Practice. DAS 1998. Lecture Notes in Computer Science, vol 1655. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48172-9_29

DOI: https://doi.org/10.1007/3-540-48172-9_29
Published: 13 May 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66507-6
Online ISBN: 978-3-540-48172-0
eBook Packages: Springer Book Archive
