Abstract
This paper proposes a new approach to estimating document states, such as interline spacing and text line orientation, which facilitates a number of tasks in document image processing. The proposed method applies to spatially varying states as well as invariant ones, so it can also handle general cases such as images with complex layouts, camera-captured images, and handwritten documents. Specifically, we find connected components (CCs) in a document image and assign a state to each of them. The states of the CCs are then estimated in an energy minimization framework, where the cost function is designed based on frequency-domain analysis and minimized via graph cuts. Using the estimated states, we also develop a new algorithm that performs text block identification and text line extraction. Roughly speaking, we segment an image into text blocks by cutting the connections among CCs that are long compared to the estimated interline spacing, and we group the CCs into text lines by bottom-up grouping along the estimated text line orientation. Experimental results on a variety of document images show that our method is efficient and provides promising results in several document image processing tasks.
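The text block identification step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each CC is reduced to a centroid and already carries an estimated interline spacing, it links every pair of CCs rather than using a neighbourhood graph, and the function name `group_into_blocks` and the threshold `factor` are hypothetical choices made only for illustration.

```python
from math import hypot


def group_into_blocks(centroids, spacing, factor=2.0):
    """Label CCs by text block, dropping links that are long relative to spacing.

    centroids: list of (x, y) CC centroids.
    spacing:   per-CC estimated interline spacing (same length as centroids).
    factor:    hypothetical threshold multiplier, not a value from the paper.
    """
    n = len(centroids)
    parent = list(range(n))  # union-find forest over CC indices

    def find(i):
        # Find the root of i, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            dist = hypot(centroids[i][0] - centroids[j][0],
                         centroids[i][1] - centroids[j][1])
            # Keep the link only if it is short compared to the local spacing.
            if dist <= factor * min(spacing[i], spacing[j]):
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Relabel roots as consecutive block ids in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Two clusters of CCs separated by a gap much larger than the spacing.
centroids = [(0, 0), (10, 0), (0, 12), (200, 0), (210, 0)]
spacing = [12.0] * 5
print(group_into_blocks(centroids, spacing))  # prints [0, 0, 0, 1, 1]
```

Because the threshold is taken per pair from the local spacing, the same rule can in principle cope with spacing that varies across the page, which is the situation the abstract targets.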
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koo, H.I., Cho, N.I. (2010). State Estimation in a Document Image and Its Application in Text Block Identification and Text Line Extraction. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15552-9_31
DOI: https://doi.org/10.1007/978-3-642-15552-9_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15551-2
Online ISBN: 978-3-642-15552-9
eBook Packages: Computer Science (R0)