Abstract
This paper presents a projection-based method for segmenting handwritten Chinese characters in form documents with known structures. In the preprocessing phase, a noise removal method is proposed that preserves stroke connections and character edge points. In the character segmentation phase, projection profile analysis is used to segment a text line image into projection blocks. The projection blocks are then classified into one of four types: mark, half-word, single-word, and two-word. Large blocks are split and small blocks are merged. In addition, an OCR system is adopted to eliminate errors resulting from the inappropriate merging of Chinese numerical characters with other characters. In experiments on 1319 Chinese characters, correct segmentation rates of 92.34% and 91.76% were obtained with and without the OCR module, respectively.
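The projection step described in the abstract can be illustrated with a minimal sketch. It assumes a binarized text-line image stored as rows of 0/1 pixels, and the function names, the `min_ink` parameter, and the width cut-off ratios in `classify_block` are hypothetical choices for illustration, not values taken from the paper. The splitting, merging, and OCR stages are not covered.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (start_col, end_col), end exclusive


def vertical_profile(image: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each column of a binary text-line image."""
    if not image:
        return []
    return [sum(col) for col in zip(*image)]


def projection_blocks(profile: List[int], min_ink: int = 1) -> List[Block]:
    """Return maximal runs of columns whose ink count is at least min_ink."""
    blocks: List[Block] = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = x                      # a new block begins here
        elif count < min_ink and start is not None:
            blocks.append((start, x))      # a gap column closes the block
            start = None
    if start is not None:                  # block runs to the right edge
        blocks.append((start, len(profile)))
    return blocks


def classify_block(block: Block, char_width: float) -> str:
    """Label a block by its width relative to an assumed character width.

    The cut-off ratios below are illustrative guesses, not the paper's rules.
    """
    ratio = (block[1] - block[0]) / char_width
    if ratio < 0.25:
        return "mark"
    if ratio < 0.75:
        return "half-word"
    if ratio < 1.5:
        return "single-word"
    return "two-word"
```

In this sketch, a block labeled `two-word` would be a candidate for splitting and a `half-word` block a candidate for merging with a neighbor. How those decisions are actually made is described in the paper body, not here.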
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Chen, JL., Wu, CH., Lee, HJ. (1999). Chinese Handwritten Character Segmentation in Form Documents. In: Lee, SW., Nakano, Y. (eds) Document Analysis Systems: Theory and Practice. DAS 1998. Lecture Notes in Computer Science, vol 1655. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48172-9_28
DOI: https://doi.org/10.1007/3-540-48172-9_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66507-6
Online ISBN: 978-3-540-48172-0
eBook Packages: Springer Book Archive