
Structure recognition methods for various types of documents

Machine Vision and Applications

Abstract

In this paper, we describe experimental methods for recognizing the structures of various types of documents within the framework of document understanding. Specifically, we interpret document structures using document knowledge characterized individually for each document type. The document understanding process is divided into three procedures: the first is the recognition of document structures from a two-dimensional point of view; the second is the recognition of item relationships from a one-dimensional point of view; and the third is the recognition of characters from a zero-dimensional point of view. The procedure for recognizing structures plays the most important role in document understanding: it extracts the logical item blocks from paper-based documents and classifies them distinctly.

We discuss the structure recognition methods for three classes of documents: 1) table-form documents, filled-in forms, cataloging lists, etc., in which each item block is surrounded by horizontal and vertical line segments; 2) library cataloging cards, name cards, letters, etc., in which each item block is separated by spaces; 3) newspapers, pamphlets, etc., in which item blocks are constructed hierarchically and combined under roughly specified layouts. The structure recognition procedure is characterized by an individual recognition method for each class: in class 1 documents, binary trees indicating the connective relationships among neighboring item blocks, which are surrounded by line segments; in class 2 documents, binary trees defining the spatial and geometric relationships among neighboring item blocks, which are separated by spaces; and in class 3 documents, composition rules specifying the constructive relationships among neighboring item blocks, which are represented by adjacent relationship graphs. The methods are effective under the knowledge-based framework and integrate the top-down (model-driven) and bottom-up (data-driven) approaches complementarily. The means of integration vary according to the document class.
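As a rough illustration of how a binary tree can encode relationships among neighboring item blocks, the sketch below assumes a convention that the abstract does not specify: each node's first child is the block to its right and its second child is the block below it. The `ItemBlock` class, its fields, and the `preorder` helper are hypothetical names chosen for this example, not the authors' representation.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class ItemBlock:
    """A rectangular item block with a hypothetical logical label."""
    label: str
    x: int
    y: int
    width: int
    height: int
    # Assumed convention (not from the paper): one child per neighbor direction.
    right: Optional["ItemBlock"] = None  # neighboring block to the right
    below: Optional["ItemBlock"] = None  # neighboring block underneath


def preorder(block: Optional[ItemBlock]) -> Iterator[ItemBlock]:
    """Yield the block, then its right subtree, then its below subtree."""
    if block is None:
        return
    yield block
    yield from preorder(block.right)
    yield from preorder(block.below)


# A tiny form: "name" and "date" side by side, "address" underneath "name".
date = ItemBlock("date", x=100, y=0, width=60, height=20)
address = ItemBlock("address", x=0, y=20, width=160, height=20)
name = ItemBlock("name", x=0, y=0, width=100, height=20,
                 right=date, below=address)

labels = [b.label for b in preorder(name)]
print(labels)  # ['name', 'date', 'address']
```

Under this assumed convention, a model tree built from document knowledge and a tree built from the detected blocks could be compared node by node, which is one plausible way to combine model-driven and data-driven processing.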




Cite this article

Watanabe, T., Luo, Q. & Sugie, N. Structure recognition methods for various types of documents. Machine Vis. Apps. 6, 163–176 (1993). https://doi.org/10.1007/BF01211939


  • DOI: https://doi.org/10.1007/BF01211939
