Abstract.
In this paper, we present a logical representation of form documents for use in identification and retrieval. We propose a hierarchical representation of a form's structure, built from its lines using the XY-tree approach. The approach is top-down and uses no domain knowledge, such as preprinted or filled-in data. The representation tolerates geometric modifications and slight variations, so logically identical forms are associated with the same or a similar hierarchical structure. Identification and retrieval of similar forms are performed by computing the edit distances between the generated trees.
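As a rough illustration of the idea, the following Python sketch builds an XY-tree from ruling lines and compares two trees with an edit distance. It is not the authors' algorithm, since the abstract does not specify the segmentation rules or the distance. The sketch assumes that lines are already extracted as axis-aligned segments, that a region is cut only by lines spanning its full width or height (within a hypothetical tolerance `tol`), and that trees are compared with a simple top-down (Selkow-style) edit distance with unit costs. All names (`Node`, `xy_tree`, `tree_distance`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    # "H": region cut by horizontal lines, "V": cut by vertical lines, "leaf": uncut cell.
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def xy_tree(box: Tuple[int, int, int, int], h_lines, v_lines, tol: int = 2) -> Node:
    """Top-down XY cut. box = (x0, y0, x1, y1);
    h_lines = [(y, x_start, x_end)], v_lines = [(x, y_start, y_end)]."""
    x0, y0, x1, y1 = box
    # Horizontal lines strictly inside the box that span its full width.
    ys = sorted({y for y, a, b in h_lines
                 if y0 + tol < y < y1 - tol and a <= x0 + tol and b >= x1 - tol})
    if ys:
        edges = [y0] + ys + [y1]
        return Node("H", [xy_tree((x0, a, x1, b), h_lines, v_lines, tol)
                          for a, b in zip(edges, edges[1:])])
    # Otherwise, vertical lines strictly inside the box that span its full height.
    xs = sorted({x for x, a, b in v_lines
                 if x0 + tol < x < x1 - tol and a <= y0 + tol and b >= y1 - tol})
    if xs:
        edges = [x0] + xs + [x1]
        return Node("V", [xy_tree((a, y0, b, y1), h_lines, v_lines, tol)
                          for a, b in zip(edges, edges[1:])])
    return Node("leaf")


def tree_distance(a: Node, b: Node) -> int:
    """Top-down ordered tree edit distance (assumed unit costs):
    relabel a root = 1, insert or delete a whole subtree = its node count."""
    cost = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),   # delete subtree of a
                d[i][j - 1] + size(b.children[j - 1]),   # insert subtree of b
                d[i - 1][j - 1] + tree_distance(a.children[i - 1], b.children[j - 1]),
            )
    return cost + d[m][n]


# Form A: a header row above a body split into two columns.
form_a = xy_tree((0, 0, 100, 100), [(20, 0, 100)], [(50, 20, 100)])
# Form B: the same layout, scaled and slightly shifted.
form_b = xy_tree((0, 0, 200, 200), [(41, 0, 200)], [(99, 41, 200)])
# Form C: like A, with an extra footer row.
form_c = xy_tree((0, 0, 100, 100), [(20, 0, 100), (60, 0, 100)], [(50, 20, 60)])

print(tree_distance(form_a, form_b))  # 0: same logical structure
print(tree_distance(form_a, form_c))  # 1: one extra leaf cell
```

Because the tree records only the order and nesting of the cuts, not their coordinates, the scaled form B maps to the same tree as form A, while the extra footer in form C costs a single leaf insertion.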
Additional information
Received: August 21, 2001 / Accepted: November 5, 2001
Cite this article
Duygulu, P., Atalay, V. A hierarchical representation of form documents for identification and retrieval. IJDAR 5, 17–27 (2002). https://doi.org/10.1007/s100320100077
DOI: https://doi.org/10.1007/s100320100077