In this chapter, we discuss the use of Self-Organizing Maps (SOMs) for various tasks in Document Image Analysis. The SOM is a type of artificial neural network that, during learning, computes an unsupervised clustering of the input data and arranges the cluster centers in a lattice. After an overview of previous applications of unsupervised learning in document image analysis, we present our recent work in the field. We describe the use of the SOM at three processing levels: character clustering, word clustering, and layout clustering, with applications to word retrieval, document retrieval, and page classification. To improve clustering effectiveness on small training sets, we propose an extension of the SOM training algorithm that uses the tangent distance to make the SOM more robust to small transformations of the patterns. Experiments with this extended training algorithm are reported for both character and page layout clustering.
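As a rough illustration of the two ideas named in the abstract, the sketch below trains a small rectangular SOM with plain Euclidean matching and, separately, computes a one-sided tangent distance by least squares. It is a minimal sketch under stated assumptions, not the extended training algorithm of the chapter: the function names, the lattice size, the linear decay schedules, and the Gaussian neighbourhood are all illustrative choices, and the tangent vectors are assumed to be supplied by the caller.

```python
import numpy as np


def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rectangular SOM with Euclidean matching (illustrative sketch).

    All parameter names and default values are assumptions made for this example.
    """
    rng = np.random.default_rng(seed)
    n, _ = data.shape
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # Lattice coordinates of every unit, shape (rows * cols, 2).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise the cluster centers from randomly chosen training samples.
    weights = data[rng.integers(0, n, size=rows * cols)].astype(float)
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3  # shrinking neighbourhood radius
            # Best matching unit: the center closest to the current sample.
            bmu = np.argmin(np.sum((weights - data[i]) ** 2, axis=1))
            # Gaussian neighbourhood measured on the lattice, not in input space.
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            # Move the BMU and its lattice neighbours toward the sample.
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights, grid


def best_matching_unit(weights, x):
    """Index of the SOM unit whose center is closest to x."""
    return int(np.argmin(np.sum((weights - x) ** 2, axis=1)))


def one_sided_tangent_distance(x, prototype, tangents):
    """Distance from x to the tangent plane at the prototype.

    Solves min over a of ||prototype + tangents @ a - x||, where the columns of
    `tangents` (shape (dim, k)) are assumed to approximate small
    transformations of the prototype.
    """
    coeffs, *_ = np.linalg.lstsq(tangents, x - prototype, rcond=None)
    residual = x - prototype - tangents @ coeffs
    return float(np.linalg.norm(residual))
```

One way to combine the two pieces, again only as an assumption, would be to replace the squared Euclidean distance in the best-matching-unit search with `one_sided_tangent_distance`, so that a pattern and a slightly transformed copy of it tend to select the same unit.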
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Marinai, S., Marino, E., Soda, G. (2008). Self-Organizing Maps for Clustering in Document Image Analysis. In: Marinai, S., Fujisawa, H. (eds) Machine Learning in Document Analysis and Recognition. Studies in Computational Intelligence, vol 90. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76280-5_8
DOI: https://doi.org/10.1007/978-3-540-76280-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76279-9
Online ISBN: 978-3-540-76280-5
eBook Packages: Engineering, Engineering (R0)