Self-Organizing Maps for Clustering in Document Image Analysis

  • Chapter
Machine Learning in Document Analysis and Recognition

Part of the book series: Studies in Computational Intelligence (SCI, volume 90)

In this chapter, we discuss the use of Self-Organizing Maps (SOMs) for various tasks in Document Image Analysis. The SOM is a type of artificial neural network that, during learning, computes an unsupervised clustering of the input data and arranges the cluster centers in a lattice. After an overview of previous applications of unsupervised learning in document image analysis, we present our recent work in the field. We describe the use of the SOM at three processing levels: character clustering, word clustering, and layout clustering, with applications to word retrieval, document retrieval, and page classification. To improve clustering effectiveness when dealing with small training sets, we propose an extension of the SOM training algorithm that uses the tangent distance to make the SOM more robust to small transformations of the patterns. Experiments with this extended training algorithm are reported for both character and page layout clustering.
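
To make the idea of "cluster centers arranged in a lattice" concrete, the following is a minimal sketch of standard SOM training in plain Python. It is not the chapter's implementation: the function names (`train_som`, `best_matching_unit`), the linear decay schedules, the Gaussian neighborhood, and all default parameter values are assumptions chosen for illustration, and it uses the ordinary Euclidean distance rather than the tangent-distance extension proposed in the chapter.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the lattice coordinate whose prototype is closest to x
    (squared Euclidean distance; the chapter's tangent distance is NOT used here)."""
    return min(weights,
               key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length float vectors.

    All hyperparameters and decay schedules are illustrative assumptions.
    Returns a dict mapping (row, col) lattice coordinates to prototype vectors.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # One prototype (cluster center) per lattice node, randomly initialized.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)
        for x in samples:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighborhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Distance is measured on the lattice, not in input space, so
                # neighboring nodes end up with similar prototypes.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


if __name__ == "__main__":
    # Toy data: two well-separated 2-D blobs standing in for feature vectors.
    rng = random.Random(1)
    blob_a = [[0.1 + rng.uniform(-0.05, 0.05), 0.1 + rng.uniform(-0.05, 0.05)]
              for _ in range(20)]
    blob_b = [[0.9 + rng.uniform(-0.05, 0.05), 0.9 + rng.uniform(-0.05, 0.05)]
              for _ in range(20)]
    som = train_som(blob_a + blob_b, rows=3, cols=3)
    print("node for blob A:", best_matching_unit(som, [0.1, 0.1]))
    print("node for blob B:", best_matching_unit(som, [0.9, 0.9]))
```

After training, each input is assigned to the lattice node returned by `best_matching_unit`, which acts as its cluster label. In a setting like the one the abstract describes, the input vectors would be features of characters, words, or page layouts, and the Euclidean comparison inside `best_matching_unit` is the place where a transformation-tolerant distance could be substituted.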

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Marinai, S., Marino, E., Soda, G. (2008). Self-Organizing Maps for Clustering in Document Image Analysis. In: Marinai, S., Fujisawa, H. (eds) Machine Learning in Document Analysis and Recognition. Studies in Computational Intelligence, vol 90. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76280-5_8

  • DOI: https://doi.org/10.1007/978-3-540-76280-5_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76279-9

  • Online ISBN: 978-3-540-76280-5

  • eBook Packages: Engineering, Engineering (R0)
