Handwritten Annotation Spotting in Printed Documents Using Top-Down Visual Saliency Models

Published: 13 December 2021
Abstract

    In this article, we address the problem of localizing textual and symbolic annotations on the scanned image of a printed document. Previous approaches have treated annotation extraction as a binary classification of text into printed and handwritten. In this work, we further subcategorize the annotations as underlines, encirclements, inline text, and marginal text. We have collected a new dataset of 300 documents containing all classes of annotations marked around or in between printed text. Using this dataset as a benchmark, we report the results of two saliency formulations, CRF Saliency and Discriminant Saliency, for predicting salient patches that can correspond to different types of annotations. We also compare our work with recent semantic segmentation techniques based on deep models. Our analysis shows that Discriminant Saliency can be considered the preferred approach for fast localization of patches containing different types of annotations. The saliency models were learned on a small dataset but still give performance comparable to the deep networks for pixel-level semantic segmentation. We show that saliency-based methods give better outcomes with limited annotated data than more sophisticated segmentation techniques that require a large training set to learn the model.
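    As a rough illustration of the top-down (class-conditioned) saliency idea described in the abstract, the sketch below scores document patches by weighting simple features with their mutual information against annotation labels, loosely in the spirit of Discriminant Saliency. The feature extractor, the helper names (patch_features, mutual_information, saliency_scores), and the histogram-based MI estimate are illustrative assumptions, not the authors' pipeline.

```python
# Illustrative sketch only (assumed names and features, not the paper's code):
# top-down saliency scores patches by how well their features discriminate
# annotation patches from plain printed-text patches.
import numpy as np

def patch_features(patch):
    """Toy features for a grayscale patch: ink density and gradient energy."""
    gy, gx = np.gradient(patch.astype(float))
    return np.array([1.0 - patch.mean() / 255.0,
                     float(np.mean(np.sqrt(gx ** 2 + gy ** 2)))])

def mutual_information(feature, labels, bins=16):
    """MI (nats) between a quantised scalar feature and binary 0/1 labels."""
    edges = np.histogram_bin_edges(feature, bins=bins)
    q = np.digitize(feature, edges[1:-1])          # bin indices in 0..bins-1
    joint = np.zeros((bins, 2))
    for f, y in zip(q, labels):
        joint[f, y] += 1
    joint /= joint.sum()
    pf = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pf @ py)[nz])))

def saliency_scores(train_X, train_y, test_X):
    """Weight each feature by its discriminability (MI), then score test patches."""
    d = train_X.shape[1]
    weights = np.array([mutual_information(train_X[:, j], train_y) for j in range(d)])
    # Orient features so that larger values point towards the annotation class.
    corr = np.array([np.corrcoef(train_X[:, j], train_y)[0, 1] for j in range(d)])
    signs = np.sign(np.nan_to_num(corr, nan=1.0))
    z = (test_X - train_X.mean(axis=0)) / (train_X.std(axis=0) + 1e-8)
    return z @ (weights * signs)                   # one saliency score per patch
```

    In use, one would slide a window over the page image, score each patch, and threshold the resulting map to flag likely annotation regions for finer classification into underlines, encirclements, inline, or marginal text.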


      Published In

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 3
      May 2022
      413 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3505182

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 13 December 2021
      Accepted: 01 September 2021
      Revised: 01 August 2021
      Received: 01 April 2020
      Published in TALLIP Volume 21, Issue 3


      Author Tags

      1. Discriminant saliency
      2. handwritten annotations
      3. CRF (conditional random field)
      4. sparse codes
      5. FCN (fully convolutional network)

      Qualifiers

      • Research-article
      • Refereed
