DOI: 10.1145/3490035.3490306
research-article

TexRGAN: a deep adversarial framework for text restoration from deformed handwritten documents

Published: 19 December 2021

Abstract

Free-form handwritten document images commonly contain deformed text, such as struck-out and underlined words. Such deformed text images drastically degrade the performance of widely used document processing applications like optical character recognition (OCR). Here we propose an end-to-end text image restoration system based on a generative adversarial network (GAN). The proposed model, TexRGAN, is perhaps the first attempt to restore deformed handwritten text, such as struck-out and underlined text images, using a GAN, and it handles both struck-out and underlined words with a single deep network. The proposed GAN uses a spatial loss as well as a structural loss to generate restored text images, with the deformed text image given as the conditioning input. The network is trained in a weakly supervised manner to cope with the unavailability of training data and to avoid the cost and error of manual annotation. We evaluate TexRGAN on various types of deformation and shapes of strike-through strokes, such as slanted strokes, nearly straight strokes, multiple strokes, underlines, and crossed strokes. TexRGAN is also evaluated directly in terms of OCR performance. The evaluation metrics show robustness and applicability in real-world scenarios.
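
The abstract does not give the exact form of the losses, so the following is only a minimal sketch of how a conditional-GAN generator objective could combine an adversarial term with a spatial term and a structural term. Everything named here is an assumption made for illustration: the spatial term is taken to be a pixel-wise L1 distance, the structural term is taken to be one minus a single-window SSIM, and the function names and the weights `lam_spatial` and `lam_struct` are hypothetical rather than taken from the paper.

```python
import numpy as np


def spatial_l1(restored, clean):
    """Assumed spatial loss: mean absolute pixel difference."""
    return float(np.mean(np.abs(restored - clean)))


def structural_ssim_loss(restored, clean, c1=0.01 ** 2, c2=0.03 ** 2):
    """Assumed structural loss: 1 - SSIM computed over the whole image.

    Images are expected to hold intensities in [0, 1].
    """
    mu_x, mu_y = restored.mean(), clean.mean()
    var_x, var_y = restored.var(), clean.var()
    cov = np.mean((restored - mu_x) * (clean - mu_y))
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return float(1.0 - ssim)


def adversarial_loss(disc_prob_fake, eps=1e-7):
    """Non-saturating generator loss: -log D(G(x) | x), averaged."""
    p = np.clip(disc_prob_fake, eps, 1.0)
    return float(-np.mean(np.log(p)))


def generator_loss(restored, clean, disc_prob_fake,
                   lam_spatial=100.0, lam_struct=10.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (
        adversarial_loss(disc_prob_fake)
        + lam_spatial * spatial_l1(restored, clean)
        + lam_struct * structural_ssim_loss(restored, clean)
    )


if __name__ == "__main__":
    # Toy "word" image and a struck-out copy with one horizontal stroke.
    clean = np.zeros((8, 16))
    clean[2:6, 3:5] = 1.0
    deformed = clean.copy()
    deformed[4, :] = 1.0

    # Pretend the discriminator is undecided about the generated image.
    d_out = np.full((1,), 0.5)
    print("loss if the stroke is left in:", generator_loss(deformed, clean, d_out))
    print("loss for a perfect restoration:", generator_loss(clean, clean, d_out))
```

In this toy setting the loss is lower when the strike-through stroke has been removed, which is the behaviour a restoration objective of this kind is meant to reward. In an actual training loop the same terms would be written in an automatic-differentiation framework so that gradients reach the generator.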


      Information & Contributors

      Information

      Published In

      ICVGIP '21: Proceedings of the Twelfth Indian Conference on Computer Vision, Graphics and Image Processing
      December 2021
      428 pages
      ISBN:9781450375962
      DOI:10.1145/3490035
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. GANs
      2. handwritten text image restoration
      3. optical character recognition
      4. underlined and struck-out text image

      Qualifiers

      • Research-article

      Funding Sources

      • IMPRINT India Initiative, Ministry of Education, Government of India

      Conference

      ICVGIP '21

      Acceptance Rates

      Overall Acceptance Rate 95 of 286 submissions, 33%

      Bibliometrics & Citations

      Article Metrics

      • Downloads (last 12 months): 17
      • Downloads (last 6 weeks): 4
      Reflects downloads up to 16 Jan 2025

      Cited By

      • (2024) Unpaired document image denoising for OCR using BiLSTM enhanced CycleGAN. International Journal on Document Analysis and Recognition (IJDAR). DOI: 10.1007/s10032-024-00499-2. Online publication date: 3-Oct-2024
      • (2024) Deformity removal from handwritten text documents using variable cycle GAN. International Journal on Document Analysis and Recognition (IJDAR) 27:4, 615-627. DOI: 10.1007/s10032-024-00466-x. Online publication date: 7-May-2024
      • (2023) Strike off removal in Indic scripts with transfer learning. Neural Computing and Applications 35:17, 12927-12943. DOI: 10.1007/s00521-023-08433-z. Online publication date: 9-Mar-2023
      • (2023) Intelligent Document Processing in End-to-End RPA Contexts: A Systematic Literature Review. Confluence of Artificial Intelligence and Robotic Process Automation, 95-131. DOI: 10.1007/978-981-19-8296-5_5. Online publication date: 14-Mar-2023
      • (2023) TBM-GAN: Synthetic Document Generation with Degraded Background. Document Analysis and Recognition - ICDAR 2023, 366-383. DOI: 10.1007/978-3-031-41679-8_21. Online publication date: 19-Aug-2023
