
HTR-Flor++: A Handwritten Text Recognition System Based on a Pipeline of Optical and Language Models

Published: 29 September 2020

    Abstract

    Offline Handwritten Text Recognition (HTR) is a challenging computer vision task in which images are the only source of information. Several approaches to optical modeling have been developed, such as Hidden Markov Models (HMM) and recurrent bidirectional or multidimensional layers. The current state of the art combines deep learning techniques in Convolutional Recurrent Neural Networks (CRNN), whose recurrent layers still suffer from the vanishing gradient problem when processing very long texts. In addition, high-performance models generally have millions of trainable parameters and a high computational cost. Recently, however, a new optical model architecture, Gated-CNN, demonstrated improvements that complement CRNN modeling. In this work, we present a new small architecture for HTR, based on Gated-CNN, integrated with two language model steps at the character and word levels, respectively. We compared it with 9 state-of-the-art approaches and validated the results on the public IAM dataset. The proposed model surpasses the results obtained by different approaches in the literature, reaching a character error rate (CER) of 2.7% and a word error rate (WER) of 5.6%, an improvement of 13% over the best results on the IAM dataset.
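
The CER and WER figures in the abstract are error rates. They are conventionally defined as the edit (Levenshtein) distance between the recognized text and the ground truth, divided by the length of the ground truth, counted in characters for CER and in words for WER. The abstract does not describe the authors' evaluation script, so the following is only a minimal sketch of that standard definition. The function names `edit_distance`, `cer`, and `wer` are illustrative assumptions, not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits per reference word (whitespace split)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    truth = "handwritten text"
    predicted = "handwriten text"  # one character missing
    print(cer(truth, predicted))   # 1 edit over 16 characters: 0.0625
    print(wer(truth, predicted))   # 1 wrong word out of 2: 0.5
```

A single dropped character costs one character edit but makes a whole word wrong, which is why WER is typically higher than CER for the same output, as in the 2.7% and 5.6% reported above. Corpus-level figures are usually obtained by aggregating over all test lines; whether edits are summed before dividing or per-line rates are averaged is an evaluation choice the abstract does not state.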

        Published In

        DocEng '20: Proceedings of the ACM Symposium on Document Engineering 2020
        September 2020
        130 pages
        ISBN: 9781450380003
        DOI: 10.1145/3395027

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Deep Neural Networks
        2. Gated-CNN
        3. Language Models
        4. Offline Handwritten Text Recognition
        5. Optical Character Recognition

        Qualifiers

        • Short-paper
        • Research
        • Refereed limited

        Funding Sources

        • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
        • Conselho Nacional de Desenvolvimento Científico e Tecnológico

        Conference

        DocEng '20: ACM Symposium on Document Engineering 2020
        September 29 - October 1, 2020
        Virtual Event, CA, USA

        Acceptance Rates

        Overall Acceptance Rate 178 of 537 submissions, 33%
