DOI: 10.1145/3508546.3508615

Handwritten Mathematical Expression Recognition with Self-Attention

Published: 25 February 2022

    Abstract

    Attention-based encoder-decoder models have achieved great success in handwritten mathematical expression recognition in recent years. However, these methods suffer from attention drift: under the RNN-based local attention mechanism, high similarity between encoded features can confuse the attention. To address this problem, we propose an encoder-decoder model with self-attention, which captures the global information of the feature map and fuses it with the local information of the CNN as complementary features. Experiments are conducted on the CROHME 2014 and CROHME 2016 competition datasets. When trained only on the official training dataset, the proposed method achieves recognition accuracies of 51.98% and 50.74% on CROHME 2014 and CROHME 2016, respectively, significantly outperforming the other methods. These improvements demonstrate the effectiveness of the self-attention module.
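
The idea of capturing global information from a CNN feature map and fusing it with the local features can be illustrated with a generic non-local style self-attention block. The following is a minimal NumPy sketch under stated assumptions, not the authors' architecture: the function name `self_attention_fuse`, the projection matrices `w_q`, `w_k`, `w_v`, `w_o`, the scaled dot-product form, and the residual-sum fusion are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_fuse(feat, w_q, w_k, w_v, w_o):
    """Hypothetical self-attention block over a CNN feature map.

    feat: (H, W, C) local CNN features.
    w_q, w_k, w_v: (C, D) projections; w_o: (D, C) output projection.
    Returns the fused map (H, W, C) and the attention matrix (H*W, H*W).
    """
    h, w, c = feat.shape
    x = feat.reshape(h * w, c)                  # flatten to N = H*W positions
    q, k, v = x @ w_q, x @ w_k, x @ w_v         # each (N, D)
    scores = q @ k.T / np.sqrt(q.shape[-1])     # all-pairs similarity (N, N)
    attn = softmax(scores, axis=-1)             # each row sums to 1
    global_ctx = attn @ v                       # every position sees all others
    fused = x + global_ctx @ w_o                # assumed fusion: residual sum
    return fused.reshape(h, w, c), attn


# Toy example with random weights (sizes chosen only for illustration).
rng = np.random.default_rng(0)
H, W, C, D = 4, 6, 8, 4
feat = rng.standard_normal((H, W, C))
w_q, w_k, w_v = (rng.standard_normal((C, D)) * 0.1 for _ in range(3))
w_o = rng.standard_normal((D, C)) * 0.1

fused, attn = self_attention_fuse(feat, w_q, w_k, w_v, w_o)
print(fused.shape, attn.shape)  # (4, 6, 8) (24, 24)
```

Because each row of `attn` spans every spatial position, the context added to a location is not limited to a local window. That is the property the abstract relies on to counter attention drift, while the residual term keeps the original local CNN features available as complementary information.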

    Cited By

    • (2023) Elevating Handwritten Mathematical Expression Recognition: Unveiling 2D Structural Insights Through Weak Supervision. 2023 2nd International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM), pp. 352-357. DOI: 10.1109/MLCCIM60412.2023.00057. Online publication date: 25-Jul-2023

    Published In

    ACAI '21: Proceedings of the 2021 4th International Conference on Algorithms, Computing and Artificial Intelligence
    December 2021
    699 pages
    ISBN: 9781450385053
    DOI: 10.1145/3508546

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. handwritten mathematical expression
    2. non-local
    3. offline recognition
    4. self-attention

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Natural Science Foundation of Fujian Province
    • Natural Science Foundation of China
    • Industry-University Cooperation Project of Fujian Science and Technology Department

    Conference

    ACAI'21

    Acceptance Rates

    Overall Acceptance Rate 173 of 395 submissions, 44%
