research-article

Allograph modeling for online handwritten characters in Devanagari using constrained stroke clustering

Published: 03 October 2014

Abstract

Writer-specific variations in how a character is written, such as differences in stroke order and stroke number, are an important source of input variability when handwriting is captured “online” via a stylus, and they pose a challenge for robust online recognition of handwritten characters and words. Several studies have shown that explicitly modeling character allographs is important for achieving high recognition accuracy in a writer-independent recognition system. Previous approaches have relied on unsupervised clustering at the character or stroke level to find the allographs of a character. In this article, we instead propose constrained clustering, using automatically derived domain constraints, to find a minimal set of stroke clusters. The identified allographs have been applied to Devanagari character recognition using Hidden Markov Models and Nearest Neighbor classifiers, and the results indicate substantial improvement in recognition accuracy and/or reduction in memory and computation time compared to alternative modeling techniques.
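
To make the idea of constrained stroke clustering concrete, here is a minimal sketch rather than the article's actual algorithm. It assumes, purely for illustration, that each stroke is a small feature vector, that the only domain constraint is a cannot-link between two strokes of the same character sample, and that clusters are merged greedily by average distance up to a threshold. The function name `constrained_agglomerative`, the toy points, and the threshold are all hypothetical.

```python
from itertools import combinations


def constrained_agglomerative(points, cannot_link, threshold):
    """Greedy average-link merging that never joins a cannot-link pair.

    points      -- list of feature vectors, one per stroke (toy data here)
    cannot_link -- pairs of stroke indices that must stay in different clusters
    threshold   -- stop merging once the closest allowed pair is farther than this
    """
    clusters = [[i] for i in range(len(points))]
    forbidden = {frozenset(pair) for pair in cannot_link}

    def distance(a, b):
        # Average Euclidean distance between all members of the two clusters.
        total = 0.0
        for i in a:
            for j in b:
                total += sum((x - y) ** 2 for x, y in zip(points[i], points[j])) ** 0.5
        return total / (len(a) * len(b))

    def allowed(a, b):
        # A merge is allowed only if no member pair is constrained apart.
        return all(frozenset((i, j)) not in forbidden for i in a for j in b)

    while True:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            if not allowed(clusters[a], clusters[b]):
                continue
            d = distance(clusters[a], clusters[b])
            if d <= threshold and (best is None or d < best[0]):
                best = (d, a, b)
        if best is None:
            return [sorted(c) for c in clusters]
        _, a, b = best  # a < b, so deleting index b leaves index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]


# Two hypothetical writers each write the same two-stroke character.
# Strokes 0 and 1 come from writer A, strokes 2 and 3 from writer B.
strokes = [(0, 0), (10, 0), (1, 0), (8, 0)]
same_sample = [(0, 1), (2, 3)]

print(constrained_agglomerative(strokes, same_sample, 20.0))
print(constrained_agglomerative(strokes, [], 20.0))
```

With the cannot-link pairs in place, the two writers' matching strokes end up grouped together while strokes from the same sample stay apart. With no constraints and the same loose threshold, everything collapses into a single cluster, which is the kind of over-merging that domain constraints are meant to prevent.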

Cited By

  • (2016) An End-to-End System for Bangla Online Handwriting Recognition. 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 373-378. DOI: 10.1109/ICFHR.2016.0076. Online publication date: Oct-2016
  • (2015) A Path-Based Distance for Street Map Comparison. ACM Transactions on Spatial Algorithms and Systems 1:1, 1-28. DOI: 10.1145/2729977. Online publication date: 29-Jul-2015

    Published In

    ACM Transactions on Asian Language Information Processing  Volume 13, Issue 3
    September 2014
    83 pages
    ISSN:1530-0226
    EISSN:1558-3430
    DOI:10.1145/2676410
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 October 2014
    Accepted: 01 April 2014
    Revised: 01 April 2014
    Received: 01 August 2013
    Published in TALIP Volume 13, Issue 3

    Author Tags

    1. Devanagari character recognition
    2. allograph modeling
    3. constrained stroke clustering
    4. online handwriting recognition

    Qualifiers

    • Research-article
    • Research
    • Refereed
