Abstract
Handwriting recognition is the capability of a computer system to transform human handwriting into machine-processable text. It has applications in many fields, such as bank-cheque processing, postal-address interpretation, document archiving, mail sorting, and form processing in administrative and insurance offices. Many different scripts are used to write the world's languages. Over the past few years, many researchers have worked on handwriting recognition for various non-Indic and Indic scripts, but only a limited number of systems are available for word recognition in these scripts. This paper presents an extensive systematic survey of word recognition techniques, classified broadly by the script in which a word is written. It also presents an experimental evaluation of word recognition tools and techniques: different databases have been surveyed to evaluate the performance of the techniques used to recognize words, and the recognition accuracies achieved are reported. The paper reflects efforts in two directions, non-Indic and Indic scripts. We raise awareness of the potential benefits of word recognition techniques, identify the need to develop an efficient word recognition technique, and provide recommendations for future research. We also observe that research in this area is quite thin and that more work is needed, particularly on word recognition for printed and handwritten documents in Indic scripts.
Appendices
Appendix 1: Quality assessment forms
1.1 Screening question
Section-1
Does the research paper refer to word recognition? | Yes |
Consider: The paper includes the study of word recognition. All types of studies, i.e., case studies, experimental studies and research papers, are included. |
Section-1 is evaluated first. If the reply is positive, then proceed to Section-2.
1.2 Screening question
Section-2
Key sub-area categorization | |
Is the research paper focusing on word recognition? | Yes |
Consider: – Is the study's main focus on word recognition? – Does the study fit into any one of the categorized sub-areas? (Apparently the study motivated different categories.) |
If the study's primary focus is on word detection, proceed to Section-3; otherwise proceed to Section-4.
1.3 Detailed questions
Section-3
Findings | |
Is there a clear statement of the findings? | Yes |
Consider: Did the study mention the word detection approach? Has the word detection technique been reported? What are the corresponding transformation technique and findings, i.e., source representation? | |
Comparison | |
Was the data reported sufficient for comparative analysis? | Yes |
Consider: Are the necessary parameters for comparison discussed? Is the study referring to handwritten word recognition explicitly? |
1.4 Detailed questions
Section-4
Findings | |
Did the study mention the type of word recognition? | Yes |
Consider: How well is the word recognition categorized? Did the study explicitly mention the type of word recognition, or is it to be inferred from the study? |
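The routing between the four form sections above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `sections_to_complete` and its two boolean parameters are assumptions introduced here and do not appear in the forms, and the sketch assumes that a negative answer in Section-1 ends the assessment.

```python
from typing import List


def sections_to_complete(refers_to_word_recognition: bool,
                         primary_focus_is_word_detection: bool) -> List[str]:
    """Return the form sections a reviewer fills in for one paper.

    Both parameter names are hypothetical; they stand for the answer to the
    Section-1 screening question and for the routing condition stated after
    the Section-2 table.
    """
    sections = ["Section-1"]  # Section-1 is always evaluated first
    if not refers_to_word_recognition:
        # Assumption: a negative reply in Section-1 stops the assessment.
        return sections
    sections.append("Section-2")
    # After Section-2, route on the study's primary focus.
    if primary_focus_is_word_detection:
        sections.append("Section-3")
    else:
        sections.append("Section-4")
    return sections


if __name__ == "__main__":
    print(sections_to_complete(True, False))
```

Under these assumptions, every included paper passes through exactly three sections, ending in either Section-3 or Section-4.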
Appendix 2: Data items extracted from all papers
Data item | Description |
---|---|
Study identifier | Unique ID for the study |
Bibliographic data | Author, year, title, source |
Type of article | Journal article, conference article, workshop paper |
Study aims/context/application domain | The aims of the study, i.e., the search focus and the research areas the paper focuses on |
Study design | Classification of study—feature extraction, classification, word recognition, comparative analysis, etc. |
What is the word recognition technique? | The techniques used to extract the features of a word, the segmentation techniques (if any), and the classification techniques used to recognize a word |
How was comparison carried out? | Values of important parameters for word recognition, i.e., recall, precision, application area, scalability, portability |
Subject system | How the data was collected: it refers to the subject system and its size |
Data analysis | Data analysis, i.e., corresponding source representation and match detection techniques are extracted |
Developer of the tool and usage | It refers to the word detection tool, developer and usage of the tool |
Study findings | Major findings or conclusions from the primary study, such as the word recognition accuracy (in percent) |
Other | Whether the study explicitly refers to handwritten or printed word recognition; any other important point |
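The data items in the table above could be held in one record per surveyed paper. The following is a minimal sketch, assuming free-text fields throughout; the class name `ExtractionRecord`, the field names, and the helper `missing_items` are hypothetical and are not part of the survey protocol. The values in the usage example are placeholders, not data from any real study.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class ExtractionRecord:
    """One row of extracted data per primary study (hypothetical field names)."""
    study_id: str                  # Study identifier
    bibliographic_data: str        # Author, year, title, source
    article_type: str              # Journal, conference, or workshop paper
    aims_and_context: str          # Study aims/context/application domain
    study_design: str              # Feature extraction, classification, etc.
    recognition_technique: str     # Feature, segmentation, classification methods
    comparison: str                # Recall, precision, scalability, portability
    subject_system: str            # How the data was collected and its size
    data_analysis: str             # Source representation, match detection
    tool_developer_and_usage: str  # Word detection tool, developer, usage
    findings: str                  # Major findings, e.g. recognition accuracy
    other: str = ""                # Handwritten vs printed, other notes


def missing_items(record: ExtractionRecord) -> List[str]:
    """List the data items that are still empty for this record."""
    return [name for name, value in asdict(record).items() if not value.strip()]


if __name__ == "__main__":
    # Placeholder values only; no real study is described here.
    record = ExtractionRecord(
        study_id="S01", bibliographic_data="placeholder",
        article_type="journal article", aims_and_context="placeholder",
        study_design="word recognition", recognition_technique="placeholder",
        comparison="placeholder", subject_system="placeholder",
        data_analysis="placeholder", tool_developer_and_usage="placeholder",
        findings="placeholder",
    )
    print(missing_items(record))
```

A check such as `missing_items` makes it easy to see which data items still need to be filled in before studies are compared.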
Cite this article
Kaur, H., Kumar, M. A comprehensive survey on word recognition for non-Indic and Indic scripts. Pattern Anal Applic 21, 897–929 (2018). https://doi.org/10.1007/s10044-018-0731-2
DOI: https://doi.org/10.1007/s10044-018-0731-2