3 Paper DSD
To cite this article: Surendra P. Ramteke, Ajay A. Gurjar & Dhiraj S. Deshmukh (2019): A Novel
Weighted SVM Classifier Based on SCA for Handwritten Marathi Character Recognition, IETE
Journal of Research, DOI: 10.1080/03772063.2019.1623093
ABSTRACT
Research on handwritten optical character recognition (OCR) of Marathi script is very challenging because of the complex structural properties of the script, which are not observed in most other scripts. This paper presents an OCR framework for handwritten Marathi document classification and recognition. Recognition of Marathi characters poses a great challenge because of the large variety of symbols and their proximity in appearance. Weighted one-against-rest support vector machines (WOAR-SVM) play a notable part in handling the large feature sets used for classification. Here, a new sine cosine algorithm is proposed for the identification of handwritten Marathi text. Preprocessing is performed using different morphological operations, and the Marathi text is flexibly segmented at three levels (line segmentation, word segmentation and character segmentation) with the Modified Pihu method. Various features, namely statistical, global transformation, geometrical and topological features, are extracted from the preprocessed image. The results obtained show that these features combined with the WOAR-SVM classifier perform best, yielding an accuracy as high as 95.14%.

KEYWORDS
Optical character recognition; WOAR-SVM; Sine cosine algorithm; Modified Pihu method; Global transformation
1. INTRODUCTION

Simulating human reading is a serious research topic in digital computing. The motivation for this effort is not only the difficulty of simulating human reading but also the prospect of efficient applications in which the printed and handwritten characters present on a document have to be transferred into a machine-readable format [1]. Automatic character recognition of printed and handwritten records has a variety of practical and commercial applications in libraries, banks and post offices [2,3]. Optical character recognition (OCR) is a field of study within image processing, pattern recognition, machine learning and artificial intelligence. OCR is the process of converting scanned images of machine-printed or handwritten text into a computer-processable format [4]. OCR of records in the English language in particular has been studied comprehensively and implemented effectively over the years [5].

OCR falls into two categories based on the data acquisition process: off-line character recognition and online character recognition [6–8]. Off-line character recognition is further divided into two parts: machine-printed and handwritten character recognition. Handwritten character recognition raises many more issues than recognition of machine-printed documents, since different people have distinctive writing styles and pen-tip sizes, and some people write with skew. Each of these difficulties motivates researchers to work on the problem. Devanagari is one of the oldest scripts in India and is used to write numerous languages such as Nepali, Hindi, Marathi, Sindhi and Sanskrit [9]. Although Marathi is a widely preferred language, very little work has been done on it.

Most of the present work in these areas is restricted to English and a few oriental languages. For Indic scripts, the absence of efficient solutions for the Marathi language has hampered the extraction of information from archives of historical and social importance. Different techniques have been developed for text character segmentation, including wavelet transforms [10,11], the curvelet transform [12,13] and gradient features [14,15]. However, these techniques have not been shown to be dependable for OCR [16,17].

In this paper, we present a modified approach for handwritten Marathi text document classification and
© 2019 IETE
2 S. P. RAMTEKE ET AL.: A NOVEL WEIGHTED SVM CLASSIFIER BASED ON SCA FOR HANDWRITTEN MARATHI CHARACTER RECOGNITION
recognition process. First, the Marathi text documents are converted into image samples. After preprocessing removes the noise, the script is segmented flexibly at three levels using the ‘Modified Pihu Method’. Various feature extraction techniques are then applied to the segmented image to extract relevant features from each character and form feature vectors. These feature vectors are used by a novel weighted one-against-rest support vector machines (WOAR-SVM) classifier for classification, and the sine cosine algorithm (SCA) is used as the optimization algorithm to recognize the Marathi word image. WOAR-SVM is a binary classification algorithm; nonetheless, it is used to classify the feature database of the document. Our approach helps to reduce the error, so the proposed strategy achieves high accuracy and, furthermore, accurately recognizes a large volume of characters. The rest of the paper is structured as follows: Section 2 provides a brief review of handwritten character identification. Section 3 describes the proposed method in detail. Section 4 presents the experimental analysis and, finally, Section 5 concludes the paper.

2. LITERATURE SURVEY: A BRIEF REVIEW

Many frameworks have described a variety of strategies for text character segmentation. Some of these works are reviewed here.

Y. Zhang et al. [18] presented a novel adversarial feature learning (AFL) model to improve handwritten character recognition (HCR) performance with limited training data; the model incorporates the prior knowledge of printed data and writer-independent semantic features. The AFL method differs from the available handcrafted feature methods in that it automatically exploits writer-independent semantic features, with standard printed data learnt objectively as prior knowledge. To address the issues of speed and storage capacity, X. Xiao et al. [19] introduced a global supervised low-rank expansion technique and an adaptive drop-weight method. They designed a nine-layer CNN for handwritten Chinese character recognition (HCCR) comprising 3,755 classes and devised an algorithm that reduces the computational cost of the network by nine times and compresses the network to 1/18 of the original size of the baseline model with a 0.21% accuracy drop. Compared with the CNN strategy for HCCR, the introduced model is around 30 times faster and 10 times more cost efficient.

The authors of [20] assess the effects of two types of character-level neural network language models (NNLMs), feed-forward NNLMs (FNNLMs) and recurrent NNLMs (RNNLMs), for improving Chinese handwriting recognition. Hybrid LMs are built by combining both neural networks (FNNLMs and RNNLMs) with BLMs. For a fair comparison with BLMs and a state-of-the-art system, the different LMs are evaluated in a system with the same character over-segmentation and classification models as used previously, using the same small text corpus. In their work, performance on both the ICDAR-2013 competition dataset and CASIA-HWDB is improved significantly. The character-level accurate rate and correct rate reach 95.88% and 95.95% on CASIA-HWDB, respectively. R. Pramanik et al. [21] introduced a novel shape decomposition-based segmentation technique to decompose compound characters into prominent shape components. Shape decomposition reduces the number of classes to recognize, which lowers the classification complexity and improves the recognition accuracy at the same time. The decomposition is performed at the segmentation area where two basic shapes are joined to form a compound character.

A system for offline recognition of cursive Arabic handwritten text based on hidden Markov models (HMMs) is presented by R. Mouhcine et al. [22]. The system is analytical and performs embedded training without explicit segmentation. Statistical and geometric features are extracted from the word image using baseline estimation, to integrate both the peculiarities of the text and the pixel distribution characteristics. These features are modeled using HMMs and trained by embedded training. The popular deep convolutional neural networks (DCNNs) were investigated by C. Boufenar et al. [23]. DCNNs have effectively replaced hand-crafted descriptors with network features and have been shown to provide better results than traditional methods. Deep learning is one of the fastest-growing areas of machine learning and promises to reshape the future of artificial intelligence. The CNN model can be used in three different ways: training the CNN from scratch; starting from a pretrained model and using a transfer learning strategy to leverage its features; and keeping the transfer learning strategy while fine-tuning the weights of the CNN architecture.

S. Roy et al. [24] presented a novel deep learning strategy for the recognition of handwritten Bangla isolated compound characters. A new benchmark of recognition accuracy is reported on the CMATERdb 3.1.3.3 dataset. Greedy layer-wise training of deep neural networks has helped to make significant progress in various pattern recognition problems. The authors apply layer-wise training to a DCNN in a supervised fashion, and the training process is augmented with the RMSProp algorithm to achieve faster convergence. To solve the problem of recognizing isolated handwritten words,
the authors of [25] introduced a new neural network architecture that combines a deep convolutional neural network with an encoder–decoder, called sequence to sequence. The presented architecture identifies characters and contextualizes them with their neighbors to recognize any given word. The authors' models are tested in several experiments on two handwritten databases, IAM and RIMES, to determine the optimal parameterization of the model.

3. PROPOSED METHOD FOR MARATHI CHARACTER RECOGNITION AND CLASSIFICATION

The design of the proposed OCR system is shown in Figure 1. Our method applies preprocessing, segmentation, feature extraction and optimization techniques for character classification and recognition. First, the text documents are converted into image samples in the preprocessing stage. Then, the Marathi script is segmented flexibly at three levels (line segmentation, word segmentation and character segmentation), with the ‘Modified Pihu Method’ proposed to enhance the segmentation accuracy. After segmentation, various features are extracted from the character image. A typical feature of handwritten text is the presentation of the text created by the author. The feature extraction stage removes data redundancy. These feature vectors are then used by a novel approach based on the weighted one-against-rest support vector machine classifier and the SCA optimization algorithm. To optimize the WOAR-SVM, the SCA algorithm is used to select the Marathi text from the handwritten document.

3.1 Image Acquisition Phase

Image acquisition is the initial phase of a character recognition system. In this phase, the input handwritten or paper document is scanned and converted into electronic form as a bitmap image such as JPEG, BMP, TIF or PNG. The acquired image (Figure 2) is fed to the preprocessing phase.

Figure 2: Sample document image
3.2 Preprocessing Phase

The next phase is preprocessing, which enhances the image quality for better recognition by the system. Preprocessing of a handwritten document is required to identify and remove all undesirable bit patterns that reduce the recognition accuracy. The main objectives of preprocessing are binarization, noise reduction and line removal. After preprocessing of the text, various feature extraction techniques are used to extract features for the recognition process.

(a) Grayscale conversion: The stored bitmap images (JPEG, BMP, TIF and PNG) are converted to grayscale format. The images are then available in matrix form, where the value of each element corresponds to how bright or dark the pixel at the corresponding position should be.
(b) Binarization: The binarization process uses a global threshold approach to convert a grayscale image into a binary image (Figure 3). Thresholding increases the processing rate and reduces the required storage space.
(c) Image Noise Removal: The noise generated in images by scanning devices consists of separated line segments, bumps in lines and gaps. The main distortions are local variations, dilation and erosion, etc., and it is essential to correct these defects. Median filtering is used to perform noise removal; it reduces the salt-and-pepper noise in the image.

3.3 Segmentation Phase

In the segmentation phase, the continuous text of the preprocessed image is broken down into sub-images of individual characters. Segmentation plays a notable part in the character recognition process. Segmentation processes can be classified as global or local segmentation.

Our method uses the projection profile method for segmentation. The segmentation technique comprises three essential steps: line segmentation, word segmentation and character segmentation.

a) Line Segmentation: The horizontal projection profile method is first used to compute the sum of all white pixels on every line or row, and the corresponding histogram of the image is generated as follows.
• The horizontal histogram of the image is constructed.
• The distance between two successive histograms is determined; based on the threshold value, every histogram is separated and saved.
• Finally, the segmented line is produced from the image.
b) Word Segmentation: In word segmentation, the vertical projection profile approach is used to compute the sum of all white pixels. The words are segmented as follows.
• First, the vertical histogram of the image is constructed.
• The number of white pixels in every column is found, and the columns with no white pixels are detected using the histogram.
• Every such column is replaced by 1 and the unfilled rows are set to 0, so that the words of the content have distinct pixels, and the result is saved.
• The words are segmented from the line based on the threshold value, and the procedure is repeated for each line (Figure 4).
c) Character Segmentation using Modified Pihu Method: The Modified Pihu method is proposed to overcome the limitations of the existing method [26,27]. Figure 5 shows a Marathi word and its different components. During segmentation, the Pihu method [27] does not remove the header line of the word and focuses only on the shape of the characters. So,
arrows in Figure 5.

$$\mathrm{CVWPC} = \sum_{r_{n,0}}^{w(H_l)} P_W(j); \quad \text{for } j = 0,\ 1 \le m \le W_h \tag{5}$$

where $r_{n,0}$ is the $m$th row of the first column and $P_W(j)$ denotes the black pixels; the same procedure is repeated for the whole word (Figure 5).
Figure 4: (a) Segmented Line Image (b) Segmented Word Image

Step 4: The IM is segmented using Equation (6)

$$f(\mathrm{ch\_cut}) = \begin{cases} \text{true}, & \mathrm{CVWPC} - 95\% \sum_{j=r_{n,0}}^{w(H_l)} \sum_{i=1}^{W} IM_{j,i}, \\ \text{true}, & \text{if } \left(\mathrm{mid}(\mathrm{CVWPC} - 95\%) \cup (5\%\, P_w \in lmh - 15\%)\right), \\ \text{false}, & \text{otherwise} \end{cases} \tag{6}$$
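The column-wise pixel counts used in Equations (5) and (6) are the same kind of projection profile that drives the line and word segmentation of Section 3.3. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a binary image in which text pixels are 1 and background pixels are 0; the helper names `find_runs`, `segment_lines` and `segment_words` and the `min_gap` threshold are hypothetical.

```python
import numpy as np

def find_runs(profile, min_count=1):
    """Return (start, end) pairs of consecutive indices where profile >= min_count."""
    active = profile >= min_count
    runs, start = [], None
    for idx, flag in enumerate(active):
        if flag and start is None:
            start = idx                      # a text run begins
        elif not flag and start is not None:
            runs.append((start, idx))        # the run ends just before idx
            start = None
    if start is not None:
        runs.append((start, len(active)))    # run reaches the image border
    return runs

def segment_lines(binary):
    """Horizontal projection profile: count text pixels in every row."""
    row_profile = binary.sum(axis=1)
    return find_runs(row_profile)

def segment_words(line_img, min_gap=2):
    """Vertical projection profile: count text pixels in every column.

    Runs separated by fewer than min_gap empty columns are merged,
    which plays the role of the (assumed) word-gap threshold.
    """
    col_profile = line_img.sum(axis=0)
    merged = []
    for start, end in find_runs(col_profile):
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged

# Toy page: two text lines, the first with two words, the second with one.
page = np.zeros((9, 12), dtype=np.uint8)
page[1:3, 1:4] = 1
page[1:3, 7:10] = 1
page[5:8, 2:9] = 1

lines = segment_lines(page)
words = [segment_words(page[top:bottom]) for top, bottom in lines]
print(lines)  # row ranges of the detected text lines
print(words)  # column ranges of the words in each line
```

In practice the gap threshold would have to be tuned to the handwriting, since the spacing between characters and between words varies from writer to writer.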
The LM is segmented using Equation (11).

$$f(lmh) = \begin{cases} \sum_{j=1}^{H_l} \sum_{i=1}^{W} IM_{j,i},\ \text{true}, & \text{if } (Ch_h \ge (Ch_{avg\_h} + lmh)) \\ \text{false}, & \text{otherwise, if } (Ch_h < (Ch_{avg\_h} + lmh)) \end{cases} \tag{11}$$

where $lmh$ is the height of the LM and $Ch_h$ is the height of the characters. During the character recognition process, the advantage is that storing the modifiers independently optimizes the character class count and helps to reduce the processing time. Features are extracted from the separated modifiers using the feature extraction techniques in the following section.

3.4 Feature Extraction

Feature extraction is the most vital part of the recognition system. This phase is used to remove data redundancy. Feature extraction can be characterized as extracting the most representative information from the raw data, minimizing the within-class pattern variability while enhancing the between-class pattern variability. Feature extraction methods are classified into three groups: statistical features, global transformation features, and geometrical and topological features.

3.4.1 Statistical Features
The statistical features are derived from the statistical distribution of points. They provide low complexity and high speed, tolerate variation to some extent, and are also used for reducing the dimension of the feature set. The following are the statistical features:

a) Zoning: The character frame is divided into several overlapping or non-overlapping zones. The densities of some features in the different regions are analyzed using the zoning approach [28].
b) Crossings and Distances: A popular statistical feature is the number of crossings of a contour by a line segment in a specified direction. The frame containing the character is partitioned into a set of regions in different ways, and the features of each region are then extracted.

3.4.2 Global Transformation Features
These features are invariant to global deformations such as rotation and translation. A continuous signal generally contains more data than needs to be represented for the purpose of classification. The signal is therefore represented by a linear combination of a series of simple, well-defined functions.

a) Zernike moments: Zernike moment normalization aims to make the recognition of an object independent of the size, translation and rotation of the image. The Zernike moment with order $n$ and repetition $r$ of a continuous image function $f(x, y)$ is given as

$$Z_{nr} = \frac{n + 1}{\pi} \sum_{x} \sum_{y} f(x, y)\,[v_{nr}(x, y)]^{*} \tag{12}$$

b) Hough Transform: The Hough transform is used for baseline detection in documents. It is likewise applied to characterize the parameter curves of the characters. The Hough transform is given by

$$H(a) = \sum_{i=1}^{L} h(x_i, y_i, a_1, \ldots, a_n) \tag{13}$$

c) Fourier Descriptor: The Fourier transform is widely used for shape analysis. The Fourier descriptors are formed from the transformed coefficients and represent the shape in the frequency domain. The transform generates a large number of coefficients, but a subset of the coefficients is enough to capture the overall features of the shape. The boundary of a particular shape has $K$ pixels numbered from 0 to $K - 1$. Along the contour, the $k$th pixel has position $(x_k, y_k)$. The shape is described by two parametric equations in the $(x, y)$ coordinates, combined as $s(k) = x(k) + j\,y(k)$. The discrete Fourier transform of $s(k)$ is

$$b(v) = \frac{1}{K} \sum_{k=0}^{K-1} s(k)\, e^{-j 2\pi v k / K}; \quad v = 0, 1, \ldots, K - 1 \tag{14}$$

d) Gabor Transform: The Gabor transform is a variation of the windowed Fourier transform. The window used in this case is not of a discrete size but is defined by a Gaussian function. The transform possesses optimal localization properties in both the spatial and frequency domains. The 2D Gabor transform gives an extracted feature and is represented by

$$G(\chi, \phi, \eta, \kappa) = \exp\!\left(-\frac{x^{2} + \chi^{2} y^{2}}{2\xi^{2}}\right) \cos\!\left(\frac{2\pi x}{\kappa} + \phi\right) \tag{15}$$

where $x = a\cos\theta - y\sin\theta$; the transform can be tuned by varying the parameters $\chi$, $\phi$, $\eta$ and $\kappa$.

3.4.3 Geometrical and Topological Features
These features may represent the global and local character properties and have high resistances to distortions
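As an illustration of the Fourier descriptor in Equation (14), the sketch below computes the normalized discrete Fourier transform of a closed contour and keeps only the first few coefficients, since a subset is said to be enough to capture the overall shape. It is a hedged example rather than the authors' code: the function name `fourier_descriptors`, the parameter `num_coeffs` and the use of `K` for the number of boundary pixels are assumptions, and the contour is taken to be an ordered list of (x, y) boundary points.

```python
import numpy as np

def fourier_descriptors(contour_xy, num_coeffs=8):
    """Return the first num_coeffs coefficients b(v) of a closed contour.

    contour_xy: ordered sequence of (x, y) boundary points.
    Implements b(v) = (1/K) * sum_k s(k) * exp(-j*2*pi*v*k/K),
    with s(k) = x(k) + j*y(k) and K the number of boundary points.
    """
    pts = np.asarray(contour_xy, dtype=float)
    s = pts[:, 0] + 1j * pts[:, 1]      # complex boundary signal s(k)
    b = np.fft.fft(s) / len(s)          # np.fft.fft uses the same exponent sign
    return b[:num_coeffs]

# Toy contour: the four corners of a unit square, traversed in order.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
desc = fourier_descriptors(square, num_coeffs=4)
print(desc)
```

The coefficient `b(0)` is the centroid of the contour, so discarding it is one common way to make the remaining descriptors independent of translation.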
• During optimization, the best global optimum is stored in a variable as the destination, and the positions are updated.

The next section covers recognition and classification, which take the optimized output of the feature extraction techniques as input and then determine which class it actually belongs to.

The value of $\chi_1^{+}$ scales the uncertainty measure $d_j(y_t)$ in the positive direction, whereas $\chi_1^{-}$ does the same for negative values. The final decision $d(y_t)$ is determined by Equation (23).

$$\begin{cases} d(y_t) = \arg\max_{c_j} \left[\chi_j \sum_{i=1}^{n} \beta_i y_i K(y_i, y_t) + b\right] \\ d(y_t) = \arg\max_{c_j} \left[d_j(y_t)\right] \end{cases} \tag{23}$$
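The decision rule of Equation (23) can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes that the first $y_i$ in the sum is the ±1 label of training sample $i$ for class $j$, that $K$ is a Gaussian (RBF) kernel, and that every class has its own coefficients, bias and weight $\chi_j$. The names `rbf_kernel`, `woar_decide`, `class_weights` and `gamma`, and all numeric values, are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))

def woar_decide(x_test, support, labels, betas, biases, class_weights, gamma=0.5):
    """Weighted one-against-rest decision.

    support:       (n, d) training samples kept by the classifiers
    labels:        (C, n) +1/-1 labels of each sample for each class-vs-rest problem
    betas:         (C, n) non-negative coefficients of each binary classifier
    biases:        (C,)   bias term b of each binary classifier
    class_weights: (C,)   per-class weight chi_j applied to the kernel sum
    Returns the predicted class index per test sample and the score matrix.
    """
    k = rbf_kernel(support, x_test, gamma)                    # shape (n, m)
    raw = (betas * labels) @ k                                # shape (C, m)
    scores = class_weights[:, None] * raw + biases[:, None]   # d_j(y_t)
    return np.argmax(scores, axis=0), scores

# Toy setup: two classes, one training sample per class.
support = np.array([[0.0, 0.0], [2.0, 2.0]])
labels = np.array([[1.0, -1.0], [-1.0, 1.0]])
betas = np.ones((2, 2))
biases = np.zeros(2)
class_weights = np.array([1.0, 1.0])
x_test = np.array([[0.1, 0.0], [2.0, 1.9]])

predicted, scores = woar_decide(x_test, support, labels, betas, biases, class_weights)
print(predicted)
```

With equal weights this reduces to the usual one-against-rest rule; unequal `class_weights` rescale the kernel sum of the affected classes before the arg max is taken.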
S. No   Metrics          Synthetic data   Handwritten data
1       Accuracy (%)     92.80            95.14
2       Precision (%)    93.6             96.5
3       Recall (%)       94.7             98.3
4       FAR (%)          91.24            95.04
5       FRR (%)          90.41            93.52
6       F-measure (%)    94.68            98.1
Figure 10: Analysis of (a) Accuracy (b) F-measure using the proposed method

recognition of optical character in this section. The quantitative metrics used to evaluate the classification results are accuracy, precision, recall, FAR, FRR and F-measure, as shown in Table 1.
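For reference, the six metrics named above can be computed from the counts of a binary confusion matrix as in the sketch below. The definitions are the standard ones and are an assumption here, since this section does not state how the reported values were calculated; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "far": fp / (fp + tn),   # false acceptance rate
        "frr": fn / (fn + tp),   # false rejection rate
        "f_measure": 2 * precision * recall / (precision + recall),
    }

# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```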
Table 2: Performance evaluation with different classifiers
S.No   Classifier             Classification rate (%)   Classification time (ms)
1      K-NN                   88.6                      98.04
2      SVM                    73.3                      43.59
3      WOAR-SVM (proposed)    94                        42.15

Table 4: Error comparison using the proposed method
S.No   Algorithm           Script              Error (%)   Classification time (ms)
1      Zheng et al. [30]   Synthetic data      6.2         85.37
                           Handwritten data    6.8
2      Lin et al. [31]     Synthetic data      2.5         52.16
                           Handwritten data    7.9
3      Proposed            Synthetic data      1.7         42.15
                           Handwritten data    0.3
7. H. Al-Muhtaseb, S. Mahmoud, and R. Qahwaji, “Recognition of off-line printed Arabic text using hidden Markov models,” Signal. Processing., Vol. 88, no. 12, pp. 2902–2912, Dec. 2008.

8. S. Naz, K. Hayat, M. Imran Razzak, M. Waqas Anwar, S. Madani, and S. Khan, “The optical character recognition of Urdu-like cursive scripts,” Pattern Recognit., Vol. 47, no. 3, pp. 1229–1248, Mar. 2014.

9. A. Choudhary, R. Rishi, and S. Ahlawat, “Off-line handwritten character recognition using features extracted from Binarization technique,” AASRI Procedia, Vol. 4, pp. 306–312, Jan. 2013.

10. S. Pasha, and M. Padma, “Handwritten Kannada character recognition using wavelet transform and structural features,” 2015 International Conference on Emerging Research in Electronics, Computer Science and Technology (ICERECT), Dec. 2015.

11. J. Olszewska, “Active contour based optical character recognition for automated scene understanding,” Neuro Computing, Vol. 161, pp. 65–71, Aug. 2015.

12. P. Singh, R. Sarkar, N. Das, S. Basu, M. Kundu, and M. Nasipuri, “Benchmark databases of handwritten Bangla-Roman and Devanagari-Roman mixed-script document images,” Multimed. Tools. Appl., Vol. 45, pp. 11–19, May 2017.

13. G. Verma, S. Prasad and P. Kumar, “Handwritten Hindi character recognition using curvelet transform”,

19. X. Xiao, L. Jin, Y. Yang, W. Yang, J. Sun, and T. Chang, “Building fast and compact convolutional neural networks for offline handwritten Chinese character recognition,” Pattern Recognit., Vol. 72, pp. 72–81, Dec. 2017.

20. Y. Wu, F. Yin, and C. Liu, “Improving handwritten Chinese text recognition using neural network language models and convolutional neural network shape models,” Pattern Recognit., Vol. 65, pp. 251–264, May 2017.

21. R. Pramanik, and S. Bag, “Shape decomposition-based handwritten compound character recognition for Bangla OCR,” J. Vis. Commun. Image. Represent., Vol. 50, pp. 123–134, Jan. 2018.

22. R. Mouhcine, A. Mustapha, and M. Zouhir, “Recognition of cursive Arabic handwritten text using embedded training based on HMMs,” Journal of Electrical Systems and Information Technology., Vol. 5, pp. 245–251, Apr. 2017.

23. C. Boufenar, A. Kerboua, and M. Batouche. “Investigation on deep learning for off-line handwritten.

24. S. Roy, N. Das, M. Kundu, and M. Nasipuri, “Handwritten isolated Bangla compound character recognition: A new benchmark using a novel deep learning approach,” Pattern Recognit. Lett., Vol. 90, pp. 15–21, Apr. 2017.

25. J. Sueiras, V. Ruiz, A. Sanchez, and J. Velez, “Offline continuous handwriting recognition using sequence to sequence neural networks,” Neuro Computing, Vol. 289, pp. 119–128, May 2018.
26. S. Ramana Murthy, V. Roy, M. H. Narang, and S. Gupta, “An approach to divide pre-detected Devanagari words from the scene images into characters,” Signal. Image. Video. Process., Vol. 7, no. 6, pp. 1071–1082, Nov. 2012.

27. K. Jindal, and R. Kumar, “A new method for segmentation of pre-detected Devanagari words from the scene images: Pihu method,” Comput. Electr. Eng., Vol. 70, pp. 754–763, Dec. 2017.

28. S. Rajashekararadhya and P. Ranjan, “Zone based feature extraction algorithm for handwritten Numeral recognition of Kannada script,” 2009 IEEE International Advance Computing Conference, Mar. 2009.

29. S. Mirjalili, “SCA: A sine cosine algorithm for solving optimization problems,” Knowl. Based. Syst., Vol. 96, pp. 120–133, Mar. 2016.

30. Y. Zheng, H. Li, and D. Doermann, “Machine printed text and handwriting identification in noisy document images,” IEEE Trans. Pattern Anal. Mach. Intell., Vol. 26, no. 3, pp. 337–353, Mar. 2004.

31. Y. Lin, Y. Song, Y. Li, F. Wang, and K. He, “Multilingual corpus construction based on printed and handwritten character separation,” Multimed. Tools. Appl., Vol. 76, no. 3, pp. 4123–4139, Feb. 2015.