Hand Written Bangla Numerals Recognition For Automated Postal System
to
= e = (2)
where $V(x)$ is the intensity value of the pixel at position $x = (x, y)$. We assume that the prior probability
that a particular pixel belongs to a given class is equal and homogeneous, i.e., $\Pr(x \in c)$ is the same
for each class and for all pixels.
Step 4: Apply the Bayes rule to calculate the posterior probability that a given pixel belongs to a
particular class, given its intensity:
$$\Pr\bigl(x \in c \mid V(x) = v\bigr) = \frac{\Pr\bigl(V(x) = v \mid x \in c\bigr)\,\Pr(x \in c)}{\sum_{c'} \Pr\bigl(V(x) = v \mid x \in c'\bigr)\,\Pr(x \in c')} \tag{3}$$
After applying region probability estimation and region classification, we obtain the images shown in Fig-5
below.
Fig-5: (a) preprocessed image; (b) resultant image after region probability estimation and region classification
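Step 4 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names `posterior` and `classify` and the two example likelihood values are hypothetical, and the class-conditional likelihoods Pr(V(x) = v | x ∈ c) are assumed to have been estimated already. With equal priors, as assumed above, the posterior reduces to the normalized likelihood.

```python
def posterior(likelihoods, priors=None):
    """Bayes rule for a single pixel.

    likelihoods[c] = Pr(V(x) = v | x in c) for each class c (assumed given).
    priors[c]      = Pr(x in c); defaults to equal priors for all classes.
    Returns Pr(x in c | V(x) = v) for each class c.
    """
    n = len(likelihoods)
    if priors is None:
        priors = [1.0 / n] * n
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    evidence = sum(joint)  # denominator: sum over all classes
    if evidence == 0.0:
        # Intensity has zero likelihood under every class: fall back to the priors.
        return list(priors)
    return [j / evidence for j in joint]


def classify(likelihoods, priors=None):
    """Assign the pixel to the class with the largest posterior."""
    post = posterior(likelihoods, priors)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical likelihoods of one intensity value under two classes.
post = posterior([0.02, 0.08])
label = classify([0.02, 0.08])
```

In this toy case the pixel is assigned to the second class (index 1), whose posterior is about 0.8.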
4.3 Recognition of numerals
Numerals are recognized with a Support Vector Machine (SVM). The input to the SVM is the output of
region probability estimation and region classification. The SVM is described below:
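$$f(x) = \operatorname{sgn}\left(\sum_{i=0}^{l} \alpha_i y_i\, k(x_i, x) + b\right) \tag{4}$$
$$\operatorname{sgn}(u) = \begin{cases} 1 & \text{for } u > 0 \\ -1 & \text{for } u < 0 \end{cases} \tag{5}$$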
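The decision rule in (4) and (5) can be illustrated with a small Python sketch. Several things here are assumptions rather than details from the paper: the Gaussian RBF kernel and its `gamma` value, the treatment of u = 0 (which (5) leaves undefined), and the toy support vectors, labels and coefficients, which are hypothetical and not trained values.

```python
import math


def sgn(u):
    """Equation (5). The case u == 0 is not defined there; +1 is assumed here."""
    return 1 if u >= 0 else -1


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian RBF kernel, an assumed choice for k in equation (4)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def decide(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Equation (4): sgn(sum_i alpha_i * y_i * k(x_i, x) + b)."""
    u = sum(a * y * kernel(sv, x)
            for sv, y, a in zip(support_vectors, labels, alphas))
    return sgn(u + b)


# Hypothetical toy model with one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
b = 0.0

near_negative = decide((0.1, 0.0), support_vectors, labels, alphas, b)
near_positive = decide((1.9, 2.1), support_vectors, labels, alphas, b)
```

A point close to the negative support vector is labelled -1 and a point close to the positive one is labelled +1. The sketch is a binary classifier; how binary decisions are combined across the ten digit classes is not specified in this section.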
The appeal of SVMs lies in their strong connection to the underlying statistical learning theory. According to the
structural risk minimization principle, a function that classifies the training data accurately and belongs to
a set of functions with the lowest capacity (in particular, the lowest VC-dimension) will generalize best, regardless of
the dimensionality of the input space. In the case of a canonical hyperplane, minimizing the VC-dimension
corresponds to maximizing the margin. As a result, for many applications SVM has been shown to provide
better generalization performance than conventional techniques. Given training data points
$(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)$, where $x_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}$ is the target value, SVM approximates the function using the
following form:
$$f(x) = \operatorname{sgn}\bigl(w \cdot \phi(x) + b\bigr) \tag{6}$$
where $\phi(x)$ is the image of the input $x$ in a high-dimensional feature space under a nonlinear mapping.
The coefficients $w$ and $b$ are estimated by minimizing the regularized risk function. The support vector machine for
a pattern recognition problem can be formulated as a quadratic optimization problem [10].
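As a brief sketch of how form (6) relates to the kernel form (4), write $\phi$ for the nonlinear feature map in (6). Under the standard SVM result that the optimal weight vector is a linear combination of the mapped training points (an assumption here, since the derivation is not given in the text), and taking the kernel to be the inner product in feature space, substituting into (6) recovers (4):

```latex
w = \sum_{i} \alpha_i y_i \, \phi(x_i), \qquad k(x_i, x) = \phi(x_i) \cdot \phi(x)
\;\Longrightarrow\;
f(x) = \operatorname{sgn}\bigl(w \cdot \phi(x) + b\bigr)
     = \operatorname{sgn}\Bigl(\sum_{i} \alpha_i y_i \, k(x_i, x) + b\Bigr)
```

The practical consequence is that $\phi$ never has to be evaluated explicitly: only kernel values between the input and the training points with nonzero $\alpha_i$ are needed.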
V. Result and Comparison with Existing System
A total of 180 digits, including some scanned digits, were taken as sample digits. All of them were extracted
correctly, so the rate of success in the preprocessing phase is 100%.
Table 1: Result of the experiment
No. of sample digits        300
No. of extracted digits     298
No. of extraction errors    2
Rate of success             99.33%
Rate of error               0.67%
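The two rates in Table 1 follow directly from the counts. The short check below assumes, as the table suggests, that the success rate is the number of extracted digits divided by the number of sample digits; the helper name `rates` is hypothetical.

```python
def rates(total, extracted):
    """Return (success %, error %) rounded to two decimals."""
    success = 100.0 * extracted / total
    error = 100.0 * (total - extracted) / total
    return round(success, 2), round(error, 2)


success_rate, error_rate = rates(total=300, extracted=298)
print(success_rate, error_rate)  # 99.33 0.67
```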
The performance of an OCR system depends on two factors, namely response time and recognition rate. Table 2
compares recognition rates and Table 3 compares response times.
Table 2: Recognition rate comparison (%) for digits 0 to 9
Approach           0      1      2      3      4      5      6      7      8      9      Avg
IRPCA              93.55  89.17  90.53  86.27  87.70  90.08  94.20  98.79  98.02  94.44  92.27
PCA+SVM            97.01  91.12  70.38  82.59  93.33  93.14  77.23  91.27  90.67  83.37  87.01
KPCA+SVM           97.08  92.11  71.25  81.85  93.33  92.84  76.37  91.36  90.66  84.10  87.10
KPS                99.06  92.52  93.58  91.95  97.88  97.25  93.14  99.51  98.42  90.64  95.36
Integrated system  98.87  94.41  92.56  91.25  95.56  95.25  93.02  98.36  97.89  93.25  95.06
Proposed method    98.09  95.75  94.56  96.36  99.32  96.88  95.96  97.25  96.99  96.83  96.80
Table 3: Response time comparison (in seconds)
IRPCA    KPS     Integrated system    Proposed system
0.105    0.23    0.36                 0.21
Tables 2 and 3 show that the proposed method achieves the highest average recognition rate of the compared
systems (96.80%), while its response time of 0.21 s is acceptable: slower than IRPCA (0.105 s) but faster than
KPS (0.23 s) and the integrated system (0.36 s).
VI. Conclusion
This paper proposes an efficient OCR system for handwritten Bangla numeral recognition. The
proposed method is based on preprocessing, k-means clustering, Bayes' theorem and SVM. The experimental
results confirm the effectiveness of the proposed approach. The system employs a hybridization strategy to
improve recognition performance, and the recognition result achieved by the hybrid system is more reliable than
that of a single method or of the integrated method. Experiments also demonstrate that the response time of the
hybrid system is acceptable. These results indicate that hybridizing different approaches can improve the
performance of the whole recognition system. In future work, we will include more recognition approaches
in the hybrid system to achieve better recognition ability while keeping the response time acceptable. We
will also work not only on better recognition of the postal code but also on automated recognition of the whole
address part.
References
[1] B. B. Chaudhuri, U. Pal, A complete printed Bangla OCR system, Pattern Recognition, Vol. 31, No. 5, pp. 531-549, 1998.
[2] U. Pal, B. B. Chaudhuri, Indian script character recognition: a survey, Pattern Recognition, Vol. 37, pp. 1887-1899, 2004.
[3] U. Pal, On the development of an optical character recognition (OCR) system for printed Bangla script, Ph.D. Thesis, 1997.
[4] U. Pal, B. B. Chaudhuri, Automatic recognition of unconstrained offline Bangla handwritten numerals, in: T. Tan, Y. Shi, W. Gao (Eds.), Advances in Multimodal Interfaces, Lecture Notes in Computer Science, Vol. 1948, Springer, Berlin, pp. 371-378, 2000.
[6] K. Roy, S. Vajda, U. Pal, B. B. Chaudhuri, A system towards Indian postal automation, in: Proceedings of the Ninth International Workshop on Frontiers in Handwriting Recognition (IWFHR-9 2004), pp. 580-585, October 2004.
[8] U. Pal, A. Belaïd, Ch. Choisy, Touching numeral segmentation using water reservoir concept, Pattern Recognition Letters, Vol. 24, pp. 261-272, 2003.
[9] Ying Wen, Yue Lu, Pengfei Shi, Handwritten Bangla numeral recognition system and its application to postal automation, Pattern Recognition, Vol. 40, pp. 99-107, 2007.
[10] J.-X. Dong, A. Krzyzak, C. Y. Suen, Fast SVM training algorithm with decomposition on very large data sets, IEEE Trans. Pattern Anal.