Optical Character Recognition Technique Algorithms
ISSN: 1992-8645
www.jatit.org
E-ISSN: 1817-3195
INTRODUCTION
Neural Networks
This strategy simulates the way the human
neural system works: it samples the pixels in each
image and matches them to a known index of
character pixel patterns. The ability to recognize
characters through abstraction is well suited to fixed
documents and damaged text. Neural networks are
ideal for specific types of problems, such as
processing stock market data or finding trends in
graphical patterns. Among all these approaches,
neural networks are more efficient than the others.
Figure 4: Structure of ANN
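The pixel-matching idea described above can be illustrated with a small sketch. Note that this is a plain nearest-pattern lookup using Hamming distance, not a neural network; it only shows what "matching sampled pixels to a known index of character pixel patterns" might look like. The 3x3 patterns and the names `KNOWN_PATTERNS`, `hamming_distance` and `match_character` are hypothetical and chosen for illustration.

```python
# Hypothetical index of known 3x3 character patterns (1 = ink, 0 = background).
KNOWN_PATTERNS = {
    "I": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "L": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
    "T": (1, 1, 1,
          0, 1, 0,
          0, 1, 0),
}


def hamming_distance(a, b):
    """Count the pixel positions where two equally sized patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same number of pixels")
    return sum(x != y for x, y in zip(a, b))


def match_character(pixels, index=KNOWN_PATTERNS):
    """Return the label of the indexed pattern closest to the sampled pixels."""
    return min(index, key=lambda label: hamming_distance(pixels, index[label]))


# A "T" with one spurious ink pixel in the bottom-right corner.
damaged_t = (1, 1, 1,
             0, 1, 0,
             0, 1, 1)
print(match_character(damaged_t))
```

A trained network replaces the fixed distance measure with learned weights, which is what lets it tolerate damage that a literal pixel comparison would not.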
6. PROPOSED ALGORITHM
processing unit and deriving the corresponding
adjustments needed to produce the correct output.
The connection weights are then adjusted,
and the neural network has thereby learned
from experience. Once the network is
trained, it will provide the desired output for
any of the input patterns.
The network undergoes supervised training,
with a finite number of pattern pairs
consisting of an input pattern and a desired
or target output pattern.
An input pattern is presented at the input
layer. The neurons here pass the pattern
activations to the next layer neurons, which
are in a hidden layer.
The outputs of the hidden layer neurons are
obtained from the weights and the inputs. These
hidden layer outputs become inputs to the
output neurons, which process the inputs
using an optional bias and a threshold
function.
The final output of the network is
determined by the activations from the
output layer.
A similar computation, still based on the
error in the output, is made for the
connection weights between the input and
hidden layers.
The procedure is repeated with each pattern
pair assigned for training the network. Each
pass through all the training patterns is
called a cycle or an epoch. The process is
then repeated as many cycles as needed until
the error is within a prescribed tolerance.
The adjustment for the threshold value of a
neuron in the output layer is obtained by
multiplying the calculated error at the
output neuron by the learning rate
and the momentum parameter used in the
adjustment calculation for weights at this
layer.
= e
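The training procedure described in the steps above can be sketched as follows. This is a minimal sketch of standard back propagation with a momentum term, not the authors' modified method, whose details are not recoverable from the text. The sigmoid threshold function, the squared-error measure, the way momentum enters (learning rate times error, plus momentum times the previous adjustment), the parameter values, and all names (`forward`, `train`, `or_pairs`) are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Threshold (activation) function; the sigmoid is an assumed choice."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(pattern, w_ih, b_h, w_ho, b_o):
    """Propagate one input pattern through the hidden and output layers."""
    hidden = [sigmoid(sum(x * w_ih[i][j] for i, x in enumerate(pattern)) + b_h[j])
              for j in range(len(b_h))]
    output = [sigmoid(sum(h * w_ho[j][k] for j, h in enumerate(hidden)) + b_o[k])
              for k in range(len(b_o))]
    return hidden, output


def train(pairs, n_hidden=4, rate=0.5, momentum=0.5,
          tolerance=0.01, max_epochs=20000, seed=0):
    """Supervised training on (input pattern, target pattern) pairs."""
    rng = random.Random(seed)
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    w_ih = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)]
    w_ho = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_hidden)]
    b_h, b_o = [0.0] * n_hidden, [0.0] * n_out
    # Previous adjustments, kept for the momentum term.
    dw_ih = [[0.0] * n_hidden for _ in range(n_in)]
    dw_ho = [[0.0] * n_out for _ in range(n_hidden)]
    db_h, db_o = [0.0] * n_hidden, [0.0] * n_out

    total_error = float("inf")
    epoch = 0
    for epoch in range(1, max_epochs + 1):  # one pass over all pairs = one epoch
        total_error = 0.0
        for pattern, target in pairs:
            hidden, output = forward(pattern, w_ih, b_h, w_ho, b_o)
            total_error += sum((t - o) ** 2 for t, o in zip(target, output))
            # Error terms at the output layer, then propagated back to the hidden layer.
            err_o = [(t - o) * o * (1 - o) for t, o in zip(target, output)]
            err_h = [h * (1 - h) * sum(err_o[k] * w_ho[j][k] for k in range(n_out))
                     for j, h in enumerate(hidden)]
            # Adjust hidden-to-output weights and output thresholds (biases).
            for k in range(n_out):
                for j in range(n_hidden):
                    dw_ho[j][k] = rate * err_o[k] * hidden[j] + momentum * dw_ho[j][k]
                    w_ho[j][k] += dw_ho[j][k]
                db_o[k] = rate * err_o[k] + momentum * db_o[k]
                b_o[k] += db_o[k]
            # The similar computation for input-to-hidden weights and hidden biases.
            for j in range(n_hidden):
                for i in range(n_in):
                    dw_ih[i][j] = rate * err_h[j] * pattern[i] + momentum * dw_ih[i][j]
                    w_ih[i][j] += dw_ih[i][j]
                db_h[j] = rate * err_h[j] + momentum * db_h[j]
                b_h[j] += db_h[j]
        if total_error < tolerance:  # stop once the error is within tolerance
            break
    return (w_ih, b_h, w_ho, b_o), epoch, total_error


# Toy pattern pairs (logical OR), chosen only to keep the example small.
or_pairs = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]
network, epochs_used, final_error = train(or_pairs)
for pattern, target in or_pairs:
    _, output = forward(pattern, *network)
    print(pattern, target, round(output[0], 3))
print("epochs:", epochs_used, "error:", round(final_error, 4))
```

For character recognition, each input pattern would be the flattened pixel grid of a character image and each target pattern an encoding of its class; the toy OR data only stands in for such pairs.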
7. APPLICATIONS
Optical character recognition has been applied to
a number of applications. Some of them are
explained below.
A. Invoice Imaging
Invoice imaging is widely used in many
business applications to keep track of financial
records and prevent a backlog of payments from
piling up. In government agencies and independent
organizations, OCR simplifies data collection and
analysis, among other processes. As the technology
continues to develop, more and more applications
are found for OCR technology, including increased
use of handwriting recognition. Furthermore, other
technologies related to OCR, such as barcode
recognition, are used daily in retail and other industries.
B. Legal Industry
The legal industry is also one of the
beneficiaries of OCR technology. OCR is used
to digitize documents, which are entered directly
into a computer database. Legal professionals can
then search huge databases for the documents they
require by simply typing a few keywords.
E. Captcha
A CAPTCHA is a program that can
generate and grade tests that humans can pass but
current computer programs cannot. Hacking is
a serious threat to internet usage. Nowadays, most
human activities, such as economic transactions,
admission for education, registrations and travel
bookings, are carried out through the internet, and
all of these require a password, which can be
misused by hackers. Hackers create programs for
dictionary attacks and automatic false enrolments,
which waste the memory and resources of a website.
A dictionary attack is an attack against
password-authenticated systems in which a hacker
writes a program to repeatedly try different
passwords, for example from a dictionary of the
most common passwords. In a CAPTCHA, an image
consisting of a series of letters or numbers is
generated and obscured by image distortion
techniques, size and font variation, distracting
backgrounds, random segments, highlights and noise
in the image. This system can be used to remove
this noise and segment the image to make the image
tractable for OCR (Optical Character Recognition)
systems.
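As a rough illustration of the noise removal and segmentation step mentioned above, the following sketch binarizes a small grayscale grid, clears isolated speckle pixels, and splits the result into character regions at blank columns. It is a simplified stand-in under stated assumptions, not the paper's method: the threshold of 128, the 8-neighbour speckle rule, the blank-column segmentation, and all function names are hypothetical. Real CAPTCHA images with overlapping or distorted characters need considerably more than this.

```python
def binarize(gray, threshold=128):
    """Map a grayscale grid (0-255, dark ink) to 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_isolated_pixels(img):
    """Clear ink pixels that have no ink among their 8 neighbours (speckle noise)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1:
                neighbours = sum(
                    img[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))
                ) - 1  # subtract the pixel itself
                if neighbours == 0:
                    out[y][x] = 0
    return out


def segment_columns(img):
    """Return (start, end) column ranges, end exclusive, separated by blank columns."""
    w = len(img[0])
    has_ink = [any(row[x] for row in img) for x in range(w)]
    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, w))
    return segments


# Toy image: two bar-shaped glyphs plus one speckle in the top-right corner.
W, B = 255, 0
gray = [
    [W, W, W, W, W, W, W, W, B],
    [W, B, B, W, W, B, B, W, W],
    [W, B, B, W, W, B, B, W, W],
    [W, B, B, W, W, B, B, W, W],
    [W, W, W, W, W, W, W, W, W],
]
clean = remove_isolated_pixels(binarize(gray))
print(segment_columns(clean))
```

Each returned column range can then be cropped out and passed to the recognizer as a separate character image.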
F. Institutional Repositories and Digital Libraries
C. Banking
Another important application of OCR is
in banking, where it is used to process cheques
without human involvement. A cheque can be
inserted into a machine, where the system scans the
amount to be issued and the correct amount of
money is transferred. This technology has nearly
been perfected for printed cheques and is fairly
accurate for handwritten cheques as well, reducing
the waiting time in banks.
D. Healthcare
Healthcare has also seen an increase in the
use of OCR technology to process paperwork.
Healthcare professionals always have to deal with
large volumes of forms for patients, including
insurance forms as well as general health forms. To
keep up with all of this information, it is useful to
input relevant data into an electronic database that
can be accessed as necessary. Form processing
tools, powered by OCR, are able to extract
information from forms and put it into databases, so
that every patient's data is promptly recorded.
Table 1: Experimental Result of OCR System
9. CONCLUSION
In this paper we surveyed a large number of
methods of optical character recognition. We
analyzed the advantages and drawbacks of various
OCR methods. We also proposed a modified back
propagation method; back propagation is widely
used in neural networks. The proposed method
computes the error rate efficiently, which
increases the accuracy of the neural network. Our
proposed neural-network-based method provides
100% accuracy in OCR.