Optical character recognition (OCR) has been researched intensively, and a large number of articles have been
published on the topic over the last few decades. Many commercial OCR systems are now available on the
market, and the character recognition problem itself can be considered mostly solved. This paper reviews
some of the methods for recognizing characters and words with fewer errors in the retrieved text. The material
serves as a guide and update for readers working in the Offline English Character/Word Recognition area. First,
the historical evolution of OCR systems and the properties of English script are presented. Then, the available
Offline English Character/Word Recognition techniques are reviewed along with their strengths. Finally, the
current status of Offline English CR is discussed, and directions for future research are suggested. The paper
also contains a comprehensive bibliography of selected papers that have appeared in reputed journals and
conference proceedings, as an aid for researchers working in the field of Offline English CR.
Keywords— Handwritten Character Recognition, Image processing, Feature extraction, Feed forward Neural
Network, Convolutional NN, SVM, LeNet-5, HMM, Hybrid HMM/ANN, Projection Based Notch-Elimination,
Artificial NN, Stroke length, Contour directional angle, Stochastic Context-Free Grammars (SCFG),
Lexicon-driven and Segmentation, Statistical Model.
I. INTRODUCTION

OFFLINE handwriting recognition is the task of determining what letters or words are present in a digital
image of handwritten text. It is of significant benefit to man-machine communication and can assist in the
automatic processing of handwritten documents. It is a subtask of Optical Character Recognition (OCR), whose
domain can be machine-print or handwriting but is more commonly machine-print. The recognition of English
handwriting presents unique challenges and benefits and has been approached more recently than the
recognition of text in other scripts. This paper describes the state of the art of this field.

Handwriting recognition is usually classified into two groups: online and offline. Online character
recognition deals with information about writing dynamics as the text is being written, while offline
character recognition deals with static information, acquired after all the text has been written. Offline
character recognition usually works from another medium of written text, such as paper. One of the main issues
in handwritten text recognition is that accuracy, in humans, depends on knowing the language in which the text
is written: the same text from the same corpus, written in a different language, can yield a different
accuracy [1]. Recognition is “offline” if it is applied to previously written text, such as images scanned in
by a scanner. The online problem is usually easier than the offline problem since more information is
available. This survey is restricted to offline systems.

A standard database is also important for handwriting recognition research. Such a database is used in the
development, evaluation, and comparison of different handwriting recognition algorithms [2]. Many databases
have been developed in the character recognition community, covering printed, handwritten, isolated, or
cursive text in various scripts. Some of them are publicly available, such as IAM [2-3] and IRONOFF [4].

In this paper, we present a review of offline handwritten English recognition (OHER) work done on
English-language scripts. The review is organized into six sections. Section I is the introduction, and
Sections II and III cover the properties of English OCR and of English scripts, respectively. In Sections IV
and V, we discuss different methodologies and performances in OCR development as well as research work done
on English character and word recognition. In Section VI, we discuss the scope of future work and conclude
the paper.

II. PROPERTIES OF ENGLISH OCR

Machine replication of human functions, like reading, is an ancient dream. However, over the last five
decades, machine reading has grown from a dream to reality. Optical character recognition has become one of
the most successful applications of technology in the field of character recognition and artificial
intelligence. Optical character recognition dates back to 1929, when Gustav
Tauschek obtained a patent on OCR in Germany, followed by Handel, who obtained a US patent on OCR in 1933.
Since then, a number of character recognition systems have been developed and are in use, including for
commercial purposes. There is still scope for building more intelligent handwritten character recognition
systems, because handwriting differs from one person to another. Each writer's style, and the shapes and
sizes of the letters, add to the difficulty of recognizing the characters [5].

The next generation is characterized by the ability to recognize a set of regular machine-printed characters
as well as hand-printed characters. In the early stages, the scope was restricted to numerals only. Such
machines appeared from the mid-1960s to the early 1970s. In this generation, the first famous OCR system was
the IBM 1287, which was exhibited at the 1965 New York World's Fair [6]. In terms of hardware configuration,
the system was a hybrid one, combining analog and digital technology. Toshiba's first automatic letter-sorting
machine for postal code numbers was also developed during this period. The methods were based on the
structural analysis approach.

The third generation can be characterized by the OCR of poor-print-quality characters and hand-printed
characters for a large character set. Commercial OCR systems with such capabilities appeared roughly during
the decade 1975 to 1985 [6–8].

The fourth generation can be characterized by the OCR of complex documents that intermix text, graphics,
tables, and mathematical symbols; unconstrained handwritten characters; color documents; and low-quality
noisy documents such as photocopies and faxes. Some pieces of work on complex documents have provided good
results. Although many pieces of work on unconstrained handwritten characters are available in the
literature, the recognition accuracy hardly exceeds 85%. Very few studies on color documents have been
published, and research on this problem is continuing. Research on noisy documents is also in progress
[9, 10].

Writing may be regarded as a culture-specific artifact. Even when the same language is used, the motor
behavior of writing, as taught and learned in early schooling, can differ from person to person. This study
investigates the direction of CR research and analyzes the limitations of the methodologies for such systems,
which can be classified by two major criteria: 1) the data acquisition process (on-line or off-line) and
2) the text type (machine-printed or handwritten) [11]. No matter which class the problem belongs to, there
are in general five major stages in the CR problem in Offline Handwritten English Recognition [13]:
1) Preprocessing; 2) Segmentation; 3) Feature Extraction; 4) Classification; 5) Post processing.

Preprocessing: The preprocessing stage is a collection of operations that apply successive transformations to
an image. It takes in a raw image and enhances it by reducing noise and distortion, and hence simplifies
segmentation, feature extraction, and consequently recognition. The quality of input text depends on many
factors.

Document history: A document that has been faxed or copied several times is harder to read than the original.
Text gets thinner or thicker, salt-and-pepper noise appears, and contrast diminishes.

Printing process: A typeset document is clearer than a typewritten one, which in turn is clearer than the
output of a dot-matrix printer. Other deformations that relate to the printing process include ink spreading
and ink chipping.

Font clarity: Exotic fonts, small font sizes, italic and bold characters, subscripts and superscripts, and the
use of multiple font sizes (e.g., drop caps) and styles complicate recognition.

Paper quality: Opaque, heavy-weight, smooth, uniform-grain paper is easier to read than lightweight,
transparent paper (e.g., newspapers).

Document condition: The presence of extraneous markings and stains makes reading harder.

Image acquisition: The digitization of on-line script is limited by tablet resolution and sampling rate, and
often introduces distortions like small zigzags. The quality of scanned text is compromised by positioning
variations (skew, translations, stretching, etc.), defocusing, unclean document glass, and limited resolution.

Once text is acquired, either on-line or off-line, it should be preprocessed to simplify recognition.
Preprocessing operations are usually specialized image processing
operations that transform the image into another with reduced noise and variation.

Those operations include binarization, filtering and smoothing, thinning, alignment, normalization, and
baseline detection. Ideally, preprocessing should remove all variations and detail from a text image that are
meaningless to the recognition method. As that goal is still elusive, preprocessing attempts to reduce noise
and data variations as much as possible.

Segmentation: The segmentation stage takes in a page image and separates the different logical parts, like
text from graphics, lines of a paragraph, and characters (or parts thereof) of a word. After the preprocessing
stage, most OCR systems isolate the individual characters or strokes before recognizing them. Segmenting a
page of text can be broken down into two levels: page decomposition and word segmentation. When working with
pages that contain different object types like graphics, headings, mathematical formulas, and text blocks,
page decomposition separates the different page elements, producing text blocks, lines, and sub-words. While
page decomposition identifies sets of logical components of a page, word segmentation separates the
characters of a sub-word.

Feature Extraction: The feature extraction stage analyzes a text segment and selects a set of features that
can be used to uniquely identify the text segment. These features are extracted and passed on in a form
suitable for the recognition phase.

Once an OCR system has an isolated pattern (character or primitive), its next step is to extract the features
of the pattern and pass them along to the classifier.

Feature extraction is one of the most difficult and important problems of pattern recognition [14]. The
selected set of features should be a small set whose values efficiently discriminate between patterns of
different classes, but are similar for patterns within the same class. The feature extraction step is closely
related to classification because the type of features extracted here must match what the classifier expects.
The two main control approaches for feature extraction and classification are interleaved control and one-step
control. In interleaved control, an OCR system alternates between feature extraction and classification. In
one such realization, the OCR system extracts a set of features from a pattern and, based on the feature
values, classifies the pattern into a (small) number of categories. The system then extracts another set of
features that are specific to each category and classifies the pattern [15]-[17].

Classification: The classification stage is the main decision-making stage of an OCR system. The
classification stage uses the features extracted in the previous stage to identify the text segment according
to preset rules. This stage may use feature models obtained in an (off-line) training (modeling) phase to
classify the test data. Classification in an OCR system is the main decision-making stage, in which the
features extracted from a pattern are compared to those of the model set. Based on the features,
classification attempts to identify the pattern as a member of a certain class. When classifying a pattern,
classification often produces a set of hypothesized solutions instead of generating a unique solution. The
(subsequent) post-processing stage uses higher-level information to select the correct solution.

Historically, classification followed two main paradigms: syntactic (or structural) and statistical (or
decision-theoretic) classification. Recently, recognition using neural networks has provided a third paradigm.

Post Processing: The post-processing stage, which is the final stage, improves recognition by refining the
decisions taken by the previous stage and recognizes words by using context. It is ultimately responsible for
outputting the best solution and is often implemented as a set of techniques that rely on character
frequencies, lexicons, and other context information.

One of the objectives of post-processing is to improve the word recognition rate (as opposed to the character
recognition rate). As classification sometimes produces a set of possible solutions instead of a unique
solution, post-processing is responsible for selecting the right solution using higher-level information that
is not available to the classifier. Post-processing also uses that higher-level information to check the
correctness of the solutions returned by the classifier. The most common post-processing operations are spell
checking and correction. Spell checking can be as simple as looking up words in a lexicon.

III. PROPERTIES OF ENGLISH SCRIPTS

Fig. 2. Samples of Handwritten English Characters A to Z

The modern English alphabet is a Latin alphabet consisting of 26 letters (each having an upper case and a
lower case
The tool used to train the system with the obtained feature vectors is the HMM, because OHR systems based on
HMMs have been shown to outperform segmentation-based approaches [30]-[32]. When HMMs are used for pattern or
character recognition, a properly trained model retains information about a character, and the trained model
can then be used to recognize an unknown character. The advantage of HMM-based systems is that they are
segmentation free; that is, no pre-segmentation of word/line images into small units such as sub-words or
characters is required [31]. On the other hand, HMM-based approaches have also been found to have some
limitations. These limitations have two causes: (a) the assumption of conditional independence of the
observations given the state sequence and (b) the restriction on feature extraction imposed by frame-based
observations [33].

Fig. 8. System overview of the proposed approach

In this research, an approach has been made to increase the recognition rate of handwritten characters by
finding both local and global features. A multiple-level HMM is designed for some specific letters that vary
widely from writer to writer. In the last section, an attempt has been made to draw a line of demarcation
between similar-looking characters.

Together, these features of the research yield an average accuracy of 98.26%. For a significant number of
letters, the accuracy rate is even close to 100%.

Rakesh Kumar Mandal and N R Manna [34] developed and implemented a concept for recognizing handwritten
character patterns called the Row-wise Segmentation Technique (RST). RST helps to a great extent in minimizing
pattern recognition errors caused by different handwriting styles. In this method, the input pattern matrix is
segmented row-wise into different groups. The target pattern is also grouped, where each group is the numeric
equivalent of the position of each letter in the English alphabet. Each input segment is fully interconnected
with each target group. The number of target groups is equal to the number of rows in the input matrix.

In general, the overall program is divided into two parts, training and testing. Training requires the net to
read segmented input patterns. Testing requires the net to read any test character pattern, read the produced
target samples, take the majority among the samples, and find the numeric equivalent of that sample to
identify the character. RST gives very good accuracy if the characters are written on boxed sheets. In this
research, the method applied the logic of encaging the characters without using boxed sheets, but the logic
provides only static encaging. Problems in identifying the characters arise when the characters deviate fully
from their positions on the sheet. An efficient algorithm for encaging characters written at any position on
the paper is still to be explored. Generalizing across variations in the sizes of the characters in the box
also poses a problem.

Supriya Deshmukh and Leena Ragha [36] propose a method for offline isolated English character recognition. The
method is also applied to Marathi vowels. The acquired image is preprocessed to remove all unwanted details so
that it is suitable for feature extraction. Feature extraction plays an important role in handwriting
recognition. Two feature extraction methods based on directional features are considered. The first method
uses the stroke distribution of a character. The second method uses contour extraction. The two directional
features are compared separately using two different correlation techniques to check the suitability of the
recognition method. The first correlation technique calculates the dissimilarity between the reference pattern
and the test pattern, and the other calculates the similarity between the reference pattern and the test
pattern. On a hit, the comparison classifies the character under consideration into a class. On a miss, the
confusion information is extracted for analysis. They observed that the stroke length method performs well on
characters that have straight lines, while the contour method behaves well on characters with curves.

Hiromitsu N. and Takehiko T. [37] examine effective recognition techniques for deformed characters, extending
conventional recognition techniques using on-line character writing information that contains writing
pressure data. A recognition system using simple pattern matching and HMM was built for evaluation
experiments, using common hand-printed English character patterns from the ETL6 database to determine the
effectiveness of the proposed extended recognition method. Character recognition performance improved for
both extended recognition methods that use on-line writing information. On-line character writing information
comprises vector patterns of pen movement at the time of writing. Off-line character patterns are merely pixel
patterns that include no vector information. Although some off-line character recognition systems use some
correlation of stroke information [38], all proposed off-line character recognition methods use only off-line
character pattern information. However, an effective system for improving recognition performance could use
some on-line writing information for off-line recognition.
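
As a rough illustration of the two comparison styles described for [36], the following sketch classifies a
test feature vector against stored reference patterns twice: once by lowest dissimilarity and once by highest
similarity. The source does not say which correlation measures or feature dimensions were used, so the
Euclidean distance, the cosine similarity, the four-bin direction histograms, and the names dissimilarity,
similarity, classify, and REFERENCES are all assumptions made for illustration.

```python
import math


def dissimilarity(ref, test):
    """Euclidean distance between two feature vectors (lower means closer)."""
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(ref, test)))


def similarity(ref, test):
    """Cosine similarity between two feature vectors (higher means closer)."""
    dot = sum(r * t for r, t in zip(ref, test))
    norm = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(t * t for t in test))
    return dot / norm if norm else 0.0


def classify(test, references):
    """Return the best-matching label under each of the two measures."""
    by_distance = min(references, key=lambda label: dissimilarity(references[label], test))
    by_similarity = max(references, key=lambda label: similarity(references[label], test))
    return by_distance, by_similarity


# Hypothetical reference patterns: share of stroke length in four directions
# (horizontal, vertical, and the two diagonals). Values are made up.
REFERENCES = {
    "I": [0.10, 0.80, 0.05, 0.05],
    "L": [0.40, 0.50, 0.05, 0.05],
    "X": [0.00, 0.00, 0.50, 0.50],
}

sample = [0.15, 0.75, 0.05, 0.05]  # mostly vertical strokes
result = classify(sample, REFERENCES)
print(result)
```

When the two measures agree, the result can be treated as a hit; a disagreement, or a best score that is
still poor, is the kind of case in which confusion information would be worth recording.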
units. For modeling the contextual units, a state-tying process based on decision tree clustering is
introduced. Decision trees are built according to a set of expert-based questions on how characters are
written. Questions are divided into global questions, yielding larger clusters, and precise questions,
yielding smaller ones. Such clustering reduces the total number of models and Gaussian densities by 10. The
modeling is then applied to the recognition of handwritten words. The authors introduced a novel approach to
building efficient context-dependent word models based on the HMM framework. The key features of this approach
are the use of dynamic features and state-based clustering.

Salvador E.B., M.J. Castro-Bleda, Jorge Gorbe-Moya, and Francisco Z.-M. [43] proposed the use of hybrid Hidden
Markov Model (HMM)/Artificial Neural Network (ANN) models for recognizing unconstrained offline handwritten
texts. The structural part of the optical models is modeled with Markov chains, and a Multilayer Perceptron
(MLP) is used to estimate the emission probabilities. That research also presents new techniques, based on
supervised learning methods, to remove slope and slant from handwritten text and to normalize the size of text
images. Slope correction and size normalization are achieved by classifying local extrema of text contours
with Multilayer Perceptrons. Slant is also removed in a nonuniform way by using Artificial Neural Networks.
Experiments were conducted on offline handwritten text lines from the IAM database, and the recognition rates
achieved are among the best reported in the literature for the same task.

The key features of the recognition system are the novel approaches to preprocessing and recognition, which
are both based on ANNs. The preprocessing uses MLPs to clean and enhance the images; to automatically classify
local extrema in order to correct the slope and normalize the size of the text line images; and to perform a
nonuniform slant correction.

The recognition is based on hybrid optical HMM/ANN models, where an MLP is used to estimate the emission
probabilities.

Aiquan Yuan, Gang Bai, Po Yang, Yanni Guo, Xinting Zgao [44] present a novel segmentation-based,
lexicon-driven handwritten English recognition system. For segmentation, a modified rule-based online
segmentation method is applied. Then, convolutional neural networks are introduced for offline character
recognition. Experiments on the UNIPEN lowercase data sets give a word recognition rate of 92.20%. Because
that word recognition system depends on segmentation, exploring segmentation methods with better performance
is critical.
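
To make the hybrid HMM/ANN idea concrete, the sketch below shows one common way network outputs can serve as
HMM emission scores: per-frame state posteriors are divided by the state priors to give scaled likelihoods,
which are then decoded with the Viterbi algorithm. This is an illustrative assumption and not necessarily the
exact procedure of [43]. The posteriors are hard-coded stand-ins for MLP outputs, and the two-state
left-to-right model, its probabilities, and the names safe_log, scaled_log_likelihoods, and viterbi are all
hypothetical.

```python
import math


def safe_log(x):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(x) if x > 0 else float("-inf")


def scaled_log_likelihoods(posteriors, priors):
    """Turn posteriors P(state | frame) into scaled log-likelihoods.

    By Bayes' rule, p(frame | state) is proportional to
    P(state | frame) / P(state), so the ratio can replace the emission term.
    """
    return [[safe_log(p / q) for p, q in zip(frame, priors)] for frame in posteriors]


def viterbi(log_emis, log_trans, log_init):
    """Return (best log score, best state path) for one frame sequence."""
    n_states = len(log_init)
    score = [log_init[s] + log_emis[0][s] for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for frame in log_emis[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(pointers)
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


# Stand-in for MLP outputs: posteriors over two states for four frames.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
priors = [0.5, 0.5]  # assumed state priors

# Hypothetical left-to-right model: state 0 may stay or advance, state 1 stays.
log_trans = [[safe_log(0.6), safe_log(0.4)], [safe_log(0.0), safe_log(1.0)]]
log_init = [safe_log(1.0), safe_log(0.0)]

log_emis = scaled_log_likelihoods(posteriors, priors)
best_score, best_path = viterbi(log_emis, log_trans, log_init)
print(best_path, round(best_score, 3))
```

Working in the log domain avoids numerical underflow on long frame sequences, and a lexicon-driven recognizer
would run the same decoding over word models built by concatenating character models.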
