The Eighth IAPR Workshop on Document Analysis Systems
An Objective Evaluation Methodology for Document Image Binarization
Techniques
K. Ntirogiannis, B. Gatos and I. Pratikakis
Computational Intelligence Laboratory,
Institute of Informatics and Telecommunications,
National Center for Scientific Research “Demokritos”,
GR-153 10 Agia Paraskevi, Athens, Greece
http://www.iit.demokritos.gr/cil,
{kntir,bgat,ipratika}@iit.demokritos.gr
evaluators [1], [2], [3], [7], [16] while in the second
category, the binary result is subjected to OCR and the
corresponding result is evaluated with respect to
character or word accuracy [5], [6], [17], [18]. The
third category uses a binary image as ground truth
image and evaluation is performed by comparing the
two images (ground truth and binarization result
image) at pixel level [4], [19], [20], [21], [22]. In this
category, either noise is added to the ground truth
image so as to produce the grey scale image (synthetic
images), or the original grey scale image is optimally
binarized by a human, after many binarization tests and
supplementary cleaning, in order to produce a
satisfying ground truth image. The last category uses a
combination of human-oriented evaluation and OCR
results accuracy [7], [23].
Evaluation performed by a human expert is not
only subjective but also time consuming. Furthermore,
it lacks robustness since it has been observed that in
fuzzy situations, the same observer may make different
selections for the same dataset in different sessions. In
[2], the amount of symbols, text, lines, etc., that are
broken or blurred, the loss of complete objects and the
noise in both background and print are used as
evaluation criteria for a human expert. Moreover, in
[3], two widely used evaluation metrics of Precision
and Recall, are defined by the manual count of fully
restored words. OCR-based evaluation is applied only to modern machine-printed documents, supported by contemporary OCR engines (e.g. ABBYY FineReader), since OCR of handwriting yields no satisfactory results [1]. Synthetic images are widely used in the binarization evaluation process; however, they have to be carefully constructed and are usually available only in small numbers.
Abstract
Evaluation of document image binarization
techniques is a tedious task that is mainly performed
by a human expert or by involving an OCR engine.
This paper presents an objective evaluation
methodology for document image binarization
techniques that aims to reduce human involvement in the ground truth construction and subsequent testing. A skeletonized ground truth image is produced
by the user following a semi-automatic procedure. The
estimated ground truth image can aid in evaluating the
binarization result in terms of recall and precision as
well as to further analyze the result by calculating
broken and missing text, deformations and false
alarms. A detailed description of the methodology
along with a benchmarking of the six (6) most
promising state-of-the-art binarization algorithms
based on the proposed methodology is presented.
1. Introduction
Document image binarization is an important step
in the document image analysis and recognition
pipeline. The performance of a binarization technique
directly affects the recognition analysis. Therefore, it is
imperative to have an objective evaluation which will
account for the performance of the binarization.
Several efforts have been presented that strive
towards evaluating the performance of document
image binarization techniques. These efforts can be
divided in four categories. In the first category,
evaluation is performed by one or many human
978-0-7695-3337-7/08 $25.00 © 2008 IEEE
DOI 10.1109/DAS.2008.41
curve approximately in the medial axis of the
character. Our proposed skeletonized ground truth
image construction stage strives toward automating the
aforementioned procedure. To accomplish this, we
follow the consecutive stages described in the
following.
The grey scale image (Figure 2(a)) is binarized
using an adaptive binarization technique [24] (Figure
2(b)). Then, a skeletonization method [8] is used and
the resulting skeleton image has one pixel wide text
(Figure 3(a)). Due to artifacts in the character,
skeletonization does not always represent the complete
character. In this case, the user is required to delineate
the remaining character or remove spurious parts. To
further aid the user during the correction stage, we
show both the layers of the skeleton and the grey level
image (Figure 3(b)) to guide him/her in the correction
process. Finally, a second skeletonization pass,
guarantees that ground truth text is one pixel wide
(Figure 3(c)).
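As an illustration of the construction stage above, the following sketch binarizes a synthetic grey-scale patch and extracts a skeleton. It is only a minimal stand-in: a fixed global threshold replaces the adaptive technique of [24], a classic morphological skeleton (erosions and openings with SciPy) replaces the skeletonization method of [8], and the interactive correction step is omitted.

```python
import numpy as np
from scipy import ndimage

def morphological_skeleton(mask):
    """Classic morphological skeleton: union over k of erode^k(mask)
    minus its opening. A stand-in for the method of [8]."""
    struct = ndimage.generate_binary_structure(2, 1)  # 4-connected cross
    skel = np.zeros_like(mask, dtype=bool)
    eroded = mask.astype(bool)
    while eroded.any():
        opened = ndimage.binary_opening(eroded, structure=struct)
        skel |= eroded & ~opened      # pixels the opening removes
        eroded = ndimage.binary_erosion(eroded, structure=struct)
    return skel

# Synthetic grey-scale "stroke": dark bar on a light background.
grey = np.full((9, 15), 200, dtype=np.uint8)
grey[3:6, 2:13] = 40

binary = grey < 128                   # fixed threshold, stand-in for [24]
skeleton = morphological_skeleton(binary)
```

In the actual methodology the skeleton is then inspected and corrected by the user, and a second skeletonization pass guarantees one-pixel-wide text.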
In this paper, an objective evaluation methodology for document image binarization techniques is presented that avoids dependence on an OCR engine and reduces human interference, as it is based upon a semi-automated ground truth construction. Not only the evaluation metrics of precision and recall, but also broken and missing text, false alarms and deformations are calculated during the proposed evaluation procedure.
Figure 1 shows all stages of the proposed
methodology. Each stage is analyzed in the following
sections. Specifically the construction of the ground
truth image is described in Section 2. The evaluation
metrics of recall and precision are detailed in Sections
3 and 4, respectively. In Section 5, the evaluation results of representative state-of-the-art binarization techniques are presented and in Section 6 the
conclusions are discussed.
Figure 2. (a) Original image and (b) the corresponding
binary image after adaptive binarization
Figure 3. (a) Binary image after skeletonization (skeleton image); (b) simultaneous viewing of the skeleton and grey level image layers; (c) skeletonized ground truth image whose text is one pixel wide
Figure 1. The proposed methodology stages
The skeletonized ground truth image and the
corresponding connected component labeled image are
defined by the following equations:
SG(x, y) = \begin{cases} 0, & \text{background} \\ 1, & \text{text} \end{cases} \quad (1)
2. Construction of Ground Truth Image Stage
In the proposed methodology, the construction of
ground truth plays an important role, since it aids
towards the automation of the evaluation procedure. It
consists of two distinct stages, namely Skeletonized
Ground Truth (SG) stage and Estimated Ground Truth
(EG) stage. These stages will be described in detail, in
the following sections.
2.1. Skeletonized Ground Truth Image Stage
SG_C(x, y) = \begin{cases} 0, & \text{if } SG(x, y) = 0 \\ i, & \text{otherwise} \end{cases} \quad (2)
where i ∈ K, K = {1, 2, …, M} and M denotes the maximum number of connected components found in the ground truth image.
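Equations 1 and 2 amount to connected-component labeling of the skeleton image. A minimal sketch with SciPy follows; the 8-connectivity choice is an assumption, since the text does not state which connectivity is used:

```python
import numpy as np
from scipy import ndimage

# SG: skeletonized ground truth (Eq. 1) with two separate strokes.
SG = np.zeros((7, 12), dtype=np.uint8)
SG[2, 1:5] = 1
SG[4, 7:11] = 1

# SGC (Eq. 2): 0 on background, component index i in {1, ..., M} on text.
struct = np.ones((3, 3), dtype=bool)   # 8-connectivity (assumed)
SGC, M = ndimage.label(SG, structure=struct)
```

Here `M` is the number of components found, matching the definition K = {1, 2, …, M}.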
After the end of the skeletonized ground truth
construction stage, we are able to automatically
measure the performance of any binarization algorithm
in terms of recall (see Section 3).
The idea of building a skeletonized ground truth
image originates from a user’s natural interaction, in
which a character is presented by its silhouette. For
this task, a user would directly draw a one pixel wide
2.2. Estimated Ground Truth Image Stage

To complete the evaluation process, we should calculate the performance of binarization algorithms in terms of precision (see Section 4). Precision requires ground truth characters that are as close as possible to the original ones. In this paper, we present a methodology to automatically estimate the ground truth for the computation of precision, taking into account that a skeletonized ground truth image has already been obtained.

Given the skeletonized ground truth image, we apply a dilation constrained by the edge image CE (Figure 4(a)) and the binary image B under evaluation (Figure 4(b)), where

CE(x, y) = \begin{cases} 1, & \text{if } (x, y) \in \partial I(x, y) \\ 0, & \text{otherwise} \end{cases} \quad (3)

B(x, y) = \begin{cases} 0, & \text{background} \\ 1, & \text{text} \end{cases} \quad (4)

and ∂I(x, y) denotes the resulting image after Canny edge detection [9] on the original grey scale image I(x, y).

Specifically, as far as a connected component (cc_i) of the binary image B is concerned (Figure 5(a)), the corresponding skeletonized ground truth component or components (Figure 5(b)) start to dilate. As soon as half of the edges that belong to cc_i are covered by the dilated skeletonized ground truth components (Figures 5(c) and 5(d)), dilation stops. The dilated components represent the estimated ground truth components (Figure 5(f)). The construction of the estimated ground truth image requires the aforementioned procedure to be applied to every connected component of the binary image B. It is worth mentioning that the dilated components cannot exceed cc_i, i.e. a dilated component cannot be larger than the original one.

For the sake of clarity, we provide the following definitions, along with the algorithm for the estimated ground truth image stage in pseudo code. Let A be a binary image and A^r denote the corresponding dilated image after r iterations. Let BC denote the connected component labeled image of B, defined as:

BC(x, y) = \begin{cases} 0, & \text{if } B(x, y) = 0 \\ j, & \text{otherwise} \end{cases} \quad (5)

where j ∈ L, L = {1, 2, …, N} and N denotes the maximum number of connected components found in the binarized image.
Algorithm description (pseudo code)

1. A^0(x, y) = SG(x, y)
2. ∀(x, y) ∈ B(x, y) : BC(x, y) = i AND Σ_{BC(x,y)=i} B(x, y) · SG(x, y) > 0
3. Stop = false
4. while (NOT(Stop))
5.     A^{r+1}(x, y) = (A^r(x, y) ⊕ B(x, y)) ∩ B(x, y)
6.     if ( Σ_{BC(x,y)=i} A^{r+1}(x, y) · CE(x, y) ) / ( Σ_{BC(x,y)=i} CE(x, y) ) > 1/2 OR A^{r+1}(x, y) = A^r(x, y)
7.         Stop = true
8.     end if
9. end while

where i ∈ L and ⊕ denotes dilation.

Figure 4. (a) Edges of original image; (b) binary image under evaluation and (c) estimated ground truth image (in grey)

Figure 5. (a) A connected component of the binary image B; (b) the corresponding edges (in black) along with the skeletonized ground truth SG components (in grey); (c) the dilated SG components after one dilation (in grey), 15.29% of the edges covered; (d) the dilated SG components after two dilations (in grey), 54.9% of the edges covered, dilation stops; (e) the original image and (f) the estimated ground truth image (in grey) placed over the binary image (in black)

The aforementioned procedure, described by the pseudo code, is applied to each connected component of the binary image B, and the estimated ground truth image EG is fully constructed, as defined in Eq. 6.
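A runnable sketch of the constrained dilation described above, assuming NumPy/SciPy and a 3×3 structuring element for the dilation (the structuring element is not specified in the text): per component cc_i, the seed SG ∩ cc_i is dilated inside cc_i until more than half of the component's edge pixels are covered or the dilation no longer grows.

```python
import numpy as np
from scipy import ndimage

def estimated_ground_truth(SG, B, CE):
    """Constrained dilation of the skeletonized ground truth SG inside
    each connected component of the binary image B, stopped by the
    half-edge-coverage rule on the edge image CE."""
    struct = np.ones((3, 3), dtype=bool)        # assumed structuring element
    BC, N = ndimage.label(B, structure=struct)  # Eq. 5
    EG = np.zeros_like(BC, dtype=bool)
    for i in range(1, N + 1):
        cc = BC == i
        A = SG.astype(bool) & cc                # A^0 restricted to cc_i
        if not A.any():
            continue                            # component has no GT seed
        edges = CE.astype(bool) & cc
        total = edges.sum()
        while True:
            A_next = ndimage.binary_dilation(A, structure=struct) & cc
            covered = (A_next & edges).sum()
            stop = (total > 0 and 2 * covered > total) or np.array_equal(A_next, A)
            A = A_next
            if stop:
                break
        EG |= A                                 # estimated GT component
    return EG

# Toy example: a thick bar B, its 1-pixel skeleton SG, and its border as CE.
B = np.zeros((9, 15), dtype=np.uint8)
B[2:7, 2:13] = 1
SG = np.zeros_like(B)
SG[4, 3:12] = 1
inner = ndimage.binary_erosion(B.astype(bool))
CE = B.astype(bool) & ~inner                    # border pixels as "edges"
EG = estimated_ground_truth(SG, B, CE)
```

By construction EG stays inside B and contains the seed, matching the remark that a dilated component cannot be larger than the original one.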
Example estimated ground truth images are shown in Figures 4(c) and 5(f). The estimated ground truth image EG is defined as:

EG(x, y) = \begin{cases} 0, & \text{background} \\ 1, & \text{text} \end{cases} \quad (6)

3. Recall

Recall is defined as the percentage of the skeletonized ground truth image SG that is detected in the resulting binary image B. Recall is given by the following equation:

\mathrm{Recall} = \frac{\sum_{x=1, y=1}^{I_x, I_y} SG(x, y) \cdot B(x, y)}{\sum_{x=1, y=1}^{I_x, I_y} SG(x, y)} \cdot 100\% \quad (7)

The parts that belong to SG and are not detected can be classified as broken or missing text. Broken text is related to SG components that are partially detected, while missing text denotes all SG components that have not been detected at all (Figures 6(d), 6(e) and 6(f)). In the following sections, we further analyze the recall result by calculating broken and missing text information.

Figure 6. (a) Original grey scale image; (b) resulting binary image B; (c) skeletonized ground truth SG image overlay; (d) broken [B] and missing [M] text parts; (e) broken parts (in black) indicated in circles and (f) missing part indicated in circle

3.1. Broken Text

Let f() be the function that defines whether part of a skeletonized ground truth component is partially detected in the binary image B. f() is calculated as follows:

f(i) = \begin{cases} 1, & \text{if } \sum_{x=1, y=1, \, SG_C(x, y)=i}^{I_x, I_y} SG(x, y) \cdot B(x, y) > 0 \\ 0, & \text{otherwise} \end{cases} \quad (8)

where i ∈ K.

Broken text (BT) can be defined as the percentage of the skeletonized ground truth image SG parts which are not detected in the resulting binary image B while belonging to components that are partially detected (Figures 6(d) and 6(e)). Broken text is given by the following equation:

\mathrm{BT} = \frac{\sum_{x=1, y=1}^{I_x, I_y} f(SG_C(x, y)) \cdot (1 - B(x, y))}{\sum_{x=1, y=1}^{I_x, I_y} SG(x, y)} \cdot 100\% \quad (9)

3.2. Missing Text

Missing text (MT) is defined as the percentage of the skeletonized ground truth image SG parts which are not detected in the resulting binary image B while belonging to components that are not detected at all (Figures 6(d) and 6(f)). Missing text is given by the following equation:

\mathrm{MT} = \frac{\sum_{x=1, y=1}^{I_x, I_y} (1 - f(SG_C(x, y))) \cdot SG(x, y)}{\sum_{x=1, y=1}^{I_x, I_y} SG(x, y)} \cdot 100\% \quad (10)

4. Precision

Considering the binary image B, precision estimates the foreground areas that are actually text. In our method, the actual text is considered to be the estimated ground truth image EG, as described in Section 2. Precision is defined as the percentage of the estimated ground truth image that is detected in the binary image (Figure 7(c)):

\mathrm{Precision} = \frac{\sum_{x=1, y=1}^{I_x, I_y} EG(x, y) \cdot B(x, y)}{\sum_{x=1, y=1}^{I_x, I_y} B(x, y)} \cdot 100\% \quad (11)

The foreground areas of the binary image that are not detected during precision estimation are considered
as either false alarms or deformations, which are described in the following sections.

4.1. False Alarms

False alarms refer to foreground pixels of the binary image B that do not belong to the estimated ground truth image (Figure 7(f)). They are defined as the percentage of all pixels of the components of the binary image B that do not have any corresponding pixel in the skeletonized ground truth image SG:

\mathrm{FAlarms} = \frac{\sum_{x=1, y=1}^{I_x, I_y} h(BC(x, y)) \cdot B(x, y)}{\sum_{x=1, y=1}^{I_x, I_y} B(x, y)} \cdot 100\% \quad (12)

where h(i) is a function denoting whether a binary component is not detected in the skeletonized ground truth image:

h(i) = \begin{cases} 1, & \text{if } \sum_{x=1, y=1, \, BC(x, y)=i}^{I_x, I_y} SG(x, y) \cdot B(x, y) = 0 \\ 0, & \text{otherwise} \end{cases} \quad (13)

where i ∈ L.

4.2. Deformations

Components often merge with adjacent background information that was detected as text during binarization. Deformations not only enlarge (deform) components but are also responsible for merging adjacent components (Figures 7(d) and 7(e)). In our method, deformations of the binary image are defined as the percentage of all text pixels of the binary image B that are not detected in the estimated ground truth image EG and do not belong to false alarm components, as described in the previous section. The deformation of the binary image is defined by Eq. 14, while deformation leading to merging is defined by Eq. 15.

\mathrm{Deform} = \frac{\sum_{x=1, y=1}^{I_x, I_y} d(BC(x, y)) \cdot B(x, y) \cdot (1 - EG(x, y))}{\sum_{x=1, y=1}^{I_x, I_y} B(x, y)} \cdot 100\% \quad (14)

\mathrm{MergeDeform} = \frac{\sum_{x=1, y=1}^{I_x, I_y} m(BC(x, y)) \cdot B(x, y) \cdot (1 - EG(x, y))}{\sum_{x=1, y=1}^{I_x, I_y} B(x, y)} \cdot 100\% \quad (15)

where d(i) and m(i) are functions denoting whether a binary component corresponds to exactly one or to more than one ground truth components, respectively. They are defined as follows:

d(i) = \begin{cases} 1, & \text{if } |SG_C(x, y)| = 1, \ \forall (x, y) : BC(x, y) = i \\ 0, & \text{otherwise} \end{cases} \quad (16)

m(i) = \begin{cases} 1, & \text{if } |SG_C(x, y)| > 1, \ \forall (x, y) : BC(x, y) = i \\ 0, & \text{otherwise} \end{cases} \quad (17)

where i ∈ L and |SG_C(x, y)| denotes the cardinality of SG_C(x, y).

Figure 7. (a) Original grey scale image; (b) resulting binary image B; (c) estimated ground truth image (in grey) and false alarms along with deformations (in black); (d) deformations that do not result in component merging (in black); (e) deformations that do result in component merging (in black) and (f) false alarms (in black).

5. Experimental Results

The proposed objective evaluation methodology for document image binarization techniques was applied on a set of grey scale handwritten and machine-printed documents with low quality, shadows, non-uniform illumination, stains, characters showing through from the other side of the page and other significant artifacts. Among all documents, we selected the twenty (20) most representative (ten handwritten and ten machine-printed) and marked the skeletonized ground truth SG following the procedure described in Section 2. Six (6) of the most promising global and adaptive binarization techniques were chosen for evaluation:

1. Otsu's method (OTS) [10]
2. Bernsen's method (BER) [11]
3. Niblack's method (NIB) [12]
4. Sauvola's method (SAU) [13]
5. Adaptive Logical method (AL) [14]
6. Adaptive Degraded Document method (GPP) [15]
Our evaluation is based on the F-measure, which is defined as follows:

F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \quad (18)
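A minimal pixel-level sketch of Eqs. 7, 11 and 18, assuming NumPy boolean-convertible arrays for SG, EG and B (broken/missing text and false alarms follow the same masking pattern):

```python
import numpy as np

def recall_pct(SG, B):
    """Eq. 7: percentage of SG pixels detected in B."""
    SG, B = SG.astype(bool), B.astype(bool)
    return 100.0 * (SG & B).sum() / SG.sum()

def precision_pct(EG, B):
    """Eq. 11: percentage of B's foreground covered by EG."""
    EG, B = EG.astype(bool), B.astype(bool)
    return 100.0 * (EG & B).sum() / B.sum()

def f_measure(precision, recall):
    """Eq. 18: harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# Toy case: SG is the skeleton of a bar, EG equals the binarized bar B.
B = np.zeros((7, 11), dtype=np.uint8)
B[2:5, 1:10] = 1
SG = np.zeros_like(B)
SG[3, 2:9] = 1
EG = B.copy()

r = recall_pct(SG, B)      # every skeleton pixel lies inside B
p = precision_pct(EG, B)   # EG coincides with B
f = f_measure(p, r)
```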
An example of the application of all methodologies
to a grey scale handwritten document image is given in
Figure 8 where the skeletonized ground truth image
SG is also demonstrated. The corresponding evaluation
results are shown in Table 1. Likewise, in Figure 9, an
example of the application of all methodologies to a
grey scale machine-printed document image is given
and Table 2 demonstrates the corresponding evaluation
results.
The evaluation results, averaged over all handwritten and over all machine-printed test images, are presented in Tables 3 and 4, respectively. According to these results, the Adaptive Degraded Document method (GPP) [15] had the best overall performance for both handwritten and machine-printed document images, with F-measures of 85.23% and 93.19%, respectively. As far as the handwritten images are concerned, GPP performed slightly better than Sauvola's method (SAU) [13], whose F-measure is 84.61%, while in the case of machine-printed images, Otsu's method (OTS) [10] was second with an F-measure of 93.10%.
A more extensive analysis of the evaluation results concerning all twenty (20) test images (Table 5) shows that the Adaptive Logical method (AL) [14] has the best precision (88.44%) (Table 5) and the highest broken text value (Tables 3 and 4), which leads to the worst recall (87.55%) (Table 5). This behavior is expected, since AL is based on stroke width estimation. Moreover, in handwritten documents, which have characters of nearly the same width, AL indeed has the best precision (88.83%) (Table 3), while in machine-printed images, where characters of different widths may exist, AL has the fourth best precision (88.04%) (Table 4). On the other hand, Niblack's (NIB) [12] and Otsu's (OTS) [10] methods have the best and second best recall rates of 99.15% and 93.90%, respectively. However, they have the lowest and second lowest precision rates of 47.13% and 81.01%, respectively. In other words, these two methods preserve the character information, but they cannot effectively remove the background noise, especially in cases where background and characters do not differ significantly. Bernsen's method (BER) [11] is slightly better than OTS (Table 5), even though it has more broken text and false alarms (Tables 3 and 4).
The overall ranking presented in Table 5 is corroborated by other works, [4] and [15]. In the first case of [4], where a binary ground truth image and pixel accuracy were also used, SAU, BER, OTS and NIB were ranked in the same order on the document test images. In the second case of [15], where OCR word accuracy was used, GPP, SAU, OTS and NIB were also ranked in the same order as by the proposed methodology. The AL method was not included in either of these works.
Figure 8. (a) Original handwritten image; (b) skeletonized ground truth image; (c) GPP binarization; (d) AL binarization; (e) BER binarization; (f) NIB binarization; (g) OTS binarization and (h) SAU binarization.

Table 1. All evaluation metrics for every binarization technique concerning the handwritten image of Figure 8.

Figure 8       GPP     AL      BER     NIB     OTS     SAU
F-measure      93.25   87.92   90.39   67.95   91.23   92.84
Recall         96.27   88.74   96.90   99.56   98.58   96.36
Precision      90.42   87.11   84.69   51.58   84.91   89.56
Broken Text    03.73   11.09   02.93   00.44   01.25   03.64
Missing Text   00.00   00.17   00.17   00.00   00.17   00.00
MergeDeform    00.71   00.14   00.19   07.84   04.81   00.74
Deform         06.81   09.30   11.54   08.39   07.24   06.81
False Alarms   02.06   03.45   03.57   32.19   02.84   02.89
Table 3. The average value of all evaluation metrics for every binarization technique concerning all handwritten images.

Hand/ten       GPP     AL      BER     NIB     OTS     SAU
F-measure      85.23   81.99   77.70   50.87   77.02   84.61
Recall         85.36   79.82   87.83   98.47   89.40   86.51
Precision      87.98   88.83   75.86   35.42   73.59   85.89
Broken Text    14.04   19.53   11.90   01.42   10.30   12.89
Missing Text   00.60   00.65   00.27   00.01   00.30   00.60
MergeDeform    01.05   00.29   13.42   12.93   15.36   00.61
Deform         10.02   08.94   06.89   09.14   08.96   10.22
False Alarms   00.95   01.94   03.83   42.51   02.09   03.28
Table 4. The average value of all evaluation metrics for every binarization technique concerning all machine-printed images.

Printed        GPP     AL      BER     NIB     OTS     SAU
F-measure      93.19   91.39   92.74   72.41   93.10   92.30
Recall         98.33   95.27   97.16   99.82   98.40   98.64
Precision      88.64   88.04   88.95   58.84   88.42   86.92
Broken Text    01.66   04.72   02.83   00.18   01.60   01.36
Missing Text   00.01   00.01   00.01   00.00   00.00   00.00
MergeDeform    00.30   00.09   00.14   01.82   00.45   00.46
Deform         09.82   09.04   08.16   09.21   10.03   09.81
False Alarms   01.24   02.83   02.75   30.13   01.10   02.81
Figure 9. (a) Original machine-printed image; (b) skeletonized ground truth image; (c) GPP binarization; (d) AL binarization; (e) BER binarization; (f) NIB binarization; (g) OTS binarization and (h) SAU binarization.

Table 2. All evaluation metrics for every binarization technique concerning the machine-printed image of Figure 9.

Figure 9       GPP     AL      BER     NIB     OTS     SAU
F-measure      93.42   91.67   93.41   46.00   91.59   92.15
Recall         99.24   98.35   98.88   99.91   99.93   99.29
Precision      88.24   85.85   88.52   29.88   84.53   85.96
Broken Text    00.76   01.64   01.12   00.09   00.07   00.71
Missing Text   00.00   00.01   00.00   00.00   00.00   00.00
MergeDeform    01.14   00.41   01.03   02.50   02.22   01.50
Deform         09.65   10.03   08.72   04.84   12.37   09.07
False Alarms   00.97   03.71   01.73   62.78   00.88   02.87

Table 5. The average value of F-measure, Recall and Precision, for every binarization technique concerning all test images.

All            GPP     AL      BER     NIB     OTS     SAU
F-measure      89.21   86.69   85.22   61.64   85.06   88.46
Recall         91.85   87.55   92.50   99.15   93.90   92.58
Precision      88.31   88.44   82.41   47.13   81.01   86.41

6. Conclusion

This work is based on a semi-automatic procedure for the construction of the ground truth, as well as a fully automated evaluation scheme. The evaluation results of the proposed methodology are extensively analyzed and corroborated by other works, [4] and [15], as presented in the previous section. However, different binary images are likely to be evaluated against different ground truth images, since the ground truth image is constructed in relation to the corresponding binary image B and the edge image CE. Even so, the differences between the ground truth images cannot seriously affect the evaluation results.

In future work, we will focus on the development of a more accurate technique for the estimation of
the ground truth image EG, which will not depend on the binary image B under evaluation.

Acknowledgements

The research leading to these results has received funding from the European Community's Seventh Framework Programme under grant agreement n° 215064 (project IMPACT).

References

[1] E. Kavallieratou and S. Stathis, "Adaptive Binarization of Historical Document Images", 18th International Conference on Pattern Recognition (ICPR '06), Hong Kong, China, 2006, vol. 3, pp. 742-745.
[2] D. Trier and T. Taxt, "Evaluation of Binarization Methods for Document Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, March 1995, vol. 17, no. 3, pp. 312-315.
[3] Q. Wang and C. L. Tan, "Matching of Double Sided Document Images to Remove Interference", IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '01), Kauai, Hawaii, USA, 2001, vol. 1, p. 1084.
[4] M. Sezgin and B. Sankur, "Survey over Image Thresholding Techniques and Quantitative Performance Evaluation", Journal of Electronic Imaging, January 2004, vol. 13, no. 1, pp. 146-168.
[5] J. He, Q. D. M. Do, A. C. Downton, and J. H. Kim, "A Comparison of Binarization Methods for Historical Archive Documents", Proceedings of the 8th International Conference on Document Analysis and Recognition (ICDAR '05), Seoul, South Korea, August-September 2005, vol. 1, pp. 538-542.
[6] Y. Zhu, C. Wang, and R. Dai, "Document Image Binarization Based on Stroke Enhancement", Proceedings of the 18th International Conference on Pattern Recognition (ICPR '06), Hong Kong, China, August 2006, vol. 1, pp. 955-958.
[7] F. Chang, K. H. Liang, T. M. Tan, and W. L. Hwang, "Binarization of Document Images Using Hadamard Multiresolution Analysis", Proceedings of the 5th International Conference on Document Analysis and Recognition (ICDAR '99), Bangalore, India, September 1999, pp. 157-160.
[8] H. J. Lee and B. Chen, "Recognition of Handwritten Chinese Characters via Short Line Segments", Pattern Recognition, May 1992, vol. 25, no. 5, pp. 543-552.
[9] J. Canny, "A Computational Approach to Edge Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, November 1986, vol. 8, no. 6, pp. 679-698.
[10] N. Otsu, "A Threshold Selection Method from Gray-Level Histograms", IEEE Transactions on Systems, Man and Cybernetics, March 1979, vol. 9, pp. 62-66.
[11] J. Bernsen, "Dynamic Thresholding of Grey-level Images", Proceedings of the 8th International Conference on Pattern Recognition, Paris, France, 1986, pp. 1251-1255.
[12] W. Niblack, "An Introduction to Digital Image Processing", Prentice-Hall, Englewood Cliffs, NJ, 1986, pp. 115-116.
[13] J. Sauvola and M. Pietikainen, "Adaptive Document Image Binarization", Pattern Recognition, 2000, vol. 33, no. 2, pp. 225-236.
[14] Y. Yang and H. Yan, "An Adaptive Logical Method for Binarization of Degraded Document Images", Pattern Recognition, 2000, vol. 33, pp. 787-807.
[15] B. Gatos, I. Pratikakis, and S. J. Perantonis, "Adaptive Degraded Document Image Binarization", Pattern Recognition, March 2006, vol. 39, no. 3, pp. 317-327.
[16] B. H. Kang, G. S. Han, H. G. Kim, J. S. Kim, C. R. Yoon, and M. S. Cho, "Fuzzy Inference and Logical Level Methods for Binary Graphic/Character Image Extraction", 1998 IEEE International Conference on Systems, Man and Cybernetics, San Diego, California, USA, October 1998, vol. 5, pp. 4626-4629.
[17] C. Wolf, J. M. Jolion, and F. Chassaing, "Text Localization, Enhancement and Binarization in Multimedia Documents", Proceedings of the 16th International Conference on Pattern Recognition (ICPR '02), Quebec City, Canada, August 2002, vol. 2, pp. 1037-1040.
[18] H. Cao and V. Govindaraju, "Handwritten Carbon Form Preprocessing Based on Markov Random Field", IEEE Conference on Computer Vision and Pattern Recognition (CVPR '07), USA, June 2007, pp. 1-7.
[19] C. Bastos, C. Mello, J. Andrade, D. Falcão, M. Lima, W. Santos, and A. Oliveira, "Thresholding Images of Historical Documents with Back-to-Front Interference based on Color Quantization by Genetic Algorithms", 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI '07), Patras, Greece, October 2007, vol. 1, pp. 488-491.
[20] H. Kohmura and T. Wakahara, "Determining Optimal Filters for Binarization of Degraded Characters in Color Using Genetic Algorithms", 18th International Conference on Pattern Recognition (ICPR '06), Hong Kong, August 2006, vol. 3, pp. 661-664.
[21] T. Obafemi-Ajayi, G. Agam, and O. Frieder, "Ensemble LUT Classification for Degraded Document Enhancement", Proceedings of SPIE, 15th Document Recognition and Retrieval Conference, San Jose, California, USA, January 2008, vol. 6815.
[22] L. Y. Fan, C. L. Tan, and L. X. Fan, "Edge-preserving Prefiltering for Document Image Binarization", International Conference on Image Processing, Greece, October 2001, vol. 1, pp. 1070-1073.
[23] V. Wu, R. Manmatha, and E. M. Riseman, "Textfinder: An Automatic System to Detect and Recognize Text in Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, November 1999, vol. 21, no. 11, pp. 1224-1229.
[24] M. Kamel and A. Zhao, "Extraction of Binary Character/Graphics Images from Grayscale Document Images", CVGIP: Computer Vision Graphics and Image Processing, May 1993, vol. 55, no. 3, pp. 203-217.