An Objective Evaluation Methodology for Document Image Binarization Techniques

The Eighth IAPR International Workshop on Document Analysis Systems, 2008

K. Ntirogiannis, B. Gatos and I. Pratikakis
Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, GR-153 10 Agia Paraskevi, Athens, Greece
http://www.iit.demokritos.gr/cil, {kntir,bgat,ipratika}@iit.demokritos.gr

978-0-7695-3337-7/08 $25.00 © 2008 IEEE. DOI 10.1109/DAS.2008.41

Abstract

Evaluation of document image binarization techniques is a tedious task that is mainly performed by a human expert or by involving an OCR engine. This paper presents an objective evaluation methodology for document image binarization techniques that aims to reduce the human involvement in the ground truth construction and subsequent testing. A skeletonized ground truth image is produced by the user following a semi-automatic procedure. The estimated ground truth image can aid in evaluating the binarization result in terms of recall and precision, as well as in further analyzing the result by calculating broken and missing text, deformations and false alarms. A detailed description of the methodology is presented, along with a benchmarking of the six (6) most promising state-of-the-art binarization algorithms based on the proposed methodology.

1. Introduction

Document image binarization is an important step in the document image analysis and recognition pipeline. The performance of a binarization technique directly affects the recognition analysis. Therefore, it is imperative to have an objective evaluation which will account for the performance of the binarization.

Several efforts have been presented that strive towards evaluating the performance of document image binarization techniques. These efforts can be divided into four categories. In the first category, evaluation is performed by one or many human evaluators [1], [2], [3], [7], [16], while in the second category, the binary result is subjected to OCR and the corresponding result is evaluated with respect to character or word accuracy [5], [6], [17], [18]. The third category uses a binary image as ground truth image and evaluation is performed by comparing the two images (ground truth and binarization result image) at pixel level [4], [19], [20], [21], [22]. In this category, either noise is added to the ground truth image so as to produce the grey scale image (synthetic images), or the original grey scale image is optimally binarized by a human, after many binarization tests and supplementary cleaning, in order to produce a satisfying ground truth image. The last category uses a combination of human-oriented evaluation and OCR results accuracy [7], [23].

Evaluation performed by a human expert is not only subjective but also time consuming. Furthermore, it lacks robustness, since it has been observed that in fuzzy situations the same observer may make different selections for the same dataset in different sessions. In [2], the amount of symbols, text, lines, etc., that are broken or blurred, the loss of complete objects and the noise in both background and print are used as evaluation criteria for a human expert. Moreover, in [3], the two widely used evaluation metrics of Precision and Recall are defined by the manual count of fully restored words.

OCR is widely used as a means of evaluation only for modern machine-printed documents, which are supported by contemporary OCR engines (e.g. ABBYY FineReader), since handwritten OCR yields no satisfactory results [1]. Synthetic images are widely used in the binarization evaluation process; however, they have to be carefully constructed and they are usually few in number.

In this paper, an objective evaluation methodology for document image binarization techniques is presented that avoids the dependence on an OCR engine and reduces the human interference, as it is based upon a semi-automated ground truth construction. Not only the evaluation metrics of precision and recall, but also broken and missing text, false alarms and deformations are calculated during the proposed evaluation procedure. Figure 1 shows all stages of the proposed methodology. Each stage is analyzed in the following sections. Specifically, the construction of the ground truth image is described in Section 2. The evaluation metrics of recall and precision are detailed in Sections 3 and 4 respectively. In Section 5 the evaluation results of representative state-of-the-art binarization techniques are presented and in Section 6 the conclusions are discussed.

Figure 1. The proposed methodology stages.

2. Construction of Ground Truth Image Stage

In the proposed methodology, the construction of ground truth plays an important role, since it aids towards the automation of the evaluation procedure. It consists of two distinct stages, namely the Skeletonized Ground Truth (SG) stage and the Estimated Ground Truth (EG) stage. These stages are described in detail in the following sections.

2.1. Skeletonized Ground Truth Image Stage

The idea of building a skeletonized ground truth image originates from a user’s natural interaction, in which a character is presented by its silhouette. For this task, a user would directly draw a one pixel wide curve approximately in the medial axis of the character.

Our proposed skeletonized ground truth image construction stage strives toward automating the aforementioned procedure. To accomplish this, we follow the consecutive stages described in the following. The grey scale image (Figure 2(a)) is binarized using an adaptive binarization technique [24] (Figure 2(b)). Then, a skeletonization method [8] is used and the resulting skeleton image has one pixel wide text (Figure 3(a)). Due to artifacts in the character, skeletonization does not always represent the complete character. In this case, the user is required to delineate the remaining character or remove spurious parts. To further aid the user during the correction stage, we show both the layers of the skeleton and the grey level image (Figure 3(b)) to guide him/her in the correction process. Finally, a second skeletonization pass guarantees that the ground truth text is one pixel wide (Figure 3(c)).

Figure 2. (a) Original image and (b) the corresponding binary image after adaptive binarization.

Figure 3. (a) Binary image after skeletonization (skeleton image); (b) simultaneous viewing of skeleton and grey level image layer; (c) skeletonized ground truth image in which text is one pixel wide.

The skeletonized ground truth image and the corresponding connected component labeled image are defined by the following equations:

\[ SG(x,y) = \begin{cases} 0, & \text{background} \\ 1, & \text{text} \end{cases} \tag{1} \]

\[ SGC(x,y) = \begin{cases} 0, & \text{if } SG(x,y) = 0 \\ i, & \text{otherwise} \end{cases} \tag{2} \]

where \(i \in K\), \(K = \{1, 2, \ldots, M\}\) and M denotes the maximum number of the connected components found in the ground truth image.

After the end of the skeletonized ground truth construction stage, we are able to automatically measure the performance of any binarization algorithm in terms of recall (see Section 3).

2.2. Estimated Ground Truth Image Stage

To complete the evaluation process we should calculate the performance of binarization algorithms in terms of precision (see Section 4). Precision requires ground truth characters that are as close as possible to the original ones. In this paper, we present a methodology to automatically estimate the ground truth for the computation of precision, given that a skeletonized ground truth image is already available.

Given the skeletonized ground truth image, we apply a dilation constrained by the edge image CE (Figure 4(a)) and the binary image B under evaluation (Figure 4(b)), where

\[ CE(x,y) = \begin{cases} 1, & \text{if } (x,y) \in \partial I(x,y) \\ 0, & \text{otherwise} \end{cases} \tag{3} \]

\[ B(x,y) = \begin{cases} 0, & \text{background} \\ 1, & \text{text} \end{cases} \tag{4} \]

where \(\partial I(x,y)\) denotes the resulting image after Canny edge detection [9] on the original grey scale image I(x,y).

Figure 4. (a) Edges of original image; (b) binary image under evaluation and (c) estimated ground truth image (in grey).

Specifically, as far as a connected component (\(cc_i\)) of the binary image B is concerned (Figure 5(a)), the corresponding skeletonized ground truth component or components (Figure 5(b)) start to dilate. As soon as half of the edges that belong to \(cc_i\) are covered by the dilated skeletonized ground truth components (Figures 5(c) and 5(d)), dilation stops. The dilated components represent the estimated ground truth components (Figure 5(f)). The construction of the estimated ground truth image requires the aforementioned procedure to be applied to every connected component of the binary image B. It is worth mentioning that dilated components cannot exceed \(cc_i\), meaning that a dilated component cannot be larger than the original one.

Figure 5. (a) A connected component of the binary image B; (b) the corresponding edges (in black) along with the skeletonized ground truth SG components (in grey); (c) the dilated SG components after one dilation (in grey), 15.29% of the edges covered; (d) the dilated SG components after two dilations (in grey), 54.9% of the edges covered, dilation stops; (e) the original image and (f) the estimated ground truth image (in grey) placed over the binary image (in black).

For the sake of clarity, we provide the following definitions, along with the algorithm for the estimated ground truth image stage in pseudo code. Let A be a binary image and let \(A^{r}\) denote the corresponding dilated image after r iterations. Let BC denote the connected component labeled image of B, defined as:

\[ BC(x,y) = \begin{cases} 0, & \text{if } B(x,y) = 0 \\ j, & \text{otherwise} \end{cases} \tag{5} \]

where \(j \in L\), \(L = \{1, 2, \ldots, N\}\) and N denotes the maximum number of the connected components found in the binarized image.

Algorithm description (pseudo code)

1. \(A^{0}(x,y) = SG(x,y)\),
2. \(\forall (x,y) \in B(x,y) : BC(x,y) = i\) AND \(\sum_{BC(x,y)=i} B(x,y) \cdot SG(x,y) > 0\)
3. Stop = false
4. while (NOT(Stop))
5.     \(A^{r+1}(x,y) = (A^{r}(x,y) \oplus B(x,y)) \cap B(x,y)\)
6.     if \(\dfrac{\sum_{BC(x,y)=i} A^{r+1}(x,y) \cdot CE(x,y)}{\sum_{BC(x,y)=i} CE(x,y)} > \dfrac{1}{2}\) OR \(A^{r+1}(x,y) = A^{r}(x,y)\)
7.         Stop = true
8.     End if
9. End while

where \(i \in L\) and \(\oplus\) denotes dilation.

The procedure described by the pseudo code is applied to each connected component of the binary image B, after which the estimated ground truth image EG is fully constructed, as defined in Eq. 6. Example estimated ground truth images are shown in Figures 4(c) and 5(f).

\[ EG(x,y) = \begin{cases} 0, & \text{background} \\ 1, & \text{text} \end{cases} \tag{6} \]

3. Recall

Recall is defined as the percentage of the skeletonized ground truth image SG that is detected in the resulting binary image B. Recall is given by the following equation, in which \(I_x\) and \(I_y\) are the image dimensions:

\[ \mathrm{Recall} = \frac{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} SG(x,y) \cdot B(x,y)}{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} SG(x,y)} \, 100\% \tag{7} \]

The parts of SG that are not detected can be classified as broken or missing text. Broken text is related to SG components that are partially detected, while missing text denotes all SG components that have not been detected at all (Figures 6(d), 6(e) and 6(f)). In the following sections, we further analyze the recall result by calculating broken and missing text information.

Figure 6. (a) Original grey scale image; (b) resulting binary image B; (c) skeletonized ground truth SG image overlay; (d) broken [B] and missing [M] text parts; (e) broken parts (in black) indicated in circles and (f) missing part indicated in circle.

3.1. Broken Text

Let f() be the function that defines whether a skeletonized ground truth component is at least partially detected in the binary image B. f() is calculated as follows:

\[ f(i) = \begin{cases} 1, & \text{if } \displaystyle\sum_{\substack{x=1,\,y=1 \\ SGC(x,y)=i}}^{x=I_x,\,y=I_y} SG(x,y) \cdot B(x,y) > 0 \\ 0, & \text{otherwise} \end{cases} \tag{8} \]

where \(i \in K\).

Broken text (BT) is defined as the percentage of the skeletonized ground truth image SG parts which are not detected in the resulting binary image B while belonging to components that are partially detected (Figures 6(d) and 6(e)). Broken text is given by the following equation:

\[ BT = \frac{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} SG(x,y) \cdot f(SGC(x,y)) \cdot (1 - B(x,y))}{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} SG(x,y)} \, 100\% \tag{9} \]

3.2. Missing Text

Missing text (MT) is defined as the percentage of the skeletonized ground truth image SG parts which are not detected in the resulting binary image B while belonging to components that are not detected at all (Figures 6(d) and 6(f)). Missing text is given by the following equation:

\[ MT = \frac{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} (1 - f(SGC(x,y))) \cdot SG(x,y)}{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} SG(x,y)} \, 100\% \tag{10} \]

4. Precision

Considering the binary image B, precision estimates the foreground areas that are actually text. In our method, the actual text is considered to be the estimated ground truth image EG, as described in Section 2. Precision is defined as the percentage of the foreground of the binary image B that belongs to the estimated ground truth image EG (Figure 7(c)).

\[ \mathrm{Precision} = \frac{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} EG(x,y) \cdot B(x,y)}{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} B(x,y)} \, 100\% \tag{11} \]

The foreground areas of the binary image that are not detected during precision estimation are considered as either false alarms or deformations, which are described in the following sections.

Figure 7. (a) Original grey scale image; (b) resulting binary image B; (c) estimated ground truth image (in grey) and false alarms along with deformations (in black); (d) deformations that do not result in component merging (in black); (e) deformations that do result in component merging (in black) and (f) false alarms (in black).

4.1. False Alarms

False alarms refer to foreground pixels of the binary image B that do not belong to the estimated ground truth image (Figure 7(f)). They are defined as the percentage of all pixels of the components of the binary image B that do not have any corresponding pixel in the skeletonized ground truth image SG.

\[ FAlarms = \frac{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} h(BC(x,y)) \cdot B(x,y)}{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} B(x,y)} \, 100\% \tag{12} \]

where h(i) is a function denoting whether a binary component is not detected in the skeletonized ground truth image:

\[ h(i) = \begin{cases} 1, & \text{if } \displaystyle\sum_{\substack{x=1,\,y=1 \\ BC(x,y)=i}}^{x=I_x,\,y=I_y} SG(x,y) \cdot B(x,y) = 0 \\ 0, & \text{otherwise} \end{cases} \tag{13} \]

where \(i \in L\).

4.2. Deformations

Components often merge with adjacent background information that was detected as text during binarization. Deformations do not only enlarge (deform) components but they are also responsible for merging adjacent components (Figures 7(d) and 7(e)). In our method, deformations of the binary image are defined as the percentage of all text pixels of the binary image B that are not detected in the estimated ground truth image EG and do not belong to false alarm components as described in the previous section. The deformation of the binary image is defined by Eq. 14, while deformation leading to merging is defined by Eq. 15.

\[ Deform = \frac{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} d(BC(x,y)) \cdot B(x,y) \cdot (1 - EG(x,y))}{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} B(x,y)} \, 100\% \tag{14} \]

\[ MergeDeform = \frac{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} m(BC(x,y)) \cdot B(x,y) \cdot (1 - EG(x,y))}{\sum_{x=1,\,y=1}^{x=I_x,\,y=I_y} B(x,y)} \, 100\% \tag{15} \]

where d(i) and m(i) are functions denoting whether a binary component corresponds to exactly one or to more than one ground truth component respectively. Functions d(i) and m(i) are defined as follows:

\[ d(i) = \begin{cases} 1, & \text{if } |SGC(x,y)| = 1,\ \forall (x,y) : BC(x,y) = i \\ 0, & \text{otherwise} \end{cases} \tag{16} \]

\[ m(i) = \begin{cases} 1, & \text{if } |SGC(x,y)| > 1,\ \forall (x,y) : BC(x,y) = i \\ 0, & \text{otherwise} \end{cases} \tag{17} \]

where |SGC(x,y)| denotes the cardinality of SGC(x,y), that is, the number of distinct skeletonized ground truth components found over the pixels of binary component i.

5. Experimental Results

The proposed objective evaluation methodology for document image binarization techniques was applied on a set of grey scale handwritten and machine-printed documents with low quality, shadows, non-uniform illumination, stains, presence of characters from the other side of the page and other significant artifacts. Among all documents, we selected the twenty (20) most representative (ten handwritten and ten machine-printed) and marked the skeletonized ground truth SG following the procedure described in Section 2. Six (6) of the most promising global and adaptive binarization techniques were chosen for evaluation:

1. Otsu’s method (OTS) [10]
2. Bernsen’s method (BER) [11]
3. Niblack’s method (NIB) [12]
4. Sauvola’s method (SAU) [13]
5. Adaptive Logical method (AL) [14]
6. Adaptive Degraded Document method (GPP) [15]

Our evaluation is based on F-measure, which is defined as follows:

\[ F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{18} \]

An example of the application of all methods to a grey scale handwritten document image is given in Figure 8, where the skeletonized ground truth image SG is also shown. The corresponding evaluation results are shown in Table 1. Likewise, Figure 9 gives an example of the application of all methods to a grey scale machine-printed document image and Table 2 shows the corresponding evaluation results.
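As an illustration, the pixel-level measures of Sections 3 and 4 and the F-measure of Eq. 18 can be sketched in Python with NumPy and SciPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `evaluate`, the use of `scipy.ndimage.label` to obtain the labeled images SGC and BC, and the choice of 8-connectivity are all assumptions, and the estimated ground truth EG is taken as a given input.

```python
import numpy as np
from scipy import ndimage

# 8-connectivity for component labeling (an assumption; the paper does not state it).
EIGHT = np.ones((3, 3), dtype=int)


def evaluate(SG, B, EG):
    """Return the measures of Eqs. 7-18 (in %) for 0/1 images SG, B and EG.

    Assumes SG and B each contain at least one text pixel.
    """
    SG, B, EG = (np.asarray(a).astype(bool) for a in (SG, B, EG))
    SGC, M = ndimage.label(SG, structure=EIGHT)  # labeled skeleton, Eq. 2
    BC, N = ndimage.label(B, structure=EIGHT)    # labeled binary image, Eq. 5
    hit = SG & B                                 # detected skeleton pixels

    # f(i) = 1 when ground truth component i is at least partially detected (Eq. 8).
    f = np.zeros(M + 1, dtype=bool)
    f[np.unique(SGC[hit])] = True

    sg_total = SG.sum()
    recall = 100.0 * hit.sum() / sg_total                      # Eq. 7
    broken = 100.0 * (SG & ~B & f[SGC]).sum() / sg_total       # Eq. 9
    missing = 100.0 * (SG & ~f[SGC]).sum() / sg_total          # Eq. 10

    # Count the distinct SG components met by each binary component.
    keys = np.unique(BC[hit].astype(np.int64) * (M + 1) + SGC[hit])
    count = np.bincount(keys // (M + 1), minlength=N + 1)
    h, d, m = count == 0, count == 1, count > 1                # Eqs. 13, 16, 17

    b_total = B.sum()
    precision = 100.0 * (EG & B).sum() / b_total               # Eq. 11
    false_alarms = 100.0 * (B & h[BC]).sum() / b_total         # Eq. 12
    deform = 100.0 * (B & ~EG & d[BC]).sum() / b_total         # Eq. 14
    merge_deform = 100.0 * (B & ~EG & m[BC]).sum() / b_total   # Eq. 15

    total = precision + recall
    f_measure = 2.0 * precision * recall / total if total else 0.0  # Eq. 18
    return {
        "f_measure": f_measure, "recall": recall, "precision": precision,
        "broken_text": broken, "missing_text": missing,
        "merge_deform": merge_deform, "deform": deform,
        "false_alarms": false_alarms,
    }
```

In this sketch every undetected skeleton pixel is counted as either broken or missing text, so recall, broken text and missing text add up to 100%. Likewise, when EG lies inside the detected components of B, precision, false alarms, Deform and MergeDeform add up to 100%.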
The evaluation results averaged over all handwritten and all machine-printed test images are presented in Tables 3 and 4 respectively. According to these results, the Adaptive Degraded Document method (GPP) [15] had the best overall performance on both handwritten and machine-printed document images, with F-measures of 85.23% and 93.19% respectively. On the handwritten images, GPP performed slightly better than Sauvola’s method (SAU) [13], whose F-measure is 84.61%, while on the machine-printed images Otsu’s method (OTS) [10] was second with an F-measure of 93.10%. A more extensive analysis of the evaluation results over all twenty (20) test images (Table 5) shows that the Adaptive Logical method (AL) [14] has the best precision (88.44%) (Table 5) and the highest broken text value (Tables 3 and 4), which leads to the worst recall (87.55%) (Table 5). This behavior is expected, since AL is based on stroke width estimation. Moreover, in handwritten documents, which have characters of nearly the same width, AL has indeed the best precision (88.83%) (Table 3), while in machine-printed images, where characters of different widths may exist, AL has the fourth best precision (88.04%) (Table 4). On the other hand, Niblack’s (NIB) [12] and Otsu’s (OTS) [10] methods have the best and second best recall rates of 99.15% and 93.90% respectively. However, they have the lowest and second lowest precision rates of 47.13% and 81.01% respectively. In other words, these two methods preserve the character information, but they cannot effectively remove the background noise, especially in cases where background and characters do not significantly differ. Bernsen’s method (BER) [11] is slightly better than OTS (Table 5), even though it has more broken text and false alarms (Tables 3 and 4). The overall rank presented in Table 5 is verified by other works, [4] and [15].
In [4], where a binary ground truth image and pixel accuracy were also used, SAU, BER, OTS and NIB were ranked in the same order on the document test images. In [15], where OCR word accuracy was used, GPP, SAU, OTS and NIB were likewise ranked in the same order as under the proposed methodology. The AL method was not evaluated in either of these works.

Figure 8. (a) Original handwritten image; (b) skeletonized ground truth image; (c) GPP binarization; (d) AL binarization; (e) BER binarization; (f) NIB binarization; (g) OTS binarization and (h) SAU binarization.

Figure 9. (a) Original machine-printed image; (b) skeletonized ground truth image; (c) GPP binarization; (d) AL binarization; (e) BER binarization; (f) NIB binarization; (g) OTS binarization and (h) SAU binarization.

Table 1. All evaluation metrics (%) for every binarization technique concerning the handwritten image of Figure 8.

| Figure 8 | GPP | AL | BER | NIB | OTS | SAU |
|---|---|---|---|---|---|---|
| F-measure | 93.25 | 87.92 | 90.39 | 67.95 | 91.23 | 92.84 |
| Recall | 96.27 | 88.74 | 96.90 | 99.56 | 98.58 | 96.36 |
| Precision | 90.42 | 87.11 | 84.69 | 51.58 | 84.91 | 89.56 |
| Broken Text | 3.73 | 11.09 | 2.93 | 0.44 | 1.25 | 3.64 |
| Missing Text | 0.00 | 0.17 | 0.17 | 0.00 | 0.17 | 0.00 |
| MergeDeform | 0.71 | 0.14 | 0.19 | 7.84 | 4.81 | 0.74 |
| Deform | 6.81 | 9.30 | 11.54 | 8.39 | 7.24 | 6.81 |
| False Alarms | 2.06 | 3.45 | 3.57 | 32.19 | 2.84 | 2.89 |

Table 2. All evaluation metrics (%) for every binarization technique concerning the machine-printed image of Figure 9.

| Figure 9 | GPP | AL | BER | NIB | OTS | SAU |
|---|---|---|---|---|---|---|
| F-measure | 93.42 | 91.67 | 93.41 | 46.00 | 91.59 | 92.15 |
| Recall | 99.24 | 98.35 | 98.88 | 99.91 | 99.93 | 99.29 |
| Precision | 88.24 | 85.85 | 88.52 | 29.88 | 84.53 | 85.96 |
| Broken Text | 0.76 | 1.64 | 1.12 | 0.09 | 0.07 | 0.71 |
| Missing Text | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 |
| MergeDeform | 1.14 | 0.41 | 1.03 | 2.50 | 2.22 | 1.50 |
| Deform | 9.65 | 10.03 | 8.72 | 4.84 | 12.37 | 9.07 |
| False Alarms | 0.97 | 3.71 | 1.73 | 62.78 | 0.88 | 2.87 |

Table 3. The average value (%) of all evaluation metrics for every binarization technique concerning all handwritten images.

| Handwritten | GPP | AL | BER | NIB | OTS | SAU |
|---|---|---|---|---|---|---|
| F-measure | 85.23 | 81.99 | 77.70 | 50.87 | 77.02 | 84.61 |
| Recall | 85.36 | 79.82 | 87.83 | 98.47 | 89.40 | 86.51 |
| Precision | 87.98 | 88.83 | 75.86 | 35.42 | 73.59 | 85.89 |
| Broken Text | 14.04 | 19.53 | 11.90 | 1.42 | 10.30 | 12.89 |
| Missing Text | 0.60 | 0.65 | 0.27 | 0.01 | 0.30 | 0.60 |
| MergeDeform | 1.05 | 0.29 | 13.42 | 12.93 | 15.36 | 0.61 |
| Deform | 10.02 | 8.94 | 6.89 | 9.14 | 8.96 | 10.22 |
| False Alarms | 0.95 | 1.94 | 3.83 | 42.51 | 2.09 | 3.28 |

Table 4. The average value (%) of all evaluation metrics for every binarization technique concerning all machine-printed images.

| Printed | GPP | AL | BER | NIB | OTS | SAU |
|---|---|---|---|---|---|---|
| F-measure | 93.19 | 91.39 | 92.74 | 72.41 | 93.10 | 92.30 |
| Recall | 98.33 | 95.27 | 97.16 | 99.82 | 98.40 | 98.64 |
| Precision | 88.64 | 88.04 | 88.95 | 58.84 | 88.42 | 86.92 |
| Broken Text | 1.66 | 4.72 | 2.83 | 0.18 | 1.60 | 1.36 |
| Missing Text | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 |
| MergeDeform | 0.30 | 0.09 | 0.14 | 1.82 | 0.45 | 0.46 |
| Deform | 9.82 | 9.04 | 8.16 | 9.21 | 10.03 | 9.81 |
| False Alarms | 1.24 | 2.83 | 2.75 | 30.13 | 1.10 | 2.81 |

Table 5. The average value (%) of F-measure, Recall and Precision for every binarization technique concerning all test images.

| All | GPP | AL | BER | NIB | OTS | SAU |
|---|---|---|---|---|---|---|
| F-measure | 89.21 | 86.69 | 85.22 | 61.64 | 85.06 | 88.46 |
| Recall | 91.85 | 87.55 | 92.50 | 99.15 | 93.90 | 92.58 |
| Precision | 88.31 | 88.44 | 82.41 | 47.13 | 81.01 | 86.41 |

6. Conclusion

This work is based on a semi-automatic procedure for the construction of the ground truth as well as a fully automated evaluation scheme. The evaluation results of the proposed methodology are extensively analyzed and verified by other works, [4] and [15], as presented in the previous section. However, different binary images are likely to be evaluated against different ground truth images, since the estimated ground truth image is constructed in relation to the corresponding binary image B and the edge image CE. Even so, the difference between the ground truth images cannot seriously affect the evaluation results. In future work, we will focus on the development of a more accurate technique for the estimation of the ground truth image EG that will not depend on the binary image B under evaluation.

Acknowledgements

The research leading to these results has received funding from the European Community's Seventh Framework Programme under grant agreement n° 215064 (project IMPACT).

References

[1] E. Kavallieratou, and S. Stathis, “Adaptive Binarization of Historical Document Images”, 18th International Conference on Pattern Recognition (ICPR ’06), Hong Kong, China, 2006, vol. 3, pp. 742-745.

[2] D. Trier, and T. Taxt, “Evaluation of Binarization Methods for Document Images”, IEEE Transactions on Pattern Analysis and Machine Intelligence, March 1995, vol. 17, no. 3, pp. 312-315.

[3] Q. Wang, and C. L. Tan, “Matching of Double Sided Document Images to Remove Interference”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’01), Kauai island of Hawaii, USA, 2001, vol. 1, p. 1084.

[4] M. Sezgin, and B. Sankur, “Survey over Image Thresholding Techniques and Quantitative Performance Evaluation”, Journal of Electronic Imaging, January 2004, vol. 13, no. 1, pp. 146-168.

[5] J. He, Q. D. M. Do, A. C. Downton, and J. H. Kim, “A Comparison of Binarization Methods for Historical Archive Documents”, Proceedings of the 8th International Conference on Document Analysis and Recognition (ICDAR ’05), Seoul, South Korea, August-September 2005, vol. 1, pp. 538-542.

[6] Y. Zhu, C. Wang, and R. Dai, “Document Image Binarization Based on Stroke Enhancement”, Proceedings of the 18th International Conference on Pattern Recognition (ICPR ’06), Hong Kong, China, August 2006, vol. 1, pp. 955-958.

[7] F. Chang, K. H. Liang, T. M. Tan, and W. L. Hwang, “Binarization of Document Images Using Hadamard Multiresolution Analysis”, Proceedings of the 5th International Conference on Document Analysis and Recognition (ICDAR ’99), Bangalore, India, September 1999, pp. 157-160.

[8] H. J. Lee, and B. Chen, “Recognition of Handwritten Chinese Characters via Short Line Segments”, Pattern Recognition, May 1992, vol. 25, no. 5, pp. 543-552.

[9] J. Canny, “A Computational Approach to Edge Detection”, IEEE Transactions on Pattern Analysis and Machine Intelligence, November 1986, vol. 8, no. 6, pp. 679-698.

[10] N. Otsu, “A Thresholding Selection Method from Gray-level Histogram”, IEEE Transactions on Systems, Man and Cybernetics, March 1979, vol. 9, pp. 62-66.

[11] J. Bernsen, “Dynamic Thresholding of Grey-level Images”, Proceedings of the 8th International Conference on Pattern Recognition, Paris, France, 1986, pp. 1251-1255.

[12] W. Niblack, “An Introduction to Digital Image Processing”, Prentice-Hall, Englewood Cliffs, NJ, pp. 115-116, 1986.

[13] J. Sauvola, and M. Pietikainen, “Adaptive Document Image Binarization”, Pattern Recognition, 2000, vol. 33, no. 2, pp. 225-236.

[14] Y. Yang, and H. Yan, “An Adaptive Logical Method for Binarization of Degraded Document Images”, Pattern Recognition, 2000, vol. 33, pp. 787-807.

[15] B. Gatos, I. Pratikakis, and S. J. Perantonis, “Adaptive Degraded Document Image Binarization”, Pattern Recognition, March 2006, vol. 39, no. 3, pp. 317-327.

[16] B. H. Kang, G. S. Han, H. G. Kim, J. S. Kim, C. R. Yoon, and M. S. Cho, “Fuzzy Inference and Logical Level Methods for Binary Graphic/Character Image Extraction”, 1998 IEEE International Conference on Systems, Man and Cybernetics, San Diego, California, USA, October 1998, vol. 5, pp. 4626-4629.

[17] C. Wolf, J. M. Jolion, and F. Chassaing, “Text Localization, Enhancement and Binarization in Multimedia Documents”, Proceedings of the 16th International Conference on Pattern Recognition (ICPR ’02), Quebec City, Canada, August 2002, vol. 2, pp. 1037-1040.

[18] H. Cao, and V. Govindaraju, “Handwritten Carbon Form Preprocessing Based on Markov Random Field”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’07), USA, June 2007, pp. 1-7.

[19] C. Bastos, C. Mello, J. Andrade, D. Falcão, M. Lima, W. Santos, and A. Oliveira, “Thresholding Images of Historical Documents with Back-to-Front Interference based on Color Quantization by Genetic Algorithms”, 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI ’07), Patras, Greece, October 2007, vol. 1, pp. 488-491.

[20] H. Kohmura, and T. Wakahara, “Determining Optimal Filters for Binarization of Degraded Characters in Color Using Genetic Algorithms”, 18th International Conference on Pattern Recognition (ICPR ’06), Hong Kong, August 2006, vol. 3, pp. 661-664.

[21] T. Obafemi-Ajayi, G. Agam, and O. Frieder, “Ensemble LUT Classification for Degraded Document Enhancement”, Proceedings of SPIE, 15th Document Recognition and Retrieval Conference, San Jose, California, USA, January 2008, vol. 6815.

[22] LY. Fan, C. L. Tan, and LX. Fan, “Edge-preserving Prefiltering for Document Image Binarization”, International Conference on Image Processing, Greece, October 2001, vol. 1, pp. 1070-1073.

[23] V. Wu, R. Manmatha, and E. M. Riseman, “Textfinder: An Automatic System to Detect and Recognize Text in Images”, IEEE Transactions on Pattern Analysis and Machine Intelligence, November 1999, vol. 21, no. 11, pp. 1224-1229.

[24] M. Kamel, and A. Zhao, “Extraction of Binary Character/Graphics Images from Grayscale Document Images”, CVGIP: Computer Vision Graphics and Image Processing, May 1993, vol. 55, no. 3, pp. 203-217.
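As a supplementary illustration, the constrained dilation of the estimated ground truth stage (Section 2.2) can be sketched in Python with NumPy and SciPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `estimate_ground_truth`, the 3x3 structuring element, the 8-connectivity used for labeling, the `max_iter` safety bound and the handling of components without edge pixels are all assumptions. The edge image CE is taken as a given input, for example the output of a Canny edge detector.

```python
import numpy as np
from scipy import ndimage


def estimate_ground_truth(SG, B, CE, max_iter=1000):
    """Grow the skeleton SG inside each component of B until more than half
    of that component's edge pixels (CE) are covered, then return EG."""
    SG, B, CE = (np.asarray(a).astype(bool) for a in (SG, B, CE))
    square = np.ones((3, 3), dtype=bool)          # assumed structuring element
    BC, N = ndimage.label(B, structure=square)    # 8-connected labels (assumed)
    EG = np.zeros_like(B)

    for i in range(1, N + 1):
        cc = BC == i
        A = SG & cc                # A^0: skeleton pixels inside component i
        if not A.any():
            continue               # no ground truth here: left out of EG
        edges = CE & cc
        n_edges = edges.sum()
        for _ in range(max_iter):
            # One dilation step, clipped so it never exceeds the component.
            grown = ndimage.binary_dilation(A, structure=square) & cc
            covered = (grown & edges).sum()
            half_covered = n_edges > 0 and covered / n_edges > 0.5
            stable = np.array_equal(grown, A)
            A = grown
            if half_covered or stable:
                break
        EG |= A
    return EG
```

Because each dilation step is intersected with the component mask, the result can never be larger than the binary component it was grown in, which mirrors the constraint stated in Section 2.2. Since EG depends on B, a different binarization result generally yields a different EG.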