Abstract
This paper presents a real-world database for Arabic printed text recognition, APTID / MF (Arabic Printed Text Image Database / Multi-Font). The database can be used to evaluate systems that recognize Arabic printed text with an open vocabulary. APTID / MF may also be used for research in word segmentation and font identification. APTID / MF is obtained from 387 pages of Arabic printed documents scanned in grayscale at a resolution of 300 dpi. From these documents, 1,845 text blocks have been extracted. In addition, a ground truth file is provided for each text block. APTID / MF also includes an Arabic printed character image dataset made up of 27,402 samples. The database is freely available to interested researchers.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jaiem, F.K., Kanoun, S., Khemakhem, M., El Abed, H., Kardoun, J. (2013). Database for Arabic Printed Text Recognition Research. In: Petrosino, A. (eds) Image Analysis and Processing – ICIAP 2013. ICIAP 2013. Lecture Notes in Computer Science, vol 8156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41181-6_26
DOI: https://doi.org/10.1007/978-3-642-41181-6_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41180-9
Online ISBN: 978-3-642-41181-6
eBook Packages: Computer Science, Computer Science (R0)