Improving Arabic morphological analyzers benchmark

Jaafar, Younes; Bouzoubaa, Karim; Yousfi, Abdellah; Tajmout, Rachida; Khamar, Hakima

doi:10.1007/s10772-016-9340-x

Published: 19 April 2016

International Journal of Speech Technology, volume 19, pages 259–267 (2016)

Abstract

The tools dedicated to Arabic natural language processing have developed significantly in recent years. Among them, Arabic morphological analyzers are especially important because they are often embedded in more advanced systems such as syntactic parsers, search engines, and machine translation systems. Researchers must therefore decide which morphological analyzer to use in their projects, a difficult task given the many criteria to take into account. To facilitate this choice, in a previous work we addressed the problem of benchmarking morphological analyzers by proposing a solution that reports a set of metrics for each analyzer: accuracy, precision, recall, F-measure, and execution time. In this article, we present two major improvements to that solution: the first version of our corpus dedicated to evaluating morphological analyzers, and a new metric that combines all the result-related metrics with the execution time of the analyzers.
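To make the metrics named in the abstract concrete, the following is a minimal Python sketch of how accuracy, precision, recall, F-measure, and execution time might be computed for one analyzer against a gold-standard corpus. It is not the authors' implementation. The function `evaluate`, the toy analyzer, the transliterated words, and the tag labels are all hypothetical, and the counting scheme (analyses compared as sets per word, accuracy as the share of words with at least one correct analysis) is an assumption, since the abstract does not give the exact definitions. The combined metric introduced in the article is not reproduced here because the abstract does not define it.

```python
import time
from typing import Callable, Dict, Set


def evaluate(analyzer: Callable[[str], Set[str]],
             gold: Dict[str, Set[str]]) -> Dict[str, float]:
    """Score one analyzer against a gold standard (assumed set-based scheme)."""
    true_pos = returned = expected = words_hit = 0
    start = time.perf_counter()
    for word, gold_analyses in gold.items():
        predicted = analyzer(word)
        correct = predicted & gold_analyses   # analyses that match the gold standard
        true_pos += len(correct)
        returned += len(predicted)
        expected += len(gold_analyses)
        words_hit += 1 if correct else 0      # word counts as a hit if any analysis is correct
    elapsed = time.perf_counter() - start

    precision = true_pos / returned if returned else 0.0
    recall = true_pos / expected if expected else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    accuracy = words_hit / len(gold) if gold else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f_measure": f_measure, "seconds": elapsed}


# Hypothetical gold standard: word -> set of acceptable analyses.
GOLD = {"kataba": {"verb"}, "kitab": {"noun"}, "fi": {"prep"}}

# Hypothetical analyzer output: one spurious analysis and one unanalyzed word.
TOY_OUTPUT = {"kataba": {"verb", "noun"}, "kitab": {"noun"}}


def toy_analyzer(word: str) -> Set[str]:
    return TOY_OUTPUT.get(word, set())


scores = evaluate(toy_analyzer, GOLD)
print(scores)
```

In this toy run, two of the three returned analyses are correct and two of the three gold analyses are found, so precision, recall, and F-measure all come out to 2/3. The measured time depends on the machine.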

References

Alansary, S., Nagi, M., & Adly, N. (2007). Building an international corpus of Arabic. 7th international conference on language engineering, (p. np.). Cairo.
ALECSO. (n.d.). Retrieved December 23, 2014, from مواصفات نظام التحليل الصرفي في اللغة العربية: http://www.alecso.org.tn/images/stories/OULOUM/MOHALLILAT%20SARFIA_DAMAS_2009/022%202%20SPECIFICATIONS.pdf.
Al-Kabi, M., Al-Radaideh, Q., & Akkawi, K. (2011). Benchmarking and assessing the performance of Arabic stemmers. Journal of Information Science, 37(2), 111–119.
Alkhalil Morpho Sys. (2013). Retrieved April 23, 2015, from Alkhalil Morpho Sys: http://sourceforge.net/projects/alkhalil/.
Al-Sughaiyer, I. A., & Al-Kharashi, I. A. (2004). Arabic morphological analysis techniques: A comprehensive survey. Journal of the American Society for Information Science and Technology, 55(3), 189–213.
Boudlal, A., Lakhouaja, A., Mazroui, A., Meziane, A., Ould Abdallahi, O. B., & Shoul, M. (2011). Alkhalil Morpho Sys: A morphosyntactic analysis system for Arabic texts. Proceedings of ACIT’2010.
Brihaye, P. (2003). AraMorph. Retrieved April 23, 2015, from AraMorph: http://www.nongnu.org/aramorph/english/index.html.
Buckwalter, T. (2002a). Arabic morphology analysis. Retrieved April 23, 2015, from QAMUS: http://www.qamus.org/morphology.htm.
Buckwalter, T. (2002b). Buckwalter Arabic morphological analyzer version 1.0.
Champsaur, C. (2013, January). La traduction automatique : Un outil pour les traducteurs? The Journal of Specialised Translation, 19, pp. 19–28.
Chennoufi, A., & Mazroui, A. (2014). Apport de la deuxième version de l’analyseur Alkhalil Morpho Sys dans la voyellation automatique des textes Arabes. 5th international conference on Arabic language processing (CITALA 2014), (pp. 223–230). Oujda.
Darwish, K. (2002). Building a shallow Arabic morphological analyzer in one day. Proceedings of the ACL-2002 workshop on computational approaches to semitic languages, (pp. 47–54). Retrieved from https://aclweb.org/anthology.
Diab, M. (2009). Second generation tools (AMIRA 2.0): Fast and robust tokenization, POS tagging, and base phrase chunking. Second international conference on Arabic language resources and tools, (pp. 285–288). Cairo.
Dukes, K. (2010). The Quranic Arabic corpus. Retrieved April 23, 2015, from Quranic Arabic Corpus. http://corpus.quran.com.
Dukes, K., & Habash, N. (2010). Morphological annotation of Quranic Arabic. Language resources and evaluation conference (LREC). Malta. Retrieved from https://aclweb.org/anthology.
Graff, D., Maamouri, M., Bouziri, B., Krouna, S., Kulick, S., & Buckwalter, T. (2009). Standard Arabic morphological analyzer (SAMA) version 3.1. Linguistic Data Consortium LDC2009E73.
Habash, N., Rambow, O., & Roth, R. (n.d.). MADA + TOKAN software suite. Retrieved April 23, 2015, from MADA + TOKAN: http://www1.cs.columbia.edu/~rambow/software-downloads/MADA_Distribution.html.
Habash, N., Rambow, O., & Roth, R. (2009). Mada + tokan: A toolkit for Arabic tokenization, diacritization, morphological disambiguation, pos tagging, stemming and lemmatization. Proceedings of the 2nd international conference on Arabic language resources and Tools (MEDAR), (pp. 102–109). Cairo.
Hassan, Y., Aly, M., & Atiya, A. (2014). Arabic spelling correction using supervised learning. Proceedings of the EMNLP 2014 workshop on Arabic, (pp. 121–126). Doha.
Hattab, M., Haddad, B., Yaseen, M., Duraidi, A., & Shmais, A. A. (2009). Addaall Arabic search engine: Improving search based on combination of morphological analysis and generation considering semantic patterns. The 2nd international conference on Arabic language resources & tools, (pp. 159–162).
Jaafar, Y., & Bouzoubaa, K. (2014). Benchmark of Arabic morphological analyzers: Challenges and solutions. Intelligent systems: Theories and applications (SITA-14), (pp. 1–6). Rabat.
Kano, Y., Dorado, R., McCrohon, L., Ananiadou, S., & Tsujii, J. (2010). U-Compare: An integrated language resource evaluation platform including a comprehensive UIMA resource library. Proceedings of the seventh international conference on language resources and evaluation (LREC 2010), (pp. 428–434).
Koulali, R., & Meziane, A. (2013). Experiments with Arabic topic detection. Journal of Theoretical and Applied Information Technology, 50(1), 28–32.
Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval, 2(1–2), 1–135.
Pasha, A., Al-Badrashiny, M., Diab, M., El Kholy, A., Eskander, R., Habash, N., & Roth, R. M. (2014). MADAMIRA: A fast, comprehensive tool for morphological analysis and disambiguation of Arabic. LREC’14, (pp. 1094–1101). Reykjavik.
Sawalha, M., & Atwell, E. (2008). Comparative evaluation of Arabic language morphological analysers and stemmers. International conference on computational linguistics—COLING, (pp. 107–110). Retrieved from https://aclweb.org/anthology.
Sawalha, M. (n.d.). Gold Standard of Arabic. Gold standard for evaluating Arabic morphological analyzers. Retrieved April 23, 2015, from http://www.comp.leeds.ac.uk/sawalha/goldstandard.html.
Smrž, O. (2007). ElixirFM: Implementation of functional Arabic morphology. Proceedings of the 2007 workshop on computational approaches to Semitic languages: common issues and resources (pp. 1–8). Stroudsburg: Association for Computational Linguistics.
Wali, W., Gargouri, B., & Ben Hamadou, A. (2014). A system for evaluating the content of LMF Arabic dictionaries. 5th international conference on Arabic language processing (CITALA 2014), (pp. 159–167). Oujda.

Author information

Authors and Affiliations

Mohammadia School of Engineers, Mohammed Vth University, Rabat, Morocco
Younes Jaafar, Karim Bouzoubaa & Rachida Tajmout
FSJES, Mohammed Vth University, Rabat, Morocco
Abdellah Yousfi
Faculty of Letters and Human Sciences, Mohammed Vth University, Rabat, Morocco
Hakima Khamar

Corresponding author

Correspondence to Younes Jaafar.

About this article

Cite this article

Jaafar, Y., Bouzoubaa, K., Yousfi, A. et al. Improving Arabic morphological analyzers benchmark. Int J Speech Technol 19, 259–267 (2016). https://doi.org/10.1007/s10772-016-9340-x

Received: 10 November 2015
Accepted: 02 April 2016
Published: 19 April 2016
Issue Date: June 2016
DOI: https://doi.org/10.1007/s10772-016-9340-x
