Abstract
Automatic temporal information extraction is an important task for many natural language processing systems. The task requires thorough knowledge of the ontological and grammatical characteristics of temporal information in text, as well as linguistic resources annotated with temporal entities. Before creating such resources or developing a system, it is first necessary to define a structured schema that describes how temporal entities are to be annotated. In this paper, we present a revised version of Arabic TimeML and propose an enriched Arabic corpus, called “ARA-TimeBank”, annotated for events, temporal expressions and temporal relations according to the new Arabic TimeML. We describe our methodology, which combines a pre-annotation phase with manual validation and verification. ARA-TimeBank is the first corpus constructed for Arabic that meets the needs of TimeML and addresses the limitations of the existing Arabic TimeBank.
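To make the three annotation layers named in the abstract concrete (events, temporal expressions, temporal relations), the following is a minimal sketch of what a TimeML-style annotation of one short sentence could look like. It is an illustration only: the tag and attribute names follow the general TimeML convention (EVENT, TIMEX3, MAKEINSTANCE, TLINK) and are not taken from the revised Arabic TimeML schema, whose details are not given here. The example sentence, the identifiers, the TIMEX3 value (which presumes a hypothetical document date of 2020-01-02) and the helper functions `build_annotation` and `validate` are all assumptions made for this sketch. The `validate` function is a simple automated consistency check and does not stand in for the manual validation described above.

```python
import xml.etree.ElementTree as ET


def build_annotation():
    """Build a tiny TimeML-style annotation for one illustrative sentence.

    Sentence (assumed example): "وصل الوفد أمس" ("The delegation arrived yesterday").
    """
    root = ET.Element("TimeML")
    sentence = ET.SubElement(root, "s")

    # EVENT: the verb "وصل" (arrived), classed here as an OCCURRENCE.
    event = ET.SubElement(sentence, "EVENT", {"eid": "e1", "class": "OCCURRENCE"})
    event.text = "وصل"
    event.tail = " الوفد "

    # TIMEX3: "أمس" (yesterday); the value assumes a hypothetical
    # document creation date of 2020-01-02.
    timex = ET.SubElement(
        sentence, "TIMEX3", {"tid": "t1", "type": "DATE", "value": "2020-01-01"}
    )
    timex.text = "أمس"

    # MAKEINSTANCE: one concrete instance of event e1.
    ET.SubElement(
        root,
        "MAKEINSTANCE",
        {"eiid": "ei1", "eventID": "e1", "tense": "PAST", "aspect": "NONE",
         "pos": "VERB", "polarity": "POS"},
    )

    # TLINK: the event instance is temporally included in the date.
    ET.SubElement(
        root,
        "TLINK",
        {"lid": "l1", "relType": "IS_INCLUDED",
         "eventInstanceID": "ei1", "relatedToTime": "t1"},
    )
    return root


def validate(root):
    """Return a list of dangling-reference problems (empty if none)."""
    event_ids = {e.get("eid") for e in root.iter("EVENT")}
    time_ids = {t.get("tid") for t in root.iter("TIMEX3")}
    instance_ids = set()
    problems = []

    for inst in root.iter("MAKEINSTANCE"):
        instance_ids.add(inst.get("eiid"))
        if inst.get("eventID") not in event_ids:
            problems.append(f"MAKEINSTANCE {inst.get('eiid')}: unknown EVENT")

    for link in root.iter("TLINK"):
        if link.get("eventInstanceID") not in instance_ids:
            problems.append(f"TLINK {link.get('lid')}: unknown event instance")
        if link.get("relatedToTime") not in time_ids:
            problems.append(f"TLINK {link.get('lid')}: unknown TIMEX3")

    return problems


if __name__ == "__main__":
    annotation = build_annotation()
    print(ET.tostring(annotation, encoding="unicode"))
    print(validate(annotation))
```

Keeping the event instance and the temporal link outside the sentence element mirrors the stand-off style commonly used for TimeML relations, so the inline text stays readable while the links refer to it by identifier.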
Notes
- 4. To fully understand the other attributes, such as those of the EVENT and MAKEINSTANCE tags, please see the proposed schema in [13].
- 7. The dependency path is analysed with the UDPipe tool: http://lindat.mff.cuni.cz/services/udpipe/.
References
Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832–843 (1983)
Altuna, B., Aranzabe, M.J., Díaz de Ilarraza, A.: Adapting TimeML to Basque: event annotation. In: Gelbukh, A. (ed.) CICLing 2016. LNCS, vol. 9624, pp. 565–577. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75487-1_43
Batita, M.A., Ayadi, R., Zrigui, M.: Reasoning over Arabic WordNet relations with neural tensor network. Computación y Sistemas 23(3) (2019)
Batita, M.A., Zrigui, M.: The enrichment of Arabic WordNet antonym relations. In: Gelbukh, A. (ed.) CICLing 2017. LNCS, vol. 10761, pp. 342–353. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-77113-7_27
Bittar, A., Amsili, P., Denis, P., Danlos, L.: French TimeBank: an ISO-TimeML annotated reference corpus. In: The 49th Annual Meeting of the Association for Computational Linguistics, Portland, Oregon, United States, pp. 130–134 (2011)
Boguraev, B., Pustejovsky, J., Ando, R., Verhagen, M.: TimeBank evolution as a community resource for TimeML parsing. Lang. Resour. Eval. 41, 91–115 (2007)
Boudaa, T., El Marouani, M., Enneya, N.: Arabic temporal expression tagging and normalization. In: Tabii, Y., Lazaar, M., Al Achhab, M., Enneya, N. (eds.) Big Data, Cloud and Applications, pp. 546–557 (2018)
Caselli, T., Bartalesi Lenzi, V., Sprugnoli, R., Pianta, E., Prodanof, I.: Annotating events, temporal expressions and relations in Italian: the It-TimeML experience for the Ita-TimeBank. In: Proceedings of the 5th Linguistic Annotation Workshop (2011)
Costa, F., Branco, A.: TimeBankPT: a TimeML annotated corpus of Portuguese. In: LREC, pp. 3727–3734 (2012)
Derczynski, L., Strötgen, J., Maynard, D., Greenwood, M.A., Jung, M.: GATE-Time: extraction of temporal expressions and events. In: LREC 2016, pp. 3702–3708 (2016)
Forăscu, C., Tufiş, D.: Romanian TimeBank: an annotated parallel corpus for temporal information. In: LREC, pp. 3762–3766 (2012)
Haffar, N., Hkiri, E., Zrigui, M.: Arabic linguistic resource and specifications for event annotation. In: Proceedings of the 34th International Business Information Management Association Conference (IBIMA), Vision 2025: Education Excellence and Management of Innovations through Sustainable Economic Competitive Advantage, pp. 4316–4327 (2019)
Haffar, N., Hkiri, E., Zrigui, M.: TimeML annotation of events and temporal expressions in Arabic texts. In: Nguyen, N.T., Chbeir, R., Exposito, E., Aniorté, P., Trawiński, B. (eds.) ICCCI 2019. LNCS (LNAI), vol. 11683, pp. 207–218. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-28377-3_17
Haffar, N., et al.: Pedagogical indexed Arabic text in cloud e-learning system. IJCAC 7(1), 32–46 (2017). https://doi.org/10.4018/IJCAC.2017010102
Hkiri, E., Mallat, S., Zrigui, M.: Events automatic extraction from Arabic texts. IJIRR 6(1), 36–51 (2016)
Jeong, Y.S., Joo, W.T., Do, H.W., Lim, C.G., Choi, K.S., Choi, H.J.: Korean TimeML and Korean TimeBank. In: LREC, pp. 356–359 (2016)
Mahmoud, A., Zrigui, A., Zrigui, M.: A text semantic similarity approach for Arabic paraphrase detection. In: CICLing, pp. 338–349 (2017)
Mahmoud, A., Zrigui, M.: Deep neural network models for paraphrased text classification in the Arabic language. In: NLDB, pp. 3–16 (2019)
Mirzaei, A., Moloodi, A.: Persian proposition bank. In: LREC, pp. 3828–3835 (2016)
Pustejovsky, J., Lee, K., Bunt, H., Romary, L.: ISO-TimeML: an international standard for semantic annotation. In: LREC (2010)
Saleh, I., Tounsi, L., van Genabith, J.: ZamAn and Raqm: extracting temporal and numerical expressions in Arabic. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds.) AIRS 2011. LNCS, vol. 7097, pp. 562–573. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25631-8_51
Strötgen, J., Gertz, M.: Multilingual and cross-domain temporal tagging. Lang. Resour. Eval. 47, 269–298 (2013)
UzZaman, N., Llorens, H., Derczynski, L., Allen, J., Verhagen, M., Pustejovsky, J.: SemEval-2013 task 1: TempEval-3: evaluating time expressions, events, and temporal relations. In: SemEval 2013, vol. 2, pp. 1–9 (2013)
Verhagen, M., Gaizauskas, R., Schilder, F., Hepple, M., Katz, G., Pustejovsky, J.: SemEval-2007 task 15: TempEval temporal relation identification. In: SemEval 2007, pp. 75–80 (2007)
Verhagen, M., et al.: Automating temporal annotation with TARSQI. In: ACLDEMO 2005, pp. 81–84 (2005)
Zaraket, F.A., Makhlouta, J.: Arabic temporal entity extraction using morphological analysis. Int. J. Comput. Linguist. Appl. 3, 121–136 (2012)
Zrigui, M., Ayadi, R., Mars, M., Maraoui, M.: Arabic text classification framework based on latent Dirichlet allocation. CIT 20(2), 125–140 (2012)
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Haffar, N., Hkiri, E., Zrigui, M. (2020). Enrichment of Arabic TimeML Corpus. In: Nguyen, N.T., Hoang, B.H., Huynh, C.P., Hwang, D., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2020. Lecture Notes in Computer Science, vol 12496. Springer, Cham. https://doi.org/10.1007/978-3-030-63007-2_51
DOI: https://doi.org/10.1007/978-3-030-63007-2_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63006-5
Online ISBN: 978-3-030-63007-2
eBook Packages: Computer Science (R0)