Enrichment of Arabic TimeML Corpus

  • Conference paper
Computational Collective Intelligence (ICCCI 2020)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 12496))

Abstract

Automatic temporal information extraction is an important task for many natural language processing systems. This task requires thorough knowledge of the ontological and grammatical characteristics of temporal information in the text as well as annotated linguistic resources of the temporal entities. Before creating the resources or developing the system, it is first necessary to define a structured schema which describes how to annotate temporal entities. In this paper, we present a revised version of Arabic TimeML, and we propose an enriched Arabic corpus, called “ARA-TimeBank”, for events, temporal expressions and temporal relations based on the new Arabic TimeML. We describe our methodology, which combines a pre-annotation phase with manual validation and verification. ARA-TimeBank is the first corpus constructed for Arabic that meets the needs of TimeML and addresses the limitations of the existing Arabic TimeBank.

Notes

  1. http://timexportal.wikidot.com/corpora-timebank12.

  2. http://www.timeml.org/publications/timeMLdocs/timeml_1.2.1.html.

  3. https://catalog.ldc.upenn.edu/LDC2012T12.

  4. To fully understand the other attributes such as EVENT and MAKEINSTANCE tags, please see the proposed schema in [13].

  5. https://heideltime.ifi.uni-heidelberg.de.

  6. https://github.com/nafaa5/Arabic-event-timex-gazetteers-.

  7. The dependency path is analysed by udpipe tool: http://lindat.mff.cuni.cz/services/udpipe/.

References

  1. Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832–843 (1983)

  2. Altuna, B., Aranzabe, M.J., Díaz de Ilarraza, A.: Adapting TimeML to Basque: event annotation. In: Gelbukh, A. (ed.) CICLing 2016. LNCS, vol. 9624, pp. 565–577. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75487-1_43

  3. Batita, M.A., Ayadi, R., Zrigui, M.: Reasoning over Arabic WordNet relations with neural tensor network. Computación y Sistemas 23(3) (2019)

  4. Batita, M.A., Zrigui, M.: The enrichment of Arabic WordNet antonym relations. In: Gelbukh, A. (ed.) CICLing 2017. LNCS, vol. 10761, pp. 342–353. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-77113-7_27

  5. Bittar, A., Amsili, P., Denis, P., Danlos, L.: French TimeBank: an ISO-TimeML annotated reference corpus. In: The 49th Annual Meeting of the Association for Computational Linguistics, Portland, Oregon, United States, pp. 130–134 (2011)

  6. Boguraev, B., Pustejovsky, J., Ando, R., Verhagen, M.: TimeBank evolution as a community resource for TimeML parsing. Lang. Resour. Eval. 41, 91–115 (2007)

  7. Boudaa, T., El Marouani, M., Enneya, N.: Arabic temporal expression tagging and normalization. In: Tabii, Y., Lazaar, M., Al Achhab, M., Enneya, N. (eds.) Big Data, Cloud and Applications, pp. 546–557 (2018)

  8. Caselli, T., Bartalesi Lenzi, V., Sprugnoli, R., Pianta, E., Prodanof, I.: Annotating events, temporal expressions and relations in Italian: the It-TimeML experience for the Ita-TimeBank. In: Proceedings of the 5th Linguistic Annotation Workshop (2011)

  9. Costa, F., Branco, A.: TimeBankPT: a TimeML annotated corpus of Portuguese. In: LREC, pp. 3727–3734 (2012)

  10. Derczynski, L., Strötgen, J., Maynard, D., Greenwood, M.A., Jung, M.: GATE-time: extraction of temporal expressions and event. In: LREC 2016, pp. 3702–3708 (2016)

  11. Forăscu, C., Tufiş, D.: Romanian TimeBank: an annotated parallel corpus for temporal information. In: LREC, pp. 3762–3766 (2012)

  12. Haffar, N., Hkiri, E., Zrigui, M.: Arabic linguistic resource and specifications for event annotation. In: Proceedings of the 34th International Business Information Management Association Conference (IBIMA), Vision 2025: Education Excellence and Management of Innovations through Sustainable Economic Competitive Advantage, pp. 4316–4327 (2019)

  13. Haffar, N., Hkiri, E., Zrigui, M.: TimeML Annotation of events and temporal expressions in Arabic texts. In: Nguyen, N.T., Chbeir, R., Exposito, E., Aniorté, P., Trawiński, B. (eds.) ICCCI 2019. LNCS (LNAI), vol. 11683, pp. 207–218. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-28377-3_17

  14. Haffar, N., et al.: Pedagogical indexed arabic text in cloud e-learning system. IJCAC 7(1), 32–46 (2017). https://doi.org/10.4018/IJCAC.2017010102

  15. Hkiri, E., Mallat, S., Zrigui, M.: Events automatic extraction from arabic texts. IJIRR 6(1), 36–51 (2016)

  16. Jeong, Y.S., Joo, W.T., Do, H.W., Lim, C.G., Choi, K.S., Choi, H.J.: Korean TimeML and Korean TimeBank. In: LREC, pp. 356–359 (2016)

  17. Mahmoud, A., Zrigui, A., Zrigui, M.: A text semantic similarity approach for Arabic paraphrase detection. In: CICLing, pp. 338–349 (2017)

  18. Mahmoud, A., Zrigui, M.: Deep neural network models for paraphrased text classification in the Arabic language. In: NLDB, pp. 3–16 (2019)

  19. Mirzaei, A., Moloodi, A.: Persian proposition bank. In: LREC, pp. 3828–3835 (2016)

  20. Pustejovsky, J., Lee, K., Bunt, H., Romary, L.: Iso-TimeML: an international standard for semantic annotation. In: LREC (2010)

  21. Saleh, I., Tounsi, L., van Genabith, J.: ZamAn and Raqm: extracting temporal and numerical expressions in Arabic. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds.) AIRS 2011. LNCS, vol. 7097, pp. 562–573. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25631-8_51

  22. Strötgen, J., Gertz, M.: Multilingual and cross-domain temporal tagging. Lang. Resour. Eval. 47, 269–298 (2013)

  23. UzZaman, N., Llorens, H., Derczynski, L., Allen, J., Verhagen, M., Pustejovsky, J.: SemEval-2013 task 1: TempEval-3: evaluating time expressions, events, and temporal relations. In: SemEval 2013, vol. 2, pp. 1–9 (2013)

  24. Verhagen, M., Gaizauskas, R., Schilder, F., Hepple, M., Katz, G., Pustejovsky, J.: SemEval-2007 task 15: TempEval temporal relation identification. In: SemEval 2007, pp. 75–80 (2007)

  25. Verhagen, M., et al.: Automating temporal annotation with TARSQI. In: ACLDEMO 2005, pp. 81–84 (2005)

  26. Zaraket, F.A., Makhlouta, J.: Arabic temporal entity extraction using morphological analysis. Int. J. Comput. Linguist. Appl. 3, 121–136 (2012)

  27. Zrigui, M., Ayadi, R., Mars, M., Maraoui, M.: Arabic text classification framework based on latent dirichlet allocation. CIT 20(2), 125–140 (2012)

Author information

Corresponding authors

Correspondence to Nafaa Haffar, Emna Hkiri or Mounir Zrigui.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Haffar, N., Hkiri, E., Zrigui, M. (2020). Enrichment of Arabic TimeML Corpus. In: Nguyen, N.T., Hoang, B.H., Huynh, C.P., Hwang, D., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2020. Lecture Notes in Computer Science, vol 12496. Springer, Cham. https://doi.org/10.1007/978-3-030-63007-2_51

  • DOI: https://doi.org/10.1007/978-3-030-63007-2_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63006-5

  • Online ISBN: 978-3-030-63007-2

  • eBook Packages: Computer Science, Computer Science (R0)
