Abstract
This paper addresses theoretical problems encountered while annotating semantic roles in the Basque Dependency Treebank (BDT). We present the resources used and describe how the annotation is carried out. Following the model proposed in the PropBank project, we discuss the problems found during the annotation process and the decisions we have taken. The representation of the semantic tag has been established and detailed guidelines for the annotation process have been defined, although this is a task that needs continuous updating. In addition, we have adapted AbarHitz, a tool used in the construction of the BDT, to this task.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aldezabal, I., Aranzabe, M.J., Díaz de Ilarraza, A., Estarrona, A., Uria, L. (2010). EusPropBank: Integrating Semantic Information in the Basque Dependency Treebank. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_6
DOI: https://doi.org/10.1007/978-3-642-12116-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12115-9
Online ISBN: 978-3-642-12116-6
eBook Packages: Computer Science (R0)