Abstract
Knowledge management tasks (ontology development, word sense disambiguation, the semantic web, etc.) must extract knowledge from some source. The main source of knowledge is natural language text, in which humans express how they view and conceptualize the world. However, automatically extracting knowledge from text is not a trivial task. In this paper we present a semantically annotated corpus as a source for knowledge extraction. Semantics is the bridge between linguistic input and knowledge (concepts, the real world). A corpus annotated with semantic information is a useful resource for extracting knowledge from real contexts: it is a semi-structured database that offers deep information about human knowledge, concepts, and the relations between them.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Navarro, B., Martínez-Barco, P., Palomar, M. (2005). Semantic Annotation of a Natural Language Corpus for Knowledge Extraction. In: Montoyo, A., Muñoz, R., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2005. Lecture Notes in Computer Science, vol 3513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11428817_38
DOI: https://doi.org/10.1007/11428817_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26031-8
Online ISBN: 978-3-540-32110-1
eBook Packages: Computer Science, Computer Science (R0)