Abstract
This paper introduces an unsupervised algorithm that collects WordNet senses to explain a word whose meaning is unknown but for which plenty of documents containing the word in that unknown sense are available. Based on the widely accepted idea that the meaning of a word is characterized by its context, a neural network architecture was designed to reconstruct the meaning of the unknown word. The connections of the network were derived from word co-occurrences and word-sense statistics. The method was tested on 80 TOEFL synonym questions, of which 63 were answered correctly. This is comparable to other methods tested on the same questions that used a larger corpus or a richer lexical database. The approach was found to be robust against details of the architecture.
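To make the context idea concrete, the following is a minimal sketch of answering a synonym question from co-occurrence statistics alone. It is not the neural network described in the abstract and it does not use WordNet; the toy corpus, the window size, and the function names `context_vectors`, `cosine`, and `pick_synonym` are all assumptions made for illustration. The last line only restates the reported score of 63 correct answers out of 80 as a fraction.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            ctx = vectors.setdefault(word, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pick_synonym(target, candidates, vectors):
    """Return the candidate whose context vector is closest to the target's."""
    empty = Counter()
    return max(
        candidates,
        key=lambda c: cosine(vectors.get(target, empty), vectors.get(c, empty)),
    )


# Hypothetical toy corpus; a real experiment would need a large document collection.
corpus = [
    "the huge dog chased the cat",
    "the enormous dog chased the cat",
    "a tiny bird sang a song",
    "a small bird sang a song",
]
vectors = context_vectors(corpus)
answer = pick_synonym("huge", ["enormous", "tiny", "small"], vectors)

# Reported result from the abstract: 63 of 80 questions answered correctly.
accuracy = 63 / 80
```

In this toy corpus "huge" and "enormous" share identical contexts, so the sketch selects "enormous"; the reported score corresponds to an accuracy of 78.75%.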
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gábor, B., Gyenes, V., Lőrincz, A. (2005). Corpus-Based Neural Network Method for Explaining Unknown Words by WordNet Senses. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds) Knowledge Discovery in Databases: PKDD 2005. Lecture Notes in Computer Science, vol 3721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564126_47
DOI: https://doi.org/10.1007/11564126_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29244-9
Online ISBN: 978-3-540-31665-7
eBook Packages: Computer Science, Computer Science (R0)