Abstract
Machine-readable versions of everyday dictionaries have been seen as a likely source of information for use in natural language processing because they contain an enormous amount of lexical and semantic knowledge. However, after fifteen years of research, the results appear to be disappointing. No comprehensive evaluation of machine-readable dictionaries (MRDs) as a knowledge source has been made to date, although such an evaluation is necessary to determine what, if anything, can be gained from MRD research. To this end, this paper first considers the postulates upon which MRD research has been based over the past fifteen years, discusses their validity, and evaluates the results of this work. We then propose possible future directions and applications that may exploit these years of effort, in light of current directions not only in NLP research but also in fields such as lexicography and electronic publishing.
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ide, N., Véronis, J. (1995). Knowledge extraction from machine-readable dictionaries: An evaluation. In: Steffens, P. (ed.) Machine Translation and the Lexicon. WMTL 1993. Lecture Notes in Computer Science, vol 898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59040-4_18
DOI: https://doi.org/10.1007/3-540-59040-4_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-59040-8
Online ISBN: 978-3-540-49174-3
eBook Packages: Springer Book Archive