Abstract
This paper describes the steps taken to start producing a comprehensive morphological dictionary of compounds for Serbian. First, the classes of multi-word expressions to be covered by the dictionary were determined. Next, useful sources of compounds were identified. The retrieved compounds were then classified according to their inflectional properties. For each of these classes, a finite-state transducer of a recently developed special type was constructed that produces all the variants and morphological forms of the compounds in that class. Finally, a software module was developed that facilitates the production of the dictionary of compound lemmas with all the necessary information in the required format.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Krstev, C., Vitas, D., Savary, A. (2006). Prerequisites for a Comprehensive Dictionary of Serbian Compounds. In: Salakoski, T., Ginter, F., Pyysalo, S., Pahikkala, T. (eds) Advances in Natural Language Processing. FinTAL 2006. Lecture Notes in Computer Science, vol 4139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11816508_55
DOI: https://doi.org/10.1007/11816508_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37334-6
Online ISBN: 978-3-540-37336-0
eBook Packages: Computer Science (R0)