Abstract
In this paper we compare three morphological taggers for Russian with regard to the quality of the morphological disambiguation they perform. We evaluate the analysis on three tasks: lemmatization, POS-tagging, and assignment of full morphological tags. We analyze the mistakes made by the taggers, outline their strengths and weaknesses, and present a possible way to improve the quality of morphological analysis for Russian.
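The three evaluation levels named in the abstract can be illustrated with a minimal sketch. It is not the paper's evaluation code: the `Token` class, the `accuracy_by_level` function, the tag names, and the toy data are all hypothetical, and the sketch assumes that gold and predicted tags have already been mapped to one common tagset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One analysed word form (hypothetical structure, not from the paper)."""
    lemma: str
    pos: str
    feats: frozenset  # grammatical features, e.g. {"Case=Nom", "Number=Sing"}


def accuracy_by_level(gold, predicted):
    """Return accuracy for lemmas, POS tags, and full tags (POS + features).

    Assumes both sequences are aligned token by token and share a tagset.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")

    pairs = list(zip(gold, predicted))
    total = len(pairs)
    lemma_hits = sum(g.lemma.lower() == p.lemma.lower() for g, p in pairs)
    pos_hits = sum(g.pos == p.pos for g, p in pairs)
    # A full tag counts as correct only if POS and every feature match.
    full_hits = sum(g.pos == p.pos and g.feats == p.feats for g, p in pairs)

    return {
        "lemma": lemma_hits / total,
        "pos": pos_hits / total,
        "full_tag": full_hits / total,
    }


# Toy data (invented for illustration): "мама мыла раму".
gold = [
    Token("мама", "NOUN", frozenset({"Case=Nom", "Number=Sing"})),
    Token("мыть", "VERB", frozenset({"Tense=Past", "Gender=Fem"})),
    Token("рама", "NOUN", frozenset({"Case=Acc", "Number=Sing"})),
]
predicted = [
    Token("мама", "NOUN", frozenset({"Case=Nom", "Number=Sing"})),
    # Homonymy error: the verb form is read as a form of the noun "мыло".
    Token("мыло", "NOUN", frozenset({"Case=Gen", "Number=Sing"})),
    # Lemma and POS are right, but the case is wrong.
    Token("рама", "NOUN", frozenset({"Case=Dat", "Number=Sing"})),
]
scores = accuracy_by_level(gold, predicted)
```

Scoring the levels separately shows where a tagger fails: in the toy data the third token is correct at the lemma and POS levels and wrong only at the full-tag level.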
Acknowledgments
I would like to thank Elmira Mustakimova, Svetlana Toldova and Timofey Arkhangelskiy for their participation in the project. I am also grateful to Mikhail Korobov for his valuable remarks on pymorphy performance and explanations of error causes (and for developing pymorphy, of course).
This article is an output of a research project implemented as part of the Basic Research Program at the National Research University Higher School of Economics (HSE).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kuzmenko, E. (2017). Morphological Analysis for Russian: Integration and Comparison of Taggers. In: Ignatov, D., et al. Analysis of Images, Social Networks and Texts. AIST 2016. Communications in Computer and Information Science, vol 661. Springer, Cham. https://doi.org/10.1007/978-3-319-52920-2_16
DOI: https://doi.org/10.1007/978-3-319-52920-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52919-6
Online ISBN: 978-3-319-52920-2
eBook Packages: Computer Science (R0)