Abstract
Syntactic parsing in NLP is the task of working out the grammatical structure of sentences. Purely formal approaches to parsing, such as phrase structure grammar and dependency grammar, have been successfully employed for a variety of languages. While phrase-structure-based constituent analysis is possible for fixed word order languages such as English, dependency analysis between grammatical units has been suitable for many free word order languages. These approaches rely on identifying linguistic units based on their formal syntactic properties and establishing the relationships between such units in the form of a tree. Instead, we characterize every morphosyntactic unit as a mapping between form and function, along the lines of Construction Grammar, and treat parsing as the identification of dependency relations between such conceptual units. Our annotation approach yields an average MaltParser labeled attachment score (LAS) of 82.21% on a gold-annotated Tamil corpus of 935 sentences in a five-fold validation experiment.
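As a minimal sketch of the evaluation metric mentioned above, the following Python computes a labeled attachment score: the percentage of tokens whose predicted head and dependency label both match the gold annotation. The function names, the toy sentence, and the relation labels are hypothetical and used for illustration only. Averaging the per-fold scores with equal weight is also an assumption, since the abstract does not say how the five folds were combined.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
Arc = Tuple[int, str]


def labeled_attachment_score(gold: List[List[Arc]], pred: List[List[Arc]]) -> float:
    """Return the percentage of tokens whose head AND label match the gold arcs."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted corpora differ in sentence count")
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("sentence length mismatch between gold and prediction")
        for gold_arc, pred_arc in zip(gold_sent, pred_sent):
            total += 1
            if gold_arc == pred_arc:
                correct += 1
    return 100.0 * correct / total if total else 0.0


def average_over_folds(fold_scores: List[float]) -> float:
    """Equal-weight mean of per-fold scores (an assumption, not the paper's stated method)."""
    if not fold_scores:
        raise ValueError("no fold scores given")
    return sum(fold_scores) / len(fold_scores)


# Toy three-token sentence with hypothetical labels; head 0 denotes the root.
gold_corpus = [[(2, "subj"), (0, "root"), (2, "obj")]]
pred_corpus = [[(2, "subj"), (0, "root"), (2, "subj")]]  # last label is wrong

toy_las = labeled_attachment_score(gold_corpus, pred_corpus)
toy_mean = average_over_folds([80.0, 84.0])
```

In this toy case two of the three tokens carry the correct head and label, so the score is two thirds expressed as a percentage. In a five-fold setup each fold would contribute one such score computed on its held-out sentences.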
Notes
- 2. Indian Language Machine Translation Project funded by DIT, Government of India.
- 3. The gold annotation was carried out by AU-KBC Research Centre, Chennai.
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Muralidaran, V., Misra Sharma, D. (2018). Construction Grammar Based Annotation Framework for Parsing Tamil. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_27
DOI: https://doi.org/10.1007/978-3-319-75477-2_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75476-5
Online ISBN: 978-3-319-75477-2
eBook Packages: Computer Science (R0)