Abstract
This paper presents a new method for automatically converting a constituent-based Vietnamese treebank into dependency trees. On the dependency treebank produced by this conversion, we evaluate two state-of-the-art dependency parsers: MSTParser and MaltParser. Experiments show that MSTParser outperforms MaltParser. To the best of our knowledge, these are the highest results published to date for Vietnamese dependency parsing. In particular, with gold-standard POS tags, we obtain an unlabeled attachment score of 79.08% and a labeled attachment score of 71.66%.
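For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how unlabeled and labeled attachment scores are commonly computed. It is not the evaluation script used in the paper. The function name `attachment_scores`, the token representation, and the dependency labels in the usage example are hypothetical, and the sketch counts every token, whereas whether punctuation is scored is an evaluation choice the abstract does not state.

```python
from typing import List, Tuple

# Hypothetical representation: each token is (head_index, dependency_label),
# where head_index 0 stands for the artificial root of the sentence.
Token = Tuple[int, str]


def attachment_scores(
    gold: List[List[Token]], pred: List[List[Token]]
) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens.

    UAS counts tokens whose predicted head matches the gold head.
    LAS additionally requires the dependency label to match.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted corpora differ in sentence count")

    total = correct_heads = correct_both = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                correct_heads += 1
                if g_label == p_label:
                    correct_both += 1

    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct_heads / total, 100.0 * correct_both / total


# Usage with made-up labels: all three heads are right, one label is wrong.
gold = [[(2, "sub"), (0, "root"), (2, "dob")]]
pred = [[(2, "sub"), (0, "root"), (2, "vmod")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}%, LAS = {las:.2f}%")
```

Because a labeled match requires a correct head, LAS can never exceed UAS, which is consistent with the 79.08% and 71.66% figures reported above.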
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Nguyen, D.Q., Nguyen, D.Q., Pham, S.B., Nguyen, PT., Le Nguyen, M. (2014). From Treebank Conversion to Automatic Dependency Parsing for Vietnamese. In: Métais, E., Roche, M., Teisseire, M. (eds) Natural Language Processing and Information Systems. NLDB 2014. Lecture Notes in Computer Science, vol 8455. Springer, Cham. https://doi.org/10.1007/978-3-319-07983-7_26
DOI: https://doi.org/10.1007/978-3-319-07983-7_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07982-0
Online ISBN: 978-3-319-07983-7
eBook Packages: Computer Science (R0)