From Treebank Conversion to Automatic Dependency Parsing for Vietnamese

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2014)

Abstract

This paper presents a new conversion method that automatically transforms a constituent-based Vietnamese Treebank into dependency trees. On a dependency Treebank created with this approach, we evaluate two state-of-the-art dependency parsers: the MSTParser and the MaltParser. Experiments show that the MSTParser outperforms the MaltParser. To the best of our knowledge, we report the highest performance published to date for Vietnamese dependency parsing. In particular, with gold-standard POS tags, we obtain an unlabeled attachment score of 79.08% and a labeled attachment score of 71.66%.
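
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how unlabeled and labeled attachment scores are commonly computed. It assumes the usual token-level definitions (UAS counts tokens with the correct head; LAS counts tokens with both the correct head and the correct dependency label). The function name, the `Arc` type, and the toy labels are hypothetical, and the paper's exact evaluation settings (for example, whether punctuation is scored) are not stated here.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# Head index 0 conventionally denotes the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens.

    Assumes gold and pred are aligned token by token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")

    total = len(gold)
    # UAS: the predicted head matches the gold head.
    uas_hits = sum(1 for (g_head, _), (p_head, _) in zip(gold, pred) if g_head == p_head)
    # LAS: both head and label match.
    las_hits = sum(1 for g, p in zip(gold, pred) if g == p)
    return 100.0 * uas_hits / total, 100.0 * las_hits / total


# Toy four-token sentence; the labels are illustrative, not the paper's label set.
gold = [(2, "nsubj"), (0, "root"), (2, "dobj"), (3, "nmod")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj"), (2, "nmod")]

uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}%, LAS = {las:.2f}%")  # UAS = 75.00%, LAS = 50.00%
```

In the toy example, three of four heads are correct but only two tokens also carry the correct label. LAS can never exceed UAS for this reason, which is consistent with the gap between the two figures reported above.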

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Nguyen, D.Q., Nguyen, D.Q., Pham, S.B., Nguyen, PT., Le Nguyen, M. (2014). From Treebank Conversion to Automatic Dependency Parsing for Vietnamese. In: Métais, E., Roche, M., Teisseire, M. (eds) Natural Language Processing and Information Systems. NLDB 2014. Lecture Notes in Computer Science, vol 8455. Springer, Cham. https://doi.org/10.1007/978-3-319-07983-7_26

  • DOI: https://doi.org/10.1007/978-3-319-07983-7_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07982-0

  • Online ISBN: 978-3-319-07983-7

  • eBook Packages: Computer Science (R0)
