Bidirectional Deep Learning of Context Representation for Joint Word Segmentation and POS Tagging

  • Conference paper
Advanced Computational Methods for Knowledge Engineering (ICCSAMA 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 629)

Abstract

Word segmentation and POS tagging are crucial steps in natural language processing. Although deep learning makes it possible to learn a joint model without feature engineering, it still suffers from unreliable word embeddings when words are rare or unknown. We introduce two-level backoff models that integrate morphological information and character-level contexts. Experimental results on Thai and Chinese show that our backoff models improve the accuracy of both tasks and excel in OOV recovery.

Author information

Corresponding author

Correspondence to Prachya Boonkwan.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Boonkwan, P., Supnithi, T. (2018). Bidirectional Deep Learning of Context Representation for Joint Word Segmentation and POS Tagging. In: Le, NT., van Do, T., Nguyen, N., Thi, H. (eds) Advanced Computational Methods for Knowledge Engineering. ICCSAMA 2017. Advances in Intelligent Systems and Computing, vol 629. Springer, Cham. https://doi.org/10.1007/978-3-319-61911-8_17

  • DOI: https://doi.org/10.1007/978-3-319-61911-8_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61910-1

  • Online ISBN: 978-3-319-61911-8

  • eBook Packages: Engineering, Engineering (R0)
