Chinese Word Segmentation via BiLSTM+Semi-CRF with Relay Node

Qun, Nuo; Yan, Hang; Qiu, Xi-Peng; Huang, Xuan-Jing

doi:10.1007/s11390-020-9576-4

Regular Paper
Published: 30 September 2020

Volume 35, pages 1115–1126, (2020)

Journal of Computer Science and Technology

Abstract

Semi-Markov conditional random fields (Semi-CRFs) have been successfully applied to many segmentation problems, including Chinese word segmentation (CWS). The advantage of a Semi-CRF lies in its inherent ability to exploit properties of whole segments rather than of individual sequence elements. Despite this theoretical advantage, Semi-CRF is still not the best choice for CWS because its computational complexity is quadratic in the sentence length. In this paper, we propose a simple yet effective framework that helps Semi-CRF achieve performance comparable to CRF-based models at similar computational complexity. Specifically, we first apply a character-level bi-directional long short-term memory (BiLSTM) network to model context information, and then use a simple but effective fusion layer to represent segment information. In addition, to model arbitrarily long segments within linear time complexity, we propose a new model named Semi-CRF-Relay. Because segments are modeled directly, word features are easy to incorporate, and CWS performance can be improved merely by adding publicly available pre-trained word embeddings. Experiments on four popular CWS datasets show the effectiveness of the proposed methods. The source code and pre-trained embeddings for this paper are available at https://github.com/fastnlp/fastNLP/.
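
To make the complexity argument concrete, the following is a minimal sketch of generic semi-Markov Viterbi decoding, not the authors' implementation and not Semi-CRF-Relay, whose relay mechanism the abstract does not specify. Every name here (`semi_markov_viterbi`, `toy_score`, `LEXICON`, `TEXT`) is hypothetical, and the hand-made lexicon score is only a stand-in for the segment scores that a BiLSTM and fusion layer would produce. With `max_len` equal to the sentence length, the decoder scores every one of the n(n+1)/2 candidate segments, which is the quadratic cost mentioned above; capping `max_len` at a constant L reduces this to at most n·L, at the price of never proposing a segment longer than L.

```python
from typing import Callable, List, Tuple

# Hypothetical toy data: a sentence and a tiny lexicon of word scores.
TEXT = "我爱北京天安门"
LEXICON = {"我": 1.0, "爱": 1.0, "北京": 3.0, "天安门": 4.0}


def toy_score(i: int, j: int) -> float:
    """Stand-in segment score for TEXT[i:j]; unknown segments are penalized."""
    return LEXICON.get(TEXT[i:j], -float(j - i))


def semi_markov_viterbi(
    n: int, score: Callable[[int, int], float], max_len: int
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the best total score and segmentation of positions 0..n.

    Each segment is a half-open span (i, j) with j - i <= max_len.
    """
    # best[j] is the best score of any segmentation of the first j characters.
    best = [float("-inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        # Only segments of length at most max_len may end at position j.
        for i in range(max(0, j - max_len), j):
            candidate = best[i] + score(i, j)
            if candidate > best[j]:
                best[j] = candidate
                back[j] = i
    # Follow the back-pointers from the end of the sentence to the start.
    segments: List[Tuple[int, int]] = []
    j = n
    while j > 0:
        i = back[j]
        segments.append((i, j))
        j = i
    segments.reverse()
    return best[n], segments


if __name__ == "__main__":
    total, spans = semi_markov_viterbi(len(TEXT), toy_score, max_len=3)
    print(total, [TEXT[i:j] for i, j in spans])
```

In this sketch the cap is a hard limit: with `max_len=2` the three-character word in the toy lexicon can no longer be recovered. That trade-off between cost and maximum segment length is the limitation the abstract says Semi-CRF-Relay is designed to remove.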

Author information

Nuo Qun and Hang Yan contributed equally to this work.

Authors and Affiliations

School of Computer Science, Fudan University, Shanghai, 200433, China
Nuo Qun, Hang Yan, Xi-Peng Qiu & Xuan-Jing Huang
Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, 200433, China
Nuo Qun, Hang Yan, Xi-Peng Qiu & Xuan-Jing Huang

Corresponding author

Correspondence to Xi-Peng Qiu.

Electronic supplementary material

ESM 1

(PDF 540 kb)

About this article

Cite this article

Qun, N., Yan, H., Qiu, XP. et al. Chinese Word Segmentation via BiLSTM+Semi-CRF with Relay Node. J. Comput. Sci. Technol. 35, 1115–1126 (2020). https://doi.org/10.1007/s11390-020-9576-4

Received: 23 March 2019
Revised: 04 July 2019
Published: 30 September 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s11390-020-9576-4
