DOI: 10.5555/3298023.3298052
Article

Neural machine translation advised by statistical machine translation

Published: 04 February 2017

Abstract

Neural Machine Translation (NMT) is a new approach to machine translation that has made great progress in recent years. However, recent studies show that NMT generally produces fluent but inadequate translations (Tu et al. 2016b; 2016a; He et al. 2016; Tu et al. 2017). This is in contrast to conventional Statistical Machine Translation (SMT), which usually yields adequate but non-fluent translations. It is therefore natural to leverage the advantages of both models for better translations, and in this work we propose to incorporate an SMT model into the NMT framework. More specifically, at each decoding step, SMT offers additional recommendations of generated words based on the decoding information from NMT (e.g., the generated partial translation and attention history). We then employ an auxiliary classifier to score the SMT recommendations and a gating function to combine them with NMT generations, both of which are jointly trained within the NMT architecture in an end-to-end manner. Experimental results on Chinese-English translation show that the proposed approach achieves significant and consistent improvements over state-of-the-art NMT and SMT systems on multiple NIST test sets.
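The per-step combination described in the abstract — an auxiliary classifier scoring SMT recommendations and a gate blending them with the NMT distribution — can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names, the softmax normalization of the classifier scores, and the scalar sigmoid gate are all assumptions made for the sketch.

```python
import math

def softmax(scores):
    # Normalize raw scores into a probability distribution (max-shifted for stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def combine(p_nmt, smt_scores, gate_logit):
    """Blend the NMT word distribution with scored SMT recommendations.

    p_nmt      -- NMT's softmax distribution over the vocabulary
    smt_scores -- auxiliary-classifier scores for SMT-recommended words
                  (one raw score per vocabulary word)
    gate_logit -- scalar produced by the (jointly trained) gating function
                  from the current decoding information
    """
    p_smt = softmax(smt_scores)
    g = sigmoid(gate_logit)  # how much to trust SMT at this step
    return [(1.0 - g) * pn + g * ps for pn, ps in zip(p_nmt, p_smt)]

# Toy vocabulary of 4 words: NMT spreads its mass, while the SMT
# recommendations strongly favor word 2; a neutral gate (g = 0.5)
# shifts probability toward the SMT-recommended word.
p_nmt = [0.4, 0.3, 0.2, 0.1]
p_out = combine(p_nmt, [0.0, 0.0, 3.0, 0.0], gate_logit=0.0)
```

In the paper the gate is learned end-to-end with the rest of the network, so it can open wider exactly where NMT tends to be inadequate and close where NMT is already confident.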

References

[1]
Arthur, P.; Neubig, G.; and Nakamura, S. 2016. Incorporating discrete translation lexicons into neural machine translation. In Proceedings of the 2016 Conference on EMNLP.
[2]
Bahdanau, D.; Cho, K.; and Bengio, Y. 2015. Neural machine translation by jointly learning to align and translate. In ICLR.
[3]
Brown, P. F.; Pietra, V. J. D.; Pietra, S. A. D.; and Mercer, R. L. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational linguistics.
[4]
Chiang, D. 2005. A hierarchical phrase-based model for statistical machine translation. In Proceedings of the 43rd ACL.
[5]
Chitnis, R., and DeNero, J. 2015. Variable-length word encodings for neural translation models. In Proceedings of the 2015 Conference on EMNLP.
[6]
Cho, K.; van Merriënboer, B.; Bahdanau, D.; and Bengio, Y. 2014a. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint.
[7]
Cho, K.; van Merrienboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; and Bengio, Y. 2014b. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on EMNLP.
[8]
Chung, J.; Cho, K.; and Bengio, Y. 2016. A character-level decoder without explicit segmentation for neural machine translation. In Proceedings of the 54th ACL.
[9]
Cohn, T.; Hoang, C. D. V.; Vymolova, E.; Yao, K.; Dyer, C.; and Haffari, G. 2016. Incorporating structural alignment biases into an attentional neural translation model. In Proceedings of the 2016 NAACL.
[10]
Costa-jussà, M. R., and Fonollosa, J. A. R. 2016. Character-based neural machine translation. In Proceedings of the 54th ACL.
[11]
Feng, S.; Liu, S.; Li, M.; and Zhou, M. 2016. Implicit distortion and fertility models for attention-based encoder-decoder nmt model. arXiv preprint.
[12]
Gu, J.; Lu, Z.; Li, H.; and Li, V. O. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of the 54th ACL.
[13]
He, W.; He, Z.; Wu, H.; and Wang, H. 2016. Improved neural machine translation with smt features. In Proceedings of the 30th AAAI Conference on Artificial Intelligence.
[14]
Heafield, K. 2011. KenLM: faster and smaller language model queries. In Proceedings of the EMNLP 2011 Sixth Workshop on Statistical Machine Translation.
[15]
Hochreiter, S., and Schmidhuber, J. 1997. Long short-term memory. Neural computation 9(8):1735-1780.
[16]
Jean, S.; Cho, K.; Memisevic, R.; and Bengio, Y. 2015. On using very large target vocabulary for neural machine translation. In Proceedings of the 53rd ACL and the 7th IJCNLP.
[17]
Kalchbrenner, N., and Blunsom, P. 2013. Recurrent continuous translation models. In Proceedings of the 2013 Conference on EMNLP.
[18]
Koehn, P.; Och, F. J.; and Marcu, D. 2003. Statistical phrase-based translation. In Proceedings of the 2003 NAACL.
[19]
Ling, W.; Trancoso, I.; Dyer, C.; and Black, A. W. 2015. Character-based neural machine translation. arXiv preprint.
[20]
Luong, M.-T., and Manning, C. D. 2016. Achieving open vocabulary neural machine translation with hybrid word-character models. In Proceedings of the 54th ACL.
[21]
Luong, T.; Sutskever, I.; Le, Q.; Vinyals, O.; and Zaremba, W. 2015. Addressing the rare word problem in neural machine translation. In Proceedings of the 53rd ACL and the 7th IJCNLP.
[22]
Meng, F.; Lu, Z.; Tu, Z.; Li, H.; and Liu, Q. 2016. A deep memory-based architecture for sequence-to-sequence learning. In Proceedings of ICLR-Workshop 2016.
[23]
Och, F. J., and Ney, H. 2002. Discriminative training and maximum entropy models for statistical machine translation. In Proceedings of the 40th ACL.
[24]
Och, F. J. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st ACL.
[25]
Schuster, M., and Paliwal, K. K. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45(11):2673-2681.
[26]
Sennrich, R.; Haddow, B.; and Birch, A. 2016. Improving neural machine translation models with monolingual data. In Proceedings of the 54th ACL.
[27]
Stahlberg, F.; Hasler, E.; Waite, A.; and Byrne, B. 2016. Syntactically guided neural machine translation. In Proceedings of the 54th ACL (Volume 2: Short Papers).
[28]
Sutskever, I.; Vinyals, O.; and Le, Q. V. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems 27.
[29]
Tang, Y.; Meng, F.; Lu, Z.; Li, H.; and Yu, P. L. 2016. Neural machine translation with external phrase memory. arXiv preprint arXiv:1606.01792.
[30]
Tu, Z.; Liu, Y.; Lu, Z.; Liu, X.; and Li, H. 2016a. Context gates for neural machine translation. arXiv preprint arXiv:1608.06043.
[31]
Tu, Z.; Lu, Z.; Liu, Y.; Liu, X.; and Li, H. 2016b. Modeling coverage for neural machine translation. In Proceedings of the 54th ACL.
[32]
Tu, Z.; Liu, Y.; Shang, L.; Liu, X.; and Li, H. 2017. Neural machine translation with reconstruction. In Proceedings of the 31st AAAI Conference on Artificial Intelligence.
[33]
Wuebker, J.; Green, S.; DeNero, J.; Hasan, S.; and Luong, M.-T. 2016. Models and inference for prefix-constrained machine translation. In Proceedings of the 54th ACL.
[34]
Zeiler, M. D. 2012. Adadelta: an adaptive learning rate method. arXiv preprint arXiv:1212.5701.

Cited By

  • (2019) Explicitly Modeling Word Translations in Neural Machine Translation. ACM Transactions on Asian and Low-Resource Language Information Processing 19(1):1-17. 10.1145/3342353. Online publication date: 23-Jul-2019.

Published In

AAAI'17: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence
February 2017, 5106 pages

    Sponsors

• Association for the Advancement of Artificial Intelligence
• Amazon
• Infosys
• Facebook
• IBM

    Publisher

    AAAI Press
