
Adversarial Training for Unknown Word Problems in Neural Machine Translation

Published: 21 August 2019

Abstract

Nearly all work in neural machine translation (NMT) is limited to a restricted vocabulary, crudely treating all other words as a single <unk> symbol. For the translation of morphologically rich languages, unknown (UNK) words also arise from the translation model's misunderstanding of morphological changes. In this study, we explore two ways to alleviate the UNK problem in NMT: a new generative adversarial network (with added value constraints and semantic enhancement) and a preprocessing technique that mixes in morphological noise. The training process is a game among three adversarial sub-models (generator, filter, and discriminator). In this game, the filter directs the discriminator's attention to negative generations that contain noise, improving training efficiency. Ultimately, the discriminator cannot easily distinguish between the negative samples produced by the generator with filter and human translations. The experimental results show that the proposed method significantly improves over several strong baseline models across various language pairs and achieves state-of-the-art results on the newly emerged Mongolian-Chinese task.
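The three-player setup described above can be illustrated with a minimal sketch. This is not the paper's implementation: the toy generator, the noise criterion (presence of <unk> tokens), and the scoring rule are all simplifying assumptions chosen only to show how a filter can route noisy negative generations to the discriminator.

```python
# Illustrative sketch (not the paper's model) of the generator/filter/
# discriminator loop: the generator proposes sequences, the filter keeps
# only noisy generations as hard negatives, and the discriminator scores
# how "human-like" each kept sequence is.
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "<unk>"]

def generator(length=5):
    """Toy generator: emits a random token sequence (stand-in for NMT output)."""
    return [random.choice(VOCAB) for _ in range(length)]

def filter_negatives(samples):
    """Toy filter: keeps only generations containing noise (<unk> tokens),
    emphasizing them as hard negatives for the discriminator."""
    return [s for s in samples if "<unk>" in s]

def discriminator(sample):
    """Toy discriminator: scores a sequence in [0, 1] (higher = more
    human-like); here it simply penalizes <unk> tokens."""
    return 1.0 - sample.count("<unk>") / len(sample)

# One adversarial round: generate a batch, filter it, score the negatives.
batch = [generator() for _ in range(10)]
negatives = filter_negatives(batch)
scores = [discriminator(s) for s in negatives]
```

In a real adversarial training loop, the discriminator would be updated to lower these scores on filtered negatives while the generator is updated (e.g., via policy gradient) to raise them; the filter's role is to keep the discriminator focused on the hardest, noise-bearing generations.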



Published In

ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 19, Issue 1
January 2020
345 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3338846

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 August 2019
Accepted: 01 June 2019
Revised: 01 May 2019
Received: 01 January 2019
Published in TALLIP Volume 19, Issue 1


Author Tags

  1. Neural machine translation
  2. UNK
  3. generative adversarial network
  4. value iteration

Qualifiers

  • Note
  • Research
  • Refereed

Funding Sources

  • Mongolian Language Information Special Support Project of Inner Mongolia
  • Natural Science Foundation of Inner Mongolia


Cited By

  • (2023) "Design and Proofreading of the English-Chinese Computer-Aided Translation System by the Neural Network." Computational Intelligence and Neuroscience. DOI: 10.1155/2023/9450816. Online publication date: 1-Jan-2023.
  • (2023) "Design of Machine Translation Algorithm for English Long Sentences Based on Artificial Neural Network." 2023 International Conference on Internet of Things, Robotics and Distributed Computing (ICIRDC), 183-187. DOI: 10.1109/ICIRDC62824.2023.00039. Online publication date: 29-Dec-2023.
  • (2023) "Design of Computer Intelligent Proofreading Algorithm for English Translation Based on Markov Model." 2023 International Conference on Internet of Things, Robotics and Distributed Computing (ICIRDC), 174-178. DOI: 10.1109/ICIRDC62824.2023.00037. Online publication date: 29-Dec-2023.
  • (2022) "Research on Traditional Mongolian-Chinese Neural Machine Translation Based on Dependency Syntactic Information and Transformer Model." Applied Sciences 12, 19 (10074). DOI: 10.3390/app121910074. Online publication date: 7-Oct-2022.
  • (2022) "Can NMT understand me?" Proceedings of the 1st International Workshop on Natural Language-based Software Engineering, 59-66. DOI: 10.1145/3528588.3528653. Online publication date: 21-May-2022.
