
Adversarial Training for Unknown Word Problems in Neural Machine Translation

Published: 21 August 2019

Abstract

Nearly all work in neural machine translation (NMT) is limited to a restricted vocabulary, crudely treating all other words as a single <unk> symbol. For the translation of morphologically rich languages, unknown (UNK) words also arise from the translation model's misunderstanding of morphological changes. In this study, we explore two ways to alleviate the UNK problem in NMT: a new generative adversarial network (with added value constraints and semantic enhancement) and a preprocessing technique that mixes in morphological noise. The training process is a game among three adversarial sub-models (generator, filter, and discriminator). In this game, the filter directs the discriminator's attention to negative generations that contain noise, improving training efficiency. Ultimately, the discriminator cannot easily distinguish between the negative samples produced by the generator with filter and human translations. The experimental results show that the proposed method significantly improves over several strong baseline models across various language pairs and achieves state-of-the-art results on the newly emerged Mongolian-Chinese task.
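The three-player setup described above can be illustrated with a minimal sketch. This is not the paper's implementation: the toy generator, the noise criterion (presence of <unk> tokens), and the scoring rule are all simplifying assumptions chosen only to show how a filter can route noisy negative generations to the discriminator.

```python
# Illustrative sketch (not the paper's model) of the generator/filter/
# discriminator loop: the generator proposes sequences, the filter keeps
# only noisy generations as hard negatives, and the discriminator scores
# how "human-like" each kept sequence is.
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "<unk>"]

def generator(length=5):
    """Toy generator: emits a random token sequence (stand-in for NMT output)."""
    return [random.choice(VOCAB) for _ in range(length)]

def filter_negatives(samples):
    """Toy filter: keeps only generations containing noise (<unk> tokens),
    emphasizing them as hard negatives for the discriminator."""
    return [s for s in samples if "<unk>" in s]

def discriminator(sample):
    """Toy discriminator: scores a sequence in [0, 1] (higher = more
    human-like); here it simply penalizes <unk> tokens."""
    return 1.0 - sample.count("<unk>") / len(sample)

# One adversarial round: generate a batch, filter it, score the negatives.
batch = [generator() for _ in range(10)]
negatives = filter_negatives(batch)
scores = [discriminator(s) for s in negatives]
```

In a real adversarial training loop, the discriminator would be updated to lower these scores on filtered negatives while the generator is updated (e.g., via policy gradient) to raise them; the filter's role is to keep the discriminator focused on the hardest, noise-bearing generations.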



Published In

ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 19, Issue 1
January 2020
345 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3338846

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 August 2019
Accepted: 01 June 2019
Revised: 01 May 2019
Received: 01 January 2019
Published in TALLIP Volume 19, Issue 1


Author Tags

  1. Neural machine translation
  2. UNK
  3. generative adversarial network
  4. value iteration

Qualifiers

  • Note
  • Research
  • Refereed

Funding Sources

  • Mongolian Language Information Special Support Project of Inner Mongolia
  • Natural Science Foundation of Inner Mongolia


Cited By

  • (2023) "Design and Proofreading of the English-Chinese Computer-Aided Translation System by the Neural Network." Computational Intelligence and Neuroscience. DOI: 10.1155/2023/9450816. Online publication date: 1-Jan-2023.
  • (2023) "Design of Machine Translation Algorithm for English Long Sentences Based on Artificial Neural Network." 2023 International Conference on Internet of Things, Robotics and Distributed Computing (ICIRDC), 183-187. DOI: 10.1109/ICIRDC62824.2023.00039. Online publication date: 29-Dec-2023.
  • (2023) "Design of Computer Intelligent Proofreading Algorithm for English Translation Based on Markov Model." 2023 International Conference on Internet of Things, Robotics and Distributed Computing (ICIRDC), 174-178. DOI: 10.1109/ICIRDC62824.2023.00037. Online publication date: 29-Dec-2023.
  • (2022) "Research on Traditional Mongolian-Chinese Neural Machine Translation Based on Dependency Syntactic Information and Transformer Model." Applied Sciences 12, 19 (10074). DOI: 10.3390/app121910074. Online publication date: 7-Oct-2022.
  • (2022) "Can NMT understand me?" Proceedings of the 1st International Workshop on Natural Language-based Software Engineering, 59-66. DOI: 10.1145/3528588.3528653. Online publication date: 21-May-2022.
