
Efficient Low-Resource Neural Machine Translation with Reread and Feedback Mechanism

Published: 09 January 2020
Abstract

How to use information sufficiently is a key problem in neural machine translation (NMT). In rich-resource settings, this problem is effectively mitigated by leveraging large-scale bilingual sentence pairs; in low-resource NMT, however, the scarcity of bilingual sentence pairs leads to poor translation performance, so taking full advantage of global information in the encoding-decoding process becomes especially important. In this article, we propose a novel reread-feedback NMT architecture (RFNMT) that exploits such global information. Our architecture builds on an improved sequence-to-sequence neural network and consists of a double-deck attention-based encoder-decoder framework in which the information generated by the first-pass encoding and decoding flows to the second-pass encoding, enabling better parameter initialization and fuller use of information. Specifically, we first propose a “reread” mechanism that transfers the outputs of the first-pass encoder to the second-pass encoder, where they are used to initialize it. Second, we propose a “feedback” mechanism that transfers the first-pass decoder’s outputs to the second-pass encoder via an importance weight model and an improved gated recurrent unit (GRU). Experiments on multiple datasets show that our approach achieves significant improvements over state-of-the-art NMT systems, especially in low-resource settings.
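To make that data flow concrete, the following minimal PyTorch sketch wires up the two passes in the way the abstract describes. It is an illustrative reading of the abstract only: the layer sizes, the softmax pooling used to build the feedback vector, and the GRUCell standing in for the paper’s improved GRU are all assumptions, and the first-pass decoder is abbreviated (a real system would generate the draft translation with attention over the first-pass encoder states).

```python
# Illustrative sketch only: module shapes, the softmax "importance" pooling,
# and the GRUCell used to fuse states are assumptions, not the exact RFNMT
# formulation from the paper.
import torch
import torch.nn as nn

class TwoPassSketch(nn.Module):
    def __init__(self, vocab_size, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.enc1 = nn.GRU(dim, dim, batch_first=True, bidirectional=True)  # first-pass encoder
        self.dec1 = nn.GRU(dim, dim, batch_first=True)                      # first-pass decoder (abbreviated)
        self.enc2 = nn.GRU(dim, dim, batch_first=True)                      # second-pass encoder
        self.bridge = nn.Linear(2 * dim, dim)   # "reread": first-pass encoder state -> second-pass init
        self.importance = nn.Linear(dim, 1)     # stand-in importance-weight scorer
        self.fuse = nn.GRUCell(dim, dim)        # stand-in for the paper's improved GRU

    def forward(self, src, draft):
        src_emb = self.embed(src)                    # (B, S, D)
        enc1_out, enc1_h = self.enc1(src_emb)        # first-pass encoding
        dec1_out, _ = self.dec1(self.embed(draft))   # first-pass decoder states over the draft

        # "Feedback": pool first-pass decoder states into one vector per
        # sentence using learned importance weights.
        w = torch.softmax(self.importance(dec1_out), dim=1)   # (B, T, 1)
        feedback = (w * dec1_out).sum(dim=1)                  # (B, D)

        # "Reread": initialize the second-pass encoder from the first-pass
        # encoder's final bidirectional state, fused with the feedback vector.
        h0 = torch.tanh(self.bridge(torch.cat([enc1_h[0], enc1_h[1]], dim=-1)))
        h0 = self.fuse(feedback, h0).unsqueeze(0)             # (1, B, D)
        enc2_out, _ = self.enc2(src_emb, h0)                  # second-pass encoding
        return enc2_out  # would feed the final attention-based decoder
```

For example, with a toy vocabulary of 1,000 tokens, `TwoPassSketch(1000)(torch.randint(0, 1000, (4, 7)), torch.randint(0, 1000, (4, 9)))` returns a (4, 7, 256) tensor of second-pass source representations.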


Cited By

• (2024) NLP-reliant Neural Machine Translation techniques used in smart city applications. Information System and Smart City 3:1, 481. https://doi.org/10.59400/issc.v3i1.481. Online publication date: 2-Apr-2024.
• (2023) Speech-to-speech Low-resource Translation. 2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI), 91-95. https://doi.org/10.1109/IRI58017.2023.00023. Online publication date: Aug-2023.
• (2022) Improving Thai-Lao neural machine translation with similarity lexicon. Journal of Intelligent & Fuzzy Systems 42:4, 4005-4014. https://doi.org/10.3233/JIFS-212236. Online publication date: 4-Mar-2022.
• (2022) Improving Chinese-Vietnamese Neural Machine Translation with Linguistic Differences. ACM Transactions on Asian and Low-Resource Language Information Processing 21:2, 1-12. https://doi.org/10.1145/3477536. Online publication date: 25-Mar-2022.



          Published In

          ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 3
          May 2020
          228 pages
          ISSN: 2375-4699
          EISSN: 2375-4702
          DOI: 10.1145/3378675

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 09 January 2020
          Accepted: 01 September 2019
          Revised: 01 July 2019
          Received: 01 May 2019
          Published in TALLIP Volume 19, Issue 3


          Author Tags

          1. Low-resource
          2. feedback
          3. neural machine translation
          4. reread

          Qualifiers

          • Research-article
          • Research
          • Refereed

          Funding Sources

          • National Key Research and Development Plan Project
          • Yunnan High-Tech Industry Development Project
          • Natural Science Foundation of Yunnan Province
          • National Natural Science Foundation of China
