
Efficient Low-Resource Neural Machine Translation with Reread and Feedback Mechanism

Published: 09 January 2020
Abstract

How to use information sufficiently is a key problem in neural machine translation (NMT). In rich-resource settings, this problem is effectively mitigated by leveraging large-scale bilingual sentence pairs; in low-resource NMT, however, the scarcity of bilingual sentence pairs leads to poor translation performance, so taking full advantage of global information in the encoding-decoding process becomes especially important. In this article, we propose a novel reread-feedback NMT architecture (RFNMT) that exploits such global information. Our architecture builds on an improved sequence-to-sequence neural network and consists of a double-deck attention-based encoder-decoder framework in which the information generated by the first-pass encoding and decoding flows to the second-pass encoding, enabling better parameter initialization and fuller use of information. Specifically, we first propose a “reread” mechanism that transfers the outputs of the first-pass encoder to the second-pass encoder, where they are used to initialize it. Second, we propose a “feedback” mechanism that transfers the first-pass decoder’s outputs to the second-pass encoder via an importance weight model and an improved gated recurrent unit (GRU). Experiments on multiple datasets show that our approach achieves significant improvements over state-of-the-art NMT systems, especially in low-resource settings.
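To make that data flow concrete, the following minimal PyTorch sketch wires up the two passes in the way the abstract describes. It is an illustrative reading of the abstract only: the layer sizes, the softmax pooling used to build the feedback vector, and the GRUCell standing in for the paper’s improved GRU are all assumptions, and the first-pass decoder is abbreviated (a real system would generate the draft translation with attention over the first-pass encoder states).

```python
# Illustrative sketch only: module shapes, the softmax "importance" pooling,
# and the GRUCell used to fuse states are assumptions, not the exact RFNMT
# formulation from the paper.
import torch
import torch.nn as nn

class TwoPassSketch(nn.Module):
    def __init__(self, vocab_size, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.enc1 = nn.GRU(dim, dim, batch_first=True, bidirectional=True)  # first-pass encoder
        self.dec1 = nn.GRU(dim, dim, batch_first=True)                      # first-pass decoder (abbreviated)
        self.enc2 = nn.GRU(dim, dim, batch_first=True)                      # second-pass encoder
        self.bridge = nn.Linear(2 * dim, dim)   # "reread": first-pass encoder state -> second-pass init
        self.importance = nn.Linear(dim, 1)     # stand-in importance-weight scorer
        self.fuse = nn.GRUCell(dim, dim)        # stand-in for the paper's improved GRU

    def forward(self, src, draft):
        src_emb = self.embed(src)                    # (B, S, D)
        enc1_out, enc1_h = self.enc1(src_emb)        # first-pass encoding
        dec1_out, _ = self.dec1(self.embed(draft))   # first-pass decoder states over the draft

        # "Feedback": pool first-pass decoder states into one vector per
        # sentence using learned importance weights.
        w = torch.softmax(self.importance(dec1_out), dim=1)   # (B, T, 1)
        feedback = (w * dec1_out).sum(dim=1)                  # (B, D)

        # "Reread": initialize the second-pass encoder from the first-pass
        # encoder's final bidirectional state, fused with the feedback vector.
        h0 = torch.tanh(self.bridge(torch.cat([enc1_h[0], enc1_h[1]], dim=-1)))
        h0 = self.fuse(feedback, h0).unsqueeze(0)             # (1, B, D)
        enc2_out, _ = self.enc2(src_emb, h0)                  # second-pass encoding
        return enc2_out  # would feed the final attention-based decoder
```

For example, with a toy vocabulary of 1,000 tokens, `TwoPassSketch(1000)(torch.randint(0, 1000, (4, 7)), torch.randint(0, 1000, (4, 9)))` returns a (4, 7, 256) tensor of second-pass source representations.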


Cited By

• (2024) NLP-reliant Neural Machine Translation techniques used in smart city applications. Information System and Smart City 3:1, 481. https://doi.org/10.59400/issc.v3i1.481. Online publication date: 2-Apr-2024.
• (2023) Speech-to-speech Low-resource Translation. 2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI), 91-95. https://doi.org/10.1109/IRI58017.2023.00023. Online publication date: Aug-2023.
• (2022) Improving Thai-Lao neural machine translation with similarity lexicon. Journal of Intelligent & Fuzzy Systems 42:4, 4005-4014. https://doi.org/10.3233/JIFS-212236. Online publication date: 4-Mar-2022.
• (2022) Improving Chinese-Vietnamese Neural Machine Translation with Linguistic Differences. ACM Transactions on Asian and Low-Resource Language Information Processing 21:2, 1-12. https://doi.org/10.1145/3477536. Online publication date: 25-Mar-2022.



          Published In

          ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 3
          May 2020
          228 pages
          ISSN: 2375-4699
          EISSN: 2375-4702
          DOI: 10.1145/3378675

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 09 January 2020
          Accepted: 01 September 2019
          Revised: 01 July 2019
          Received: 01 May 2019
          Published in TALLIP Volume 19, Issue 3


          Author Tags

          1. Low-resource
          2. feedback
          3. neural machine translation
          4. reread

          Qualifiers

          • Research-article
          • Research
          • Refereed

          Funding Sources

          • National Key Research and Development Plan Project
          • Yunnan High-Tech Industry Development Project
          • Natural Science Foundation of Yunnan Province
          • National Natural Science Foundation of China
