Unsupervised Neural Machine Translation for Similar and Distant Language Pairs: An Empirical Study

Published: 31 March 2021

Abstract

Unsupervised neural machine translation (UNMT) has achieved remarkable results for several language pairs, such as French–English and German–English. Most previous studies have focused on modeling UNMT systems; few studies have investigated the effect of UNMT on specific languages. In this article, we first empirically investigate UNMT for four diverse language pairs (French/German/Chinese/Japanese–English). We confirm that the performance of UNMT in translation tasks for similar language pairs (French/German–English) is dramatically better than for distant language pairs (Chinese/Japanese–English). We empirically show that the lack of shared words and different word orderings are the main reasons that lead UNMT to underperform in Chinese/Japanese–English. Based on these findings, we propose several methods, including artificial shared words and pre-ordering, to improve the performance of UNMT for distant language pairs. Moreover, we propose a simple general method to improve translation performance for all these four language pairs. The existing UNMT model can generate a translation of a reasonable quality after a few training epochs owing to a denoising mechanism and shared latent representations. However, learning shared latent representations restricts the performance of translation in both directions, particularly for distant language pairs, while denoising dramatically delays convergence by continuously modifying the training data. To avoid these problems, we propose a simple, yet effective and efficient, approach that (like UNMT) relies solely on monolingual corpora: pseudo-data-based unsupervised neural machine translation. Experimental results for these four language pairs show that our proposed methods significantly outperform UNMT baselines.
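The abstract names two remedies for distant language pairs: pre-ordering (rearranging source words toward the target language's word order) and inducing shared words, used to build pseudo-parallel data from monolingual text. The toy Python sketch below illustrates the general idea only; the lexicon, sentences, and the naive SOV-to-SVO rule are invented for illustration and are not the paper's actual method.

```python
# Toy illustration of two ideas from the abstract:
# (1) pre-ordering: move source words toward target-language word order,
# (2) word-by-word translation with a bilingual lexicon to turn monolingual
#     sentences into pseudo-parallel training pairs.
# The lexicon and sentences are invented toy examples.

TOY_LEXICON = {"watashi": "I", "ringo": "apple", "tabeta": "ate"}

def preorder_sov_to_svo(tokens):
    """Naively move the final verb of an SOV sentence into second position (SVO)."""
    if len(tokens) < 3:
        return tokens
    subj, *objs, verb = tokens
    return [subj, verb, *objs]

def word_by_word(tokens, lexicon):
    """Translate token by token; unknown tokens pass through unchanged,
    a crude stand-in for keeping shared words across languages."""
    return [lexicon.get(t, t) for t in tokens]

def make_pseudo_pair(src_sentence, lexicon):
    """Build a (source, pseudo-target) pair from one monolingual sentence."""
    tokens = src_sentence.split()
    reordered = preorder_sov_to_svo(tokens)
    return src_sentence, " ".join(word_by_word(reordered, lexicon))

src, pseudo_tgt = make_pseudo_pair("watashi ringo tabeta", TOY_LEXICON)
print(pseudo_tgt)  # -> "I ate apple"
```

Pairs produced this way can initialize a standard encoder-decoder, after which training proceeds on the pseudo-parallel data rather than on continually noised monolingual input.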



    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 20, Issue 1
    Special issue on Deep Learning for Low-Resource Natural Language Processing, Part 1 and Regular Papers
    January 2021
    332 pages
    ISSN:2375-4699
    EISSN:2375-4702
    DOI:10.1145/3439335

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 31 March 2021
    Accepted: 01 August 2020
    Revised: 01 June 2020
    Received: 01 October 2019
    Published in TALLIP Volume 20, Issue 1


    Author Tags

    1. Unsupervised neural machine translation
    2. pseudo-data-based unsupervised neural machine translation
    3. similar and distant language pairs

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • JSPS grant-in-aid for early-career scientists (19K20354)
    • Unsupervised Neural Machine Translation in Universal Scenarios
    • National Key Research and Development Program of China
    • NICT tenure-track researcher startup fund “Toward Intelligent Machine Translation.”

    Cited By

    • (2024) Handling syntactic difference in Chinese-Vietnamese neural machine translation. Journal of Intelligent & Fuzzy Systems 46(3), 5533-5544. DOI: 10.3233/JIFS-233762. Online publication date: 5-Mar-2024.
    • (2024) Unsupervised Multimodal Machine Translation for Low-resource Distant Language Pairs. ACM Transactions on Asian and Low-Resource Language Information Processing 23(4), 1-22. DOI: 10.1145/3652161. Online publication date: 9-Mar-2024.
    • (2024) Integration of Machine Translation Systems and Language Models Based on Big Data. 2024 International Conference on Data Science and Network Security (ICDSNS), 1-5. DOI: 10.1109/ICDSNS62112.2024.10691299. Online publication date: 26-Jul-2024.
    • (2023) Investigating Unsupervised Neural Machine Translation for Low-resource Language Pair English-Mizo via Lexically Enhanced Pre-trained Language Models. ACM Transactions on Asian and Low-Resource Language Information Processing 22(8), 1-18. DOI: 10.1145/3609222. Online publication date: 23-Aug-2023.
    • (2023) Rule Based Fuzzy Computing Approach on Self-Supervised Sentiment Polarity Classification with Word Sense Disambiguation in Machine Translation for Hindi Language. ACM Transactions on Asian and Low-Resource Language Information Processing. DOI: 10.1145/3574130. Online publication date: 22-Feb-2023.
    • (2023) GA-SCS: Graph-Augmented Source Code Summarization. ACM Transactions on Asian and Low-Resource Language Information Processing 22(2), 1-19. DOI: 10.1145/3554820. Online publication date: 21-Feb-2023.
    • (2022) Unsupervised English Intelligent Machine Translation in Wireless Network Environment. Security and Communication Networks 2022. DOI: 10.1155/2022/8208242. Online publication date: 1-Jan-2022.
    • (2022) Exploration of the Problems and Solutions Based on the Translation of Computer Software into Japanese Language. Mathematical Problems in Engineering 2022, 1-9. DOI: 10.1155/2022/3712090. Online publication date: 6-Sep-2022.
    • (2022) A New Concept of Electronic Text Based on Semantic Coding System for Machine Translation. ACM Transactions on Asian and Low-Resource Language Information Processing 21(1), 1-16. DOI: 10.1145/3469655. Online publication date: 31-Jan-2022.
    • (2022) Unsupervised Pivot-based Neural Machine Translation for English to Kannada. 2022 IEEE 19th India Council International Conference (INDICON), 1-6. DOI: 10.1109/INDICON56171.2022.10039732. Online publication date: 24-Nov-2022.
