Effective Guidance in Zero-Shot Multilingual Translation via Multiple Language Prototypes

  • Conference paper
  • Neural Information Processing (ICONIP 2023)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14452)

Abstract

In a multilingual neural machine translation (MNMT) model that fully shares parameters across all languages, a popular approach is to use an artificial language token to guide translation into the desired target language. However, recent studies have shown that the language-specific signals in prepended language tokens are not adequate to guide MNMT models to translate in the right directions, especially in zero-shot translation (i.e., the off-target translation issue). We argue that the representations of prepended language tokens are overly affected by their context, resulting in potential loss of language information and insufficient indicative ability. To address this issue, we introduce multiple language prototypes to guide translation into the desired target language. Specifically, we categorize sparse contextualized language representations into a few representative prototypes over the training set and inject their representations into each individual token to guide the model. Experiments on several multilingual datasets show that our method significantly alleviates the off-target translation issue and improves translation quality in both zero-shot and supervised directions.
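
For illustration only, the following is a minimal sketch of the general idea described above, assuming a k-means-style clustering of contextualized language-token states collected over the training set and a simple additive injection of the nearest prototype into token embeddings. The function names, tensor shapes, the number of prototypes, the mixing weight alpha, and the scikit-learn/PyTorch dependencies are assumptions for this sketch, not the paper's implementation.

```python
# Illustrative sketch only (not the paper's code): cluster contextualized
# language-token states collected over the training set into a few prototypes
# with k-means, then add the nearest prototype to every token embedding.
# Names, shapes, k, and the mixing weight alpha are assumptions.
import torch
from sklearn.cluster import KMeans


def build_language_prototypes(lang_tok_states: torch.Tensor, k: int = 4) -> torch.Tensor:
    """Cluster [N, d] language-token representations into k prototype vectors."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(lang_tok_states.numpy())
    return torch.tensor(km.cluster_centers_, dtype=lang_tok_states.dtype)  # [k, d]


def inject_prototype(token_embeds: torch.Tensor, lang_tok_state: torch.Tensor,
                     prototypes: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Add the prototype nearest to the current language-token state to each token.

    token_embeds: [T, d] embeddings of one sentence; lang_tok_state: [d].
    """
    dists = torch.cdist(lang_tok_state.unsqueeze(0), prototypes)  # [1, k]
    nearest = prototypes[dists.argmin()]                          # [d]
    return token_embeds + alpha * nearest


if __name__ == "__main__":
    d = 8
    states = torch.randn(100, d)                 # language-token states over the "training set"
    protos = build_language_prototypes(states)   # [4, d]
    sentence = torch.randn(5, d)                 # token embeddings of one sentence
    guided = inject_prototype(sentence, states[0], protos)
    print(guided.shape)                          # torch.Size([5, 8])
```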


Notes

  1. For all datasets, the sacreBLEU signature is BLEU+case.mixed+nrefs.1+smooth.exp+tok.{13a, zh, ja-mecab-0.996}+version.2.3.1; tok.zh and tok.ja-mecab-0.996 are used only for Chinese and Japanese, respectively (see the sketch after these notes).

  2. We employed the langid.py toolkit for language identification (see the sketch after these notes).
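
For note 1, a minimal hedged sketch of computing BLEU with sacreBLEU so that it reports a signature of the form above; the hypothesis and reference strings and the tokenizer choice are placeholders, not the paper's evaluation script.

```python
# Hedged sketch for note 1: corpus BLEU via sacreBLEU, with the signature
# reported alongside the score. Inputs here are placeholders.
from sacrebleu.metrics import BLEU

hyps = ["this is a test"]            # system outputs (placeholder)
refs = [["this is a test"]]          # one reference stream across all sentences

bleu = BLEU(tokenize="13a")          # use tokenize="zh" for Chinese, "ja-mecab" for Japanese
result = bleu.corpus_score(hyps, refs)

print(result.score)                  # corpus BLEU
print(bleu.get_signature())          # e.g. nrefs:1|case:mixed|...|tok:13a|smooth:exp|version:...
```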
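For note 2, a minimal hedged sketch of flagging off-target outputs with the langid toolkit; the sample sentence and the expected language code are placeholders.

```python
# Hedged sketch for note 2: flag an output as off-target when the detected
# language differs from the intended target language. Inputs are placeholders.
import langid

hypothesis = "Ceci est une phrase de test."   # a system output (placeholder)
expected_lang = "de"                          # intended target language (placeholder)

pred_lang, score = langid.classify(hypothesis)
print(pred_lang, pred_lang != expected_lang)  # e.g. "fr" True  -> off-target
```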


Acknowledgement

This work was supported by the Key Support Project of NSFC-Liaoning Joint Foundation (No. U1908216), and the Project of Research and Development for Neural Machine Translation Models between Cantonese and Mandarin (No. WT135-76). We thank all anonymous reviewers for their valuable suggestions on this work.

Author information


Corresponding author

Correspondence to Xiaodong Shi.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zheng, Y., Lin, L., Yuan, Y., Shi, X. (2024). Effective Guidance in Zero-Shot Multilingual Translation via Multiple Language Prototypes. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14452. Springer, Singapore. https://doi.org/10.1007/978-981-99-8076-5_16

  • DOI: https://doi.org/10.1007/978-981-99-8076-5_16

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8075-8

  • Online ISBN: 978-981-99-8076-5

  • eBook Packages: Computer Science, Computer Science (R0)
