Multi-granularity Knowledge Sharing in Low-resource Neural Machine Translation

Published: 08 February 2024

Abstract

With the rapid development of deep learning methods, neural machine translation (NMT) has attracted increasing attention in recent years. However, the lack of bilingual resources seriously degrades the performance of low-resource NMT models. To overcome this problem, several studies have transferred knowledge from high-resource language pairs to low-resource language pairs. These methods, however, usually focus on a single granularity of language, and parameter sharing across different granularities in NMT has not been well studied. In this article, we propose to improve parameter sharing in low-resource NMT by introducing multi-granularity knowledge at the word, phrase, and sentence levels; this knowledge can be both monolingual and bilingual. We build a knowledge-sharing model for low-resource NMT based on a multi-task learning framework, in which three auxiliary tasks, namely syntax parsing, cross-lingual named entity recognition, and natural language generation, are selected to support the low-resource NMT task. Experimental results show that the proposed method consistently outperforms six strong baseline systems on several low-resource language pairs.
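The abstract describes parameter sharing through a multi-task learning framework in which auxiliary tasks at different granularities share parameters with the translation model. The sketch below is a minimal illustration of that general idea in PyTorch, not the authors' implementation: a shared Transformer encoder feeds a translation decoder (the main task) together with a token-level tagging head (word granularity, e.g., cross-lingual NER) and a pooled sentence-level head (sentence granularity). All class names, dimensions, and task heads are illustrative assumptions; the phrase-level component and the parsing/generation tasks from the paper are omitted for brevity.

    # Minimal sketch of multi-task parameter sharing for low-resource NMT.
    # Assumptions: a shared Transformer encoder, a translation decoder, and
    # two auxiliary heads; sizes and names are hypothetical, not the paper's.
    import torch
    import torch.nn as nn


    class SharedEncoderMultiTask(nn.Module):
        def __init__(self, vocab_size=8000, d_model=256, nhead=4,
                     num_layers=3, num_tags=9):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            # Shared encoder: its parameters are reused by every task.
            self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
            dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
            # Main task: translation decoder over the shared encoder states
            # (causal target mask omitted for brevity).
            self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
            self.generator = nn.Linear(d_model, vocab_size)
            # Auxiliary heads share the encoder but keep task-specific parameters.
            self.ner_head = nn.Linear(d_model, num_tags)   # word granularity
            self.sent_head = nn.Linear(d_model, d_model)   # sentence granularity

        def forward(self, src_ids, tgt_ids):
            memory = self.encoder(self.embed(src_ids))
            # Translation logits (main task).
            mt_logits = self.generator(self.decoder(self.embed(tgt_ids), memory))
            # Per-token tag logits (auxiliary task, word level).
            ner_logits = self.ner_head(memory)
            # Mean-pooled sentence representation (auxiliary task, sentence level).
            sent_repr = self.sent_head(memory.mean(dim=1))
            return mt_logits, ner_logits, sent_repr


    if __name__ == "__main__":
        model = SharedEncoderMultiTask()
        src = torch.randint(0, 8000, (2, 12))  # toy batch: 2 sentences, 12 tokens
        tgt = torch.randint(0, 8000, (2, 10))
        mt_logits, ner_logits, sent_repr = model(src, tgt)
        print(mt_logits.shape, ner_logits.shape, sent_repr.shape)

In a setup like this, each task contributes its own loss (for example, cross-entropy for translation and tagging), and the weighted sum is backpropagated through the shared encoder so that auxiliary supervision regularizes the low-resource translation task.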

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 2
    February 2024
    340 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613556

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 08 February 2024
    Online AM: 09 January 2024
    Accepted: 17 December 2023
    Revised: 18 August 2023
    Received: 18 September 2022
    Published in TALLIP Volume 23, Issue 2

    Author Tags

    1. Neural machine translation
    2. multi-granularity knowledge
    3. multi-task learning
    4. parameter sharing

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Construction of the International Communication Competences
    • Shaanxi Federation of Social Sciences Circles
    • Scientific Research Program
    • Shaanxi Provincial Education Department

