Multi-granularity Knowledge Sharing in Low-resource Neural Machine Translation

Published: 08 February 2024

Abstract

With the rapid development of deep learning methods, neural machine translation (NMT) has attracted increasing attention in recent years. However, the lack of bilingual resources seriously degrades the performance of low-resource NMT models. To overcome this problem, several studies have transferred knowledge from high-resource language pairs to low-resource language pairs. These methods, however, usually focus on a single granularity of language, and parameter sharing across different granularities in NMT has not been well studied. In this article, we propose to improve parameter sharing in low-resource NMT by introducing multi-granularity knowledge at the word, phrase, and sentence levels; this knowledge can be both monolingual and bilingual. We build a knowledge-sharing model for low-resource NMT based on a multi-task learning framework, in which three auxiliary tasks, namely syntax parsing, cross-lingual named entity recognition, and natural language generation, are selected to support the low-resource NMT task. Experimental results show that the proposed method consistently outperforms six strong baseline systems on several low-resource language pairs.
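The abstract describes parameter sharing through a multi-task learning framework in which auxiliary tasks at different granularities share parameters with the translation model. The sketch below is a minimal illustration of that general idea in PyTorch, not the authors' implementation: a shared Transformer encoder feeds a translation decoder (the main task) together with a token-level tagging head (word granularity, e.g., cross-lingual NER) and a pooled sentence-level head (sentence granularity). All class names, dimensions, and task heads are illustrative assumptions; the phrase-level component and the parsing/generation tasks from the paper are omitted for brevity.

    # Minimal sketch of multi-task parameter sharing for low-resource NMT.
    # Assumptions: a shared Transformer encoder, a translation decoder, and
    # two auxiliary heads; sizes and names are hypothetical, not the paper's.
    import torch
    import torch.nn as nn


    class SharedEncoderMultiTask(nn.Module):
        def __init__(self, vocab_size=8000, d_model=256, nhead=4,
                     num_layers=3, num_tags=9):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            # Shared encoder: its parameters are reused by every task.
            self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
            dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
            # Main task: translation decoder over the shared encoder states
            # (causal target mask omitted for brevity).
            self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
            self.generator = nn.Linear(d_model, vocab_size)
            # Auxiliary heads share the encoder but keep task-specific parameters.
            self.ner_head = nn.Linear(d_model, num_tags)   # word granularity
            self.sent_head = nn.Linear(d_model, d_model)   # sentence granularity

        def forward(self, src_ids, tgt_ids):
            memory = self.encoder(self.embed(src_ids))
            # Translation logits (main task).
            mt_logits = self.generator(self.decoder(self.embed(tgt_ids), memory))
            # Per-token tag logits (auxiliary task, word level).
            ner_logits = self.ner_head(memory)
            # Mean-pooled sentence representation (auxiliary task, sentence level).
            sent_repr = self.sent_head(memory.mean(dim=1))
            return mt_logits, ner_logits, sent_repr


    if __name__ == "__main__":
        model = SharedEncoderMultiTask()
        src = torch.randint(0, 8000, (2, 12))  # toy batch: 2 sentences, 12 tokens
        tgt = torch.randint(0, 8000, (2, 10))
        mt_logits, ner_logits, sent_repr = model(src, tgt)
        print(mt_logits.shape, ner_logits.shape, sent_repr.shape)

In a setup like this, each task contributes its own loss (for example, cross-entropy for translation and tagging), and the weighted sum is backpropagated through the shared encoder so that auxiliary supervision regularizes the low-resource translation task.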

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 2
    February 2024
    340 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613556

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 08 February 2024
    Online AM: 09 January 2024
    Accepted: 17 December 2023
    Revised: 18 August 2023
    Received: 18 September 2022
    Published in TALLIP Volume 23, Issue 2

    Author Tags

    1. Neural machine translation
    2. multi-granularity knowledge
    3. multi-task learning
    4. parameter sharing

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Construction of the International Communication Competences
    • Shaanxi Federation of Social Sciences Circles
    • Scientific Research Program
    • Shaanxi Provincial Education Department

