
Neural Machine Translation for Low-Resource Languages from a Chinese-centric Perspective: A Survey

Published: 21 June 2024

Abstract

Machine translation, the automatic transformation of one natural language (the source language) into another (the target language) by computational means, occupies a central role in computational linguistics and is a cornerstone of research in Natural Language Processing (NLP). In recent years, Neural Machine Translation (NMT) has risen to prominence, offering an advanced framework for machine translation research. It is noted for its superior translation performance, especially in tackling the challenges posed by low-resource language pairs, for which parallel corpora are scarce. This article offers an exhaustive exploration of the historical trajectory and advancements of NMT, accompanied by an analysis of its foundational concepts. It then delineates the characteristics that distinguish low-resource languages and concisely reviews the translation models and applications relevant to them. Moreover, the article examines machine translation techniques tailored to Chinese-centric low-resource languages. Finally, it anticipates future research directions in low-resource language translation.

Xiang Zou, Junguo Zhu, Shengxiang Gao, Zhengtao Yu, and Fuan Yang. 2022. Translation quality estimation of Chinese-Vietnamese neural machine translation incorporating linguistic differentiation features. Journal of Chinese Computer Systems 43, 7 (2022), 1413–1418. https://doi.org/10.20009/j.cnki.21-1106/TP.2020-1084
[187]
Kai Song, Kun Wang, Heng Yu, Yue Zhang, Zhongqiang Huang, Weihua Luo, Xiangyu Duan, and Min Zhang. 2020. Alignment-enhanced transformer for constraining NMT with pre-specified translations. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 8886–8893.
[188]
Rico Sennrich and Barry Haddow. 2016. Linguistic input features improve neural machine translation. In Proceedings of the First Conference on Machine Translation, WMT 2016, colocated with ACL 2016, August 11–12, Berlin, Germany. The Association for Computer Linguistics, 83–91. DOI:
[189]
Kehai Chen, Rui Wang, Masao Utiyama, Lemao Liu, Akihiro Tamura, Eiichiro Sumita, and Tiejun Zhao. 2017. Neural machine translation with source dependency representation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2846–2852.
[190]
Akiko Eriguchi, Kazuma Hashimoto, and Yoshimasa Tsuruoka. 2016. Tree-to-sequence attentional neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7–12, 2016, Berlin, Germany, Volume 1: Long Papers. The Association for Computer Linguistics. DOI:
[191]
Huadong Chen, Shujian Huang, David Chiang, and Jiajun Chen. 2017. Improved neural machine translation with a syntax-aware encoder and decoder. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30–August 4, Volume 1: Long Papers. Association for Computational Linguistics, 1936–1945. DOI:
[192]
Jetic Gu, Hassan S. Shavarani, and Anoop Sarkar. 2018. Top-down tree structured decoding with syntactic connections for neural machine translation and parsing. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31–November 4, 2018. Association for Computational Linguistics, 401–413. DOI:
[193]
Thuong-Hai Pham, Dominik Macháček, and Ondřej Bojar. 2019. Promoting the knowledge of source syntax in transformer NMT is not needed. Computación y Sistemas 23, 3 (2019), 923–934.
[194]
Tyler A. Chang and Anna N. Rafferty. 2020. Encodings of source syntax: Similarities in NMT representations across target languages. In Proceedings of the 5th Workshop on Representation Learning for NLP, RepL4NLP@ACL 2020, Online, July 9, 2020, Spandana Gella, Johannes Welbl, Marek Rei, Fabio Petroni, Patrick S. H. Lewis, Emma Strubell, Min Joon Seo, and Hannaneh Hajishirzi (Eds.). Association for Computational Linguistics, 7–16. DOI:
[195]
Meishan Zhang, Zhenghua Li, Guohong Fu, and Min Zhang. 2019. Syntax-enhanced neural machine translation with syntax-aware word representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2–7, 2019, Volume 1 (Long and Short Papers). Association for Computational Linguistics, 1151–1161. DOI:
[196]
Yau-Shian Wang, Hung-yi Lee, and Yun-Nung Chen. 2019. Tree transformer: Integrating tree structures into self-attention. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3–7, 2019. Association for Computational Linguistics, 1061–1070. DOI:
[197]
Christos Baziotis, Barry Haddow, and Alexandra Birch. 2020. Language model prior for low-resource neural machine translation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16–20, 2020, Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational Linguistics, 7622–7634. DOI:
[198]
Diego Moussallem, Mihael Arčan, Axel-Cyrille Ngonga Ngomo, and Paul Buitelaar. 2019. Augmenting neural machine translation with knowledge graphs. arXiv preprint arXiv:1902.08816 (2019).
[199]
Chen Zhang, Mingxu Tao, Quzhe Huang, Jiuheng Lin, Zhibin Chen, and Yansong Feng. 2023. MC⌃2: A multilingual corpus of minority languages in China. CoRR abs/2311.08348 (2023). DOI:arXiv:2311.08348
[200]
Wuyun He, Zhi Xiu, Jingjing Bao, Meilan Chen, and Siriguleng Wang. 2022. Mongolian-Chinese neural machine translation system based on word segmentation with BERT data enhancement. Journal of Xiamen University (Natural Science) 61, 667-674 (2022).
[201]
Jiajun Zhang and Chengqing Zong. 2016. Exploiting source-side monolingual data in neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 1535–1545.
[202]
Tiangang Bai. 2020. Mongolian-Chinese Neural Machine Translation Based on Reinforcement Learning. Master’s Thesis. Inner Mongolia University.
[203]
Mir Adili Jiang Maimaiti. 2021. Research on Neural Machine Translation Methods under Low-Resource Conditions. Ph.D. Dissertation. Tsinghua University.
[204]
Haibo Wang. 2018. Research on multi-granularity Mongolian-Chinese neural network machine translation based on multi-granularity neural network model. Journal of Inner Mongolia University (Natural Science Edition) 49, 5 (2018), 590–597.
[205]
Wuyun He and Siriguleng Wang. 2022. Application of neural network word slicing method in Mongolian-Chinese machine translation. Journal of Minzu University of China (Natural Sciences Edition) 31, 36-46 (2022).
[206]
Yila Su, Fen Gao, Xianghua Niu, and Qingdaoerji Ren. 2021. Pre-training cross Mongolian-Chinese language model based on self-attention mechanism. Computer Applications and Software 38, 165-170 (2021).
[207]
Yufei Wang, Yila Su, Yaping Zhao, Xiaoqian Sun, and Qingdaoerji Ren. 2020. Mongolian-Chinese neural machine translation model based on parameter transfer. Computer Applications and Software 37, 81-87 (2020).
[208]
Pengcong Wang, Hongxu Hou, Shuo Sun, Nier Wu, Weichen Jian, Zongheng Yang, and Yisong Wang. 2022. Hot-start transfer learning combined with approximate distillation for Mongolian-Chinese neural machine translation. In Machine Translation: 18th China Conference, CCMT 2022, Lhasa, China, August 6–10, 2022, Revised Selected Papers. Springer, 12–23.
[209]
Xiu Zhi and Siriguleng Wang. 2021. Research on the application of BERT in Mongolian-Chinese neural machine translation. In 2021 13th International Conference on Machine Learning and Computing (ICMLC’21). Association for Computing Machinery, New York, NY, USA, 404–409. DOI:
[210]
Hongxu Hou, Shuo Sun, and Nier Wu. 2022. Survey of Mongolian-Chinese neural machine translation. Computer Science 49, 31-40 (2022).
[211]
Shuao Guo, Ningyuan Deng, and Yanqing He. 2023. ISTIC’s neural machine translation systems for CCMT’ 2023. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 94–102.
[212]
Anwar Aysa, Mijit Ablimit, Hankiz Yilahun, and Askar Hamdulla. 2022. Chinese-Uyghur bilingual lexicon extraction based on weak supervision. Information 13, 4 (2022), 175.
[213]
Wenbo Zhang, Xiao Li, Yating Yang, and Rui Dong. 2021. Pre-training on mixed data for low-resource neural machine translation. Information 12, 3 (2021), 133.
[214]
Jiatong Liu, Zhonghao Wang, Yanchang Cui, Minghong Fan, and Beizhan Wang. 2020. Optimization of dimension-han machine translation model based on neural network. In Journal of Physics: Conference Series, Vol. 1646. IOP Publishing, 012143.
[215]
Wumaier Hasan, Ruzmamat Sirajahmat, Hairela Xireaili, Wenqi Liu, Yibulayin Tuergen, Liejun Wang, and Abulizi Wayit. 2021. Bi-directional Uyghur-Chinese neural machine translation with marked syllables. Computer Engineering and Applications 57, 161-168 (2021).
[216]
Halike Ayiguli, Abiderexiti Kahaerjiang, Wumaier Aishan, and Yibulayin Tuergen. 2019. Neural machine translation of Uyghur-Chinese quantifier. Computer Engineering and Design 40, 2649-2653 (2019). DOI:
[217]
Gulinigeer Abudouwaili, Kahaerjiang Abiderexiti, Jiamila Wushouer, Yunfei Shen, Turenisha Maimaitimin, and Tuergen Yibulayin. 2021. Morphological analysis corpus construction of Uyghur. In Chinese Computational Linguistics: 20th China National Conference, CCL 2021, Hohhot, China, August 13–15, 2021, Proceedings. Springer, 280–293.
[218]
Shunle Zhu. 2019. Optimized Chinese-Uyghur neural machine translation model based on multi-features. Computer Engineering and Design 40, 1484-1488 (2019). DOI:
[219]
Zhiwang Xu, Huibin Qin, and Yongzhu Hua. 2021. Research on Uyghur-Chinese neural machine translation based on the transformer at multistrategy segmentation granularity. Mobile Information Systems 2021 (2021), 1–7.
[220]
Xuewen Shi, Heyan Huang, Ping Jian, and Yi-Kun Tang. 2021. Improving neural machine translation with sentence alignment learning. Neurocomputing 420 (2021), 15–26.
[221]
Xuewen Shi, Ping Jian, Yikun Tang, and Heyan Huang. 2023. Improving word-level diversity in neural machine translation by controlling the effects of word frequency. In Proceedings of the 22nd Chinese National Conference on Computational Linguistics, Maosong Sun, Bing Qin, Xipeng Qiu, Jing Jiang, and Xianpei Han (Eds.). Chinese Information Processing Society of China, Harbin, China, 64–77. https://aclanthology.org/2023.ccl-1.6
[222]
Muyun Yang, Xixin Hu, Hao Xiong, Jiayi Wang, Yiliyaer Jiaermuhamaiti, Zhongjun He, Weihua Luo, and Shujian Huang. 2019. CCMT 2019 machine translation evaluation report. In Machine Translation, Shujian Huang and Kevin Knight (Eds.). Springer Singapore, Singapore, 105–128.
[223]
Shumin Shi, Xing Wu, Rihai Su, and Heyan Huang. 2022. Low-resource neural machine translation: Methods and trends. ACM Trans. Asian Low Resour. Lang. Inf. Process. 21, 5 (2022), 103:1–103:22. DOI:
[224]
Hao Wang, Yongbin Yu, Nyima Tashi, Rinchen Dongrub, Ekong Favour, Mengwei Ai, Kalzang Gyatso, Yong Cuo, and Qun Nuo. 2022. Life is short, train it less: Neural machine Tibetan-Chinese translation based on mRASP and dataset enhancement. In Machine Translation: 18th China Conference, CCMT 2022, Lhasa, China, August 6–10, 2022, Revised Selected Papers. Springer, 54–59.
[225]
Jiacuo Cizhen, Duanzhu Sangjie, Maosong Sun, Maoxian Zhou, and Chajia Se. 2020. Research on Tibetan-Chinese machine translation method with iterative back translation strategy. Journal of Chinese Information Processing 34, 67-73 (2020).
[226]
Dan Yang, Yidong Syn, and Cuo Yong. 2022. Research on Tibetan-Chinese neural machine translation based on data enhancement. Computer & Digital Engineering 50, 2473-2477 (2022).
[227]
Ding Liu, Yachao Li, Dengyun Zhu, Xuan Liu, Ning Ma, and Ao Zhu. 2020. Investigating back-translation in Tibetan-Chinese neural machine translation. In Journal of Physics: Conference Series, Vol. 1651. IOP Publishing, 012122.
[228]
Yidong Sun, Cuo Yong, and Dan Yang. 2022. Tibetan-Chinese bidirectional machine translation based on VOLT. Computer and Modernization No.321, 28-32 (2022).
[229]
Yachao Li, Deyi Xiong, Min Zang, Jing Jiang, Ning Ma, and Jianmin Yin. 2017. Research on Tibetan-Chinese neural machine translation. Journal of Chinese Information Processing 31, 103-109 (2017).
[230]
Marieke Meelen, Élie Roux, and Nathan Hill. 2021. Optimisation of the largest annotated Tibetan corpus combining rule-based, memory-based, and deep-learning methods. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 20, 1 (2021), 1–11.
[231]
Tao Jiang, Hao Sun, Yu Gang Dai, and Ding Liu. 2020. Tibetan-Chinese neural machine translation combining attention mechanism. In Journal of Physics: Conference Series, Vol. 1607. IOP Publishing, 012001.
[232]
Wen Lai, Xiaobing Zhao, and Wei Bao. 2018. Tibetan-Chinese neural machine translation based on syllable segmentation. In Proceedings of the AMTA 2018 Workshop on Technologies for MT of Low Resource Languages (LoResMT’18). 21–29.
[233]
Maoxian Zhou, Jia Secha, and Rangjia Cai. 2021. Research on Tibetan-Chinese neural machine translation integrating syntactic information. In 2021 3rd International Conference on Advanced Information Science and System (AISS’21). 1–4.
[234]
Sangjie Duanzhu, Cizhen Jiacuo, Rou Te, Sanzhi Jia, and Cairang Jia. 2019. An end-to-end method for data filtering on Tibetan-Chinese parallel corpus via negative sampling. In Chinese Computational Linguistics: 18th China National Conference, CCL 2019, Kunming, China, October 18–20, 2019, Proceedings. Springer, 414–423.
[235]
Kalzang Gyatso, Peizhuo Liu, Yi Jing, Yinqiao Li, Nyima Tashi, Tong Xiao, and Jingbo Zhu. 2023. CCMT2023 Tibetan-Chinese machine translation evaluation technical report. In China Conference on Machine Translation. Springer, 28–36.
[236]
Longtu Zhang and Mamoru Komachi. 2021. Using sub-character level information for neural machine translation of logographic languages. ACM Trans. Asian Low Resour. Lang. Inf. Process. 20, 2 (2021), 31:1–31:15. DOI:
[237]
Longtu Zhang and Mamoru Komachi. 2019. Chinese-Japanese unsupervised neural machine translation using sub-character level information. arXiv preprint arXiv:1903.00149 (2019).
[238]
Jinhua Du and Andy Way. 2017. Pinyin as subword unit for Chinese-sourced neural machine translation. In Proceedings of the 25th Irish Conference on Artificial Intelligence and Cognitive Science, Dublin, Ireland, December 7–8, 2017 (CEUR Workshop Proceedings), John McAuley and Susan McKeever (Eds.), Vol. 2086. CEUR-WS.org, 89–101. https://ceur-ws.org/Vol-2086/AICS2017_paper_14.pdf
[239]
Nikola I. Nikolov, Yuhuang Hu, Mi Xue Tan, and Richard H. R. Hahnloser. 2018. Character-level Chinese-English translation through ASCII encoding. In Proceedings of the Third Conference on Machine Translation: Research Papers, 10–16.
[240]
Jinyi Zhang and Tadahiro Matsumoto. 2019. Character decomposition for Japanese-Chinese character-level neural machine translation. In 2019 International Conference on Asian Language Processing (IALP’19). 35–40. DOI:
[241]
Jack Halpern. 2018. Very large-scale lexical resources to enhance Chinese and Japanese machine translation. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC’18).
[242]
Danielle Saunders, Weston Feely, and Bill Byrne. 2020. Inference-only sub-character decomposition improves translation of unseen logographic characters. In Proceedings of the 7th Workshop on Asian Translation, WAT@AACL/IJCNLP 2020, Suzhou, China, December 4, 2020. Association for Computational Linguistics, 170–177. https://aclanthology.org/2020.wat-1.21/
[243]
Jinyi Zhang and Tadahiro Matsumoto. 2017. Japanese-Chinese machine translation for the Japanese case particle “de”. In 2017 International Conference on Asian Language Processing (IALP’17). 330–333. DOI:
[244]
Jinyi Zhang and Tadahiro Matsumoto. 2017. Improving character-level Japanese-Chinese neural machine translation with radicals as an additional input feature. In 2017 International Conference on Asian Language Processing (IALP’17). 172–175. DOI:
[245]
Toshiaki Nakazawa and Kurohashi Yoshio. 2012. Improvement of Japanese Chinese machine translation system. Japio Year Book (2012), 258–261. https://www.japio.or.jp/00yearbook/files/2012book/12_4_05.pdf
[246]
Zhaohui Bu. 2004. A Study on Japanese-Chinese Machine Translation – Centering on the Rules for TORITATE Expression and Negative Expression. Ph.D. Dissertation. Gifu University.
[247]
ZhiYun Zhao, ChongDe Shi, YanQing He, YingFan Gao, and ChangQing Yao. 2017. Cooperative research on Chinese-Japanese machine translation for S&T documents. Technology Intelligence Engineering 3, 4-9 (2017).
[248]
Yipeng Li. 2015. Japanese technological language labeling techniques in the Chinese-Japanese bilingual parallel corpus. Guide to Business No.282, 175-176 (2015). DOI:
[249]
Chenhui Chu, Toshiaki Nakazawa, and Sadao Kurohashi. 2014. Constructing a Chinese–Japanese parallel corpus from Wikipedia. In Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, Reykjavik, Iceland, May 26–31, 2014, Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk, and Stelios Piperidis (Eds.). European Language Resources Association (ELRA), 642–647. http://www.lrec-conf.org/proceedings/lrec2014/summaries/21.html
[250]
Jinyi Zhang and Tadahiro Matsumoto. 2019. Corpus augmentation for neural machine translation with Chinese-Japanese parallel corpora. Applied Sciences 9, 10 (2019), 2036.
[251]
Boliang Zhang, Ajay Nagesh, and Kevin Knight. 2020. Parallel Corpus filtering via pre-trained language models. (2020), 8545–8554. DOI:
[252]
Zhibo Man, Cunli Mao, Zhengtao Yu, Xunyu Li, Shengxiang Gao, and Junguo Zhu. 2021. Chinese-English-Burmese neural machine translation based on multilingual joint training. Journal of Tsinghua University (Science and Technology) 61, 9 (2021), 927–935.
[253]
Shaoning Zhang. 2019. Research on the Construction Method of Chinese-Myanmar Parallel Corpus Based on Pivot Language. Master’s Thesis. Kunming University of Science and Technology.
[254]
Xunyu Li, Cunli Mao, Zhengtao Yu, Shengxiang Gao, Zhenhan Wang, and Yafei Zhang. 2021. Chinese-Burmese comparable document acquisition based on topic model and bilingual word embedding. Journal of Chinese Information Processing 35, 88-95 (2021).
[255]
Yue Li. 2020. Research on Chinese-Myanmar Neural Machine Translation Method with Single Language Material. Master’s Thesis. Kunming University of Science and Technology.
[256]
Zhiqiang Yu, Yonghua Wen, Minghu Gao, and Man Yang. 2023. Chinese-Myanmar parallel sentence pairs generation method based on semantic difference. Journal of Yunnan University of Nationalities: Natural Sciences Edition 32, 1 (2023), 118–123.
[257]
Chenchen Ding, Hnin Thu Zar Aye, Win Pa Pa, Khin Thandar Nwet, Khin Mar Soe, Masao Utiyama, and Eiichiro Sumita. 2019. Towards Burmese (Myanmar) morphological analysis: Syllable-based tokenization and part-of-speech tagging. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 19, 1 (2019), 1–34.
[258]
Yue Li, Cunli Mao, Zhengtao Yu, Shengxiang Gao, Zhenhan Wang, and Yafei Zhang. 2021. Method of Chinese Burmese bilingual vocabulary extraction based on subject and context features. Journal of Chinese Computer Systems 42, 1 (2021), 91.
[259]
Cunli Mao, Xia Wu, Junguo Zhu, Zhengtao Yu, Yunlong Li, and Zhenhan Wang. 2020. Chinese-Burmese parallel sentence pair extraction based on CNN-CorrNet. Journal of Chinese Information Processing 34, 60-66 (2020).
[260]
Cunli Mao, Shan Lu, Hongbin Wang, Zhengtao Yu, Xia Wu, and Zhenhan Wang. 2021. Semi-supervised Chinese-Burmese Bilingual dictionary construction. Journal of Chinese Information Processing 35, 7 (2021), 47–53.
[261]
Linqin Wang, Zhengtao Yu, Cunli Mao, Chengxiang Gao, Zhibo Man, and Zhenhan Wang. 2021. Semi-supervised Chinese-Myanmar neural machine translation based model-uncertainty. In Proceedings of the 20th Chinese National Conference on Computational Linguistics. 35–45.
[262]
Daxiang Dong, Hua Wu, Wei He, Dianhai Yu, and Haifeng Wang. 2015. Multi-task learning for multiple language translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26–31, 2015, Beijing, China, Volume 1: Long Papers. The Association for Computer Linguistics, 1723–1732. DOI:
[263]
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. (2015). http://arxiv.org/abs/1409.0473
[264]
Hongtao Zhang, Yonghua Wen, and Jian Wang. 2022. Thai-Chinese neural machine translation method based on dependency distance penalty. Communications Technology 55, 990-997 (2022).
[265]
Yihan Zhang. 2021. Research on Thai-Chinese Machine Translation Optimization Method under Low Resource Conditions. Master’s Thesis. Yunnan University.
[266]
Zhijin Li, Hua Lai, Yonghua Wen, and Shengxiang Gao. 2022. Neural machine translation integrating bidirectional-dependency self-attention mechanism. Journal of Computer Applications 42, 12 (2022), 3679.
[267]
Yinhan Feng. 2019. A Study on the Method of Computing Sentence Similarity between Chinese and Thai Languages Based on Word Embedding. Master’s Thesis. Kunming University of Science and Technology.
[268]
Lalita Lowphansirikul, Charin Polpanumas, Attapol T. Rutherford, and Sarana Nutanong. 2022. A large English–Thai parallel corpus from the web and machine-generated text. Language Resources and Evaluation 56, 2 (2022), 477–499.
[269]
Yan Liu and Deyi Xiong. 2022. Construction method of parallel corpus for minority language machine translation. Computer Science 49, 41-46 (2022).
[270]
Zhiqiang Yu, Zhengtao Yu, Yuxin Huang, Junjun Guo, Zhenhan Wang, and Zhibo Man. 2020. Transfer learning for Chinese-Lao neural machine translation with linguistic similarity. In Machine Translation: 16th China Conference, CCMT 2020, Hohhot, China, October 10–12, 2020, Revised Selected Papers 16. Springer, 1–10.
[271]
El Moatez Billah Nagoudi, AbdelRahim Elmadany, and Muhammad Abdul-Mageed. 2022. TURJUMAN: A public toolkit for neural Arabic machine translation. arXiv preprint arXiv:2206.03933 (2022).
[272]
Huan Liu, Junpeng Liu, Kaiyu Huang, and Degen Huang. 2022. Domain adaptation approach for low resource Russian-Chinese machine translation task. Journal of Xiamen University (Natural Science) (2022).
[273]
Can Li, Yating Yang, Yupeng Ma, and Rui Dong. 2021. Neural machine translation corpus expansion method based on language similarity mining. Journal of Computer Applications 41, 11 (2021), 3145.
[274]
Guangfeng Liu, Qinpei Zhu, Xingyu Chen, Renjie Feng, Jianxin Ren, Renshou Wu, Qingliang Miao, Rui Wang, and Kai Yu. 2022. The AISP-SJTU translation system for WMT 2022. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 310–317.
[275]
Jesujoba Alabi, Lydia Nishimwe, Benjamin Muller, Camille Rey, Benoît Sagot, and Rachel Bawden. 2022. Inria-ALMAnaCH at WMT 2022: Does transcription help cross-script machine translation?. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 233–243.
[276]
Artur Nowakowski, Gabriela Palka, Kamil Guttmann, and Mikolaj Pokrywka. 2022. Adam Mickiewicz University at WMT 2022: NER-assisted and quality-aware neural machine translation. In Proceedings of the Seventh Conference on Machine Translation, WMT 2022, Abu Dhabi, United Arab Emirates (Hybrid), December 7–8, 2022. Association for Computational Linguistics, 326–334. https://aclanthology.org/2022.wmt-1.26
[277]
Dimitrios Roussis and Vassilis Papavassiliou. 2022. The ARC-NKUA submission for the English-Ukrainian general machine translation shared task at WMT22. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 358–365.
[278]
Josef Jon, Martin Popel, and Ondřej Bojar. 2022. CUNI-Bergamot submission at WMT22 general translation task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 280–289.
[279]
Martin Popel, Jindřich Libovický, and Jindřich Helcl. 2022. CUNI systems for the WMT 22 Czech-Ukrainian translation task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), 352–357. https://aclanthology.org/2022.wmt-1.30
[280]
Hao Zong and Chao Bei. 2022. GTCOM neural machine translation systems for WMT22. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 428–431.
[281]
Daimeng Wei, Zhiqiang Rao, Zhanglin Wu, Shaojun Li, Yuanchang Luo, Yuhao Xie, Xiaoyu Chen, Hengchao Shang, Zongyao Li, Zhengzhe Yu, Jinlong Yang, Miaomiao Ma, Lizhi Lei, Hao Yang, and Ying Qin. 2022. HW-TSC’s submissions to the WMT 2022 general machine translation shared task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), 403–410. https://aclanthology.org/2022.wmt-1.36
[282]
Changtong Zan, Keqin Peng, Liang Ding, Baopu Qiu, Boan Liu, Shwai He, Qingyu Lu, Zheng Zhang, Chuang Liu, Weifeng Liu, Yibing Zhan, and Dacheng Tao. 2022. Vega-MT: The JD explore academy machine translation system for WMT22. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), 411–422. https://aclanthology.org/2022.wmt-1.37
[283]
Shivam Kalkar, Yoko Matsuzaki, and Ben Li. 2022. KYB general machine translation systems for WMT22. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 290–294.
[284]
Marilena Malli and George Tambouratzis. 2022. Evaluating corpus cleanup methods in the WMT’22 news translation task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 335–341.
[285]
Bing Han, Yangjian Wu, Gang Hu, and Qiulin Chen. 2022. Lan-Bridge MT’s participation in the WMT 2022 general translation shared task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), 268–274. https://aclanthology.org/2022.wmt-1.19
[286]
Hui Zeng. 2022. No domain left behind. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), 423–427. https://aclanthology.org/2022.wmt-1.38
[287]
Matīss Rikters, Marili Tomingas, Tuuli Tuisk, Valts Ernštreits, and Mark Fishel. 2022. Machine translation for Livonian: Catering to 20 speakers. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 508–514.
[288]
Hiroyuki Deguchi, Kenji Imamura, Masahiro Kaneko, Yuto Nishida, Yusuke Sakai, Justin Vasselli, Huy-Hien Vu, and Taro Watanabe. 2022. NAIST-NICT-TIT WMT22 general MT task submission. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 244–250.
[289]
Makoto Morishita, Keito Kudo, Yui Oka, Katsuki Chousa, Shun Kiyono, Sho Takase, and Jun Suzuki. 2022. NT5 at WMT 2022 general translation task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 318–325.
[290]
Weiqiao Shan, Zhiquan Cao, Yuchen Han, Siming Wu, Yimin Hu, Jie Wang, Yi Zhang, Baoyu Hou, Hang Cao, Chenghao Gao, Xiaowen Liu, Tong Xiao, Anxiang Ma, and Jingbo Zhu. 2022. The NiuTrans machine translation systems for WMT22. In Proceedings of the Seventh Conference on Machine Translation (WMT’22), 366–374.
[291]
Alexander Molchanov, Vladislav Kovalenko, and Natalia Makhamalkina. 2022. PROMT systems for WMT22 general translation task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 342–345.
[292]
Adam Dobrowolski, Mateusz Klimaszewski, Adam Myśliwy, Marcin Szymański, Jakub Kowalski, Kornelia Szypuła, Paweł Przewłocki, and Paweł Przybysz. 2022. Samsung R&D Institute Poland participation in WMT 2022. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 251–259.
[293]
Zhiwei He, Xing Wang, Zhaopeng Tu, Shuming Shi, and Rui Wang. 2022. Tencent AI Lab - Shanghai Jiao Tong University low-resource translation system for the WMT22 translation task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), 260–267. https://aclanthology.org/2022.wmt-1.18
[294]
Maali Tars, Taido Purason, and Andre Tättar. 2022. Teaching unseen low-resource languages to large translation models. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 375–380.
[295]
Csaba Oravecz, Katina Bontcheva, David Kolovratník, Bogomil Kovachev, and Christopher Scott. 2022. eTranslation’s submissions to the WMT22 general machine translation task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). 346–351.
[296]
Chang Jin, Tingxun Shi, Zhengshan Xue, and Xiaodong Lin. 2022. Manifold’s English-Chinese system at WMT22 general MT task. In Proceedings of the Seventh Conference on Machine Translation (WMT’22). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), 275–279. https://aclanthology.org/2022.wmt-1.20
[297]
Yangjian Wu and Gang Hu. 2023. Exploring prompt engineering with GPT language models for document-level machine translation: Insights and findings. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 166–169. DOI:
[298]
Zhanglin Wu, Daimeng Wei, Zongyao Li, Zhengzhe Yu, Shaojun Li, Xiaoyu Chen, Hengchao Shang, Jiaxin Guo, Yuhao Xie, Lizhi Lei, Hao Yang, and Yanfei Jiang. 2023. Treating general MT shared task as a multi-domain adaptation problem: HW-TSC’s submission to the WMT23 general MT shared task. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 170–174. DOI:
[299]
Wenbo Zhang. 2023. IOL research machine translation systems for WMT23 general machine translation shared task. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 187–191. DOI:
[300]
Luo Min, Yixin Tan, and Qiulin Chen. 2023. Yishu: Yishu at WMT2023 translation task. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 143–149. DOI:
[301]
Hui Zeng. 2023. Achieving state-of-the-art multilingual translation model with minimal data and parameters. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 181–186. DOI:
[302]
Matiss Rikters and Makoto Miwa. 2023. AIST AIRC submissions to the WMT23 shared task. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 155–161. DOI:
[303]
Josef Jon, Martin Popel, and Ondrej Bojar. 2023. CUNI at WMT23 general translation task: MT and a genetic algorithm. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 119–127. DOI:
[304]
Hao Zong. 2023. GTCOM and DLUT’s neural machine translation systems for WMT23. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 192–197. DOI:
[305]
Ben Li, Yoko Matsuzaki, and Shivam Kalkar. 2023. KYB general machine translation systems for WMT23. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 137–142. DOI:
[306]
Pavel Rychlý and Yuliia Teslia. 2023. MUNI-NLP submission for Czech-Ukrainian translation task at WMT23. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 162–165. DOI:
[307]
Hiroyuki Deguchi, Kenji Imamura, Yuto Nishida, Yusuke Sakai, Justin Vasselli, and Taro Watanabe. 2023. NAIST-NICT WMT’23 general MT task submission. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 110–118. DOI:
[308]
Alexander Molchanov and Vladislav Kovalenko. 2023. PROMT systems for WMT23 shared general translation task. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 150–154. DOI:
[309]
Jan Christian Blaise Cruz. 2023. Samsung R&D Institute Philippines at WMT 2023. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 103–109. DOI:
[310]
Keito Kudo, Takumi Ito, Makoto Morishita, and Jun Suzuki. 2023. SKIM at WMT 2023 general translation task. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 128–136. DOI:
[311]
Di Wu, Shaomu Tan, David Stap, Ali Araabi, and Christof Monz. 2023. UvA-MT’s participation in the WMT 2023 general translation shared task. In Proceedings of the Eighth Conference on Machine Translation, WMT 2023, Singapore, December 6–7, 2023, Philipp Koehn, Barry Haddon, Tom Kocmi, and Christof Monz (Eds.). Association for Computational Linguistics, 175–180. DOI:
[312]
Zeyu Yan, Wenbo Zhang, Qiaobo Deng, Hongbao Mao, Jie Cai, and Zhengyu He. 2023. Transn’s submission for CCMT 2023 quality estimation task. In China Conference on Machine Translation. Springer, 1–12.
[313]
Zhanglin Wu, Zhengzhe Yu, Zongyao Li, Daimeng Wei, Yuhao Xie, Xiaoyu Chen, Hengchao Shang, Jiaxin Guo, Zhiqiang Rao, Shaojun Li, Song peng, Lizhi Lei, Hao Yang, and Yanfei Jiang. 2023. HW-TSC’s neural machine translation system for CCMT 2023. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 13–27.
[314]
Fan Liu, Yahui Zhao, Guozhe Jin, Xinghua Lu, Zhejun Jin, and Rongyi Cui. 2023. Korean-Chinese machine translation method based on independent language features. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 37–49.
[315]
Zhejian Lai, Xiang Geng, Yu Zhang, Jiajun Chen, and Shujian Huang. 2023. NJUNLP’s submission for CCMT 2023 quality estimation task. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 50–56.
[316]
Na Ye and Jiaxin Li. 2023. A k-nearest neighbor approach for domain-specific translation quality estimation. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 69–80.
[317]
Na Ye and Gen Fu. 2023. WSA: A unified framework for word and sentence autocompletion in interactive machine translation. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 81–93.
[318]
Rui Zhang, Jinghao Yuan, Hui Huang, Muyun Yang, and Tiejun Zhao. 2023. HIT-MI &T Lab’s submission to CCMT 2023 automatic post-editing task. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 57–68.
[319]
Zhiyang Zhang, Yaping Zhang, Lu Xiang, Yang Zhao, Yu Zhou, and Chengqing Zong. 2023. A novel dataset and benchmark analysis on document image translation. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 103–115.
[320]
Bokai Guo, Chong Feng, Fang Liu, Xinyan Li, and Xiaomei Wang. 2023. Joint contrastive learning for factual consistency evaluation of cross-lingual abstract summarization. In Machine Translation, Yang Feng and Chong Feng (Eds.). Springer Nature Singapore, Singapore, 116–127.
[321]
Alham Fikri Aji, Nikolay Bogoychev, Kenneth Heafield, and Rico Sennrich. 2020. In neural machine translation, what does transfer learning transfer?. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5–10, 2020, Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel R. Tetreault (Eds.). Association for Computational Linguistics, 7701–7710. DOI:
[322]
Barret Zoph, Deniz Yuret, Jonathan May, and Kevin Knight. 2016. Transfer learning for low-resource neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Jian Su, Kevin Duh, and Xavier Carreras (Eds.). Association for Computational Linguistics, Austin, Texas, USA, 1568–1575. DOI:
[323]
Sergey Edunov, Myle Ott, Michael Auli, and David Grangier. 2018. Understanding back-translation at scale. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, 489–500. DOI:
[324]
Maciej Modrzejewski, Miriam Exel, Bianka Buschbeck, Thanh-Le Ha, and Alex Waibel. 2020. Incorporating external annotation to improve named entity translation in NMT. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation. 45–51.
[325]
Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, and Andre Martins. 2022. Quality-aware decoding for neural machine translation. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Seattle, WA, USA, 1396–1412. DOI:
[326]
M. Amin Farajian, Marco Turchi, Matteo Negri, and Marcello Federico. 2017. Multi-domain neural machine translation through unsupervised adaptation. In Proceedings of the Second Conference on Machine Translation (WMT’17). 127–137.
[327]
Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, and Alexandra Birch. 2018. Marian: Fast neural machine translation in C++. In Proceedings of ACL 2018, System Demonstrations. Association for Computational Linguistics, Melbourne, Australia, 116–121. DOI:
[328]
Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, and Jakob Uszkoreit. 2018. Tensor2Tensor for neural machine translation. In Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track). Association for Machine Translation in the Americas, Boston, MA, USA, 193–199. https://aclanthology.org/W18-1819
[329]
Sha Yuan, Hanyu Zhao, Zhengxiao Du, Ming Ding, Xiao Liu, Yukuo Cen, Xu Zou, Zhilin Yang, and Jie Tang. 2021. WuDaoCorpora: A super large-scale Chinese corpora for pre-training language models. AI Open 2 (2021), 65–68. DOI:
[330]
Qu Cui, Shujian Huang, Jiahuan Li, Xiang Geng, Zaixiang Zheng, Guoping Huang, and Jiajun Chen. 2021. DirectQE: Direct pretraining for machine translation quality estimation. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2–9, 2021. AAAI Press, 12719–12727. DOI:
[331]
Matthew G. Snover, Bonnie J. Dorr, Richard M. Schwartz, Linnea Micciulla, and John Makhoul. 2006. A study of translation edit rate with targeted human annotation. In Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers, AMTA 2006, Cambridge, Massachusetts, USA, August 8–12, 2006. Association for Machine Translation in the Americas, 223–231. https://aclanthology.org/2006.amta-papers.25/
[332]
Yuanhang Zheng, Zhixing Tan, Meng Zhang, Mieradilijiang Maimaiti, Huanbo Luan, Maosong Sun, Qun Liu, and Yang Liu. 2021. Self-supervised quality estimation for machine translation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7–11 November, 2021, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.). Association for Computational Linguistics, 3322–3334. DOI:

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 6
June 2024, 378 pages
EISSN: 2375-4702
DOI: 10.1145/3613597
Editor: Imed Zitouni

Publisher

Association for Computing Machinery, New York, NY, United States

      Publication History

      Published: 21 June 2024
      Online AM: 16 May 2024
      Accepted: 09 May 2024
      Revised: 09 March 2024
      Received: 08 August 2023
      Published in TALLIP Volume 23, Issue 6

      Author Tags

      1. Low-resource languages
      2. neural machine translation
      3. unsupervised learning
      4. transfer learning
      5. multilingual translation
      6. large language models
      7. Chinese-centric languages

      Qualifiers

      • Research-article

      Funding Sources

      • General Young Talents Project for Scientific Research grant of the Educational Department of Liaoning Province
      • Research Support Program for Inviting High-Level Talents grant of Shenyang Ligong University
• China Scholarship Council
