
A Survey on Document-level Neural Machine Translation: Methods and Evaluation

Published: 05 March 2021

Abstract

Machine translation (MT) is an important task in natural language processing (NLP), as it automates the translation process and reduces the reliance on human translators. With the resurgence of neural networks, translation quality has surpassed that of statistical techniques for most language pairs. Until a few years ago, however, almost all neural translation models translated sentences independently, without incorporating the wider document context and the inter-dependencies among sentences. The aim of this survey article is to highlight the major works undertaken in document-level machine translation after the neural revolution, so that researchers can recognize the current state and future directions of this field. We organize the literature according to novelties in modelling and architectures as well as training and decoding strategies. In addition, we cover evaluation strategies introduced to account for the improvements in document MT, including automatic metrics and discourse-targeted test sets. We conclude by presenting possible avenues for future exploration in this research field.
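To make the evaluation strategies mentioned above more concrete, the sketch below illustrates contrastive scoring with a discourse-targeted test set: a system is credited when it assigns a higher score to the correct translation than to a minimally different incorrect one (for example, a wrong pronoun). This is an illustrative sketch only, not code from the survey; score_translation-style scoring, the toy scorer, and the example data are hypothetical stand-ins for a real NMT model's conditional log-probability and a real test suite.

# Illustrative sketch (assumption: not part of the survey itself).
# Contrastive evaluation in the spirit of discourse-targeted test sets:
# count how often a model scores the correct translation above a
# minimally different, discourse-incorrect variant.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ContrastiveExample:
    context: str       # preceding sentence(s) supplying the discourse cue
    source: str        # source sentence to translate
    reference: str     # correct translation
    contrastive: str   # translation with a discourse error (e.g., wrong pronoun)

def contrastive_accuracy(examples: List[ContrastiveExample],
                         score: Callable[[str, str, str], float]) -> float:
    """Fraction of examples where the model prefers the reference translation."""
    if not examples:
        return 0.0
    wins = sum(1 for ex in examples
               if score(ex.context, ex.source, ex.reference)
               > score(ex.context, ex.source, ex.contrastive))
    return wins / len(examples)

if __name__ == "__main__":
    # Toy scorer standing in for a document-level NMT model's log-probability:
    # it simply prefers the German pronoun "sie" when the context mentions "actress".
    def toy_scorer(context: str, source: str, translation: str) -> float:
        wants_sie = "actress" in context.lower()
        has_sie = " sie " in f" {translation.lower()} "
        return 1.0 if wants_sie == has_sie else 0.0

    tests = [ContrastiveExample(
        context="The actress finished the rehearsal.",
        source="She is tired.",
        reference="Sie ist müde.",
        contrastive="Er ist müde.")]
    print(f"Contrastive accuracy: {contrastive_accuracy(tests, toy_scorer):.2f}")

In published test suites of this kind, the scorer is typically the translation model's own conditional probability of each candidate given the source and its document context, so higher contrastive accuracy indicates better handling of the targeted discourse phenomenon.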



Published In

ACM Computing Surveys, Volume 54, Issue 2
March 2022
800 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3450359
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 March 2021
Accepted: 01 December 2020
Revised: 01 November 2020
Received: 01 December 2019
Published in CSUR Volume 54, Issue 2


Author Tag

  1. Context-aware neural machine translation

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Google Faculty Research Award and ARC


Cited By

  • (2025) Gender-Neutral English to Portuguese Machine Translator: Promoting Inclusive Language. Intelligent Systems, DOI: 10.1007/978-3-031-79038-6_13, pp. 180-195. Online publication date: 31-Jan-2025.
  • (2024) Enhancement of English-Bengali Machine Translation Leveraging Back-Translation. Applied Sciences, DOI: 10.3390/app14156848, 14(15):6848. Online publication date: 5-Aug-2024.
  • (2024) Automated information extraction model enhancing traditional Chinese medicine RCT evidence extraction (Evi-BERT): algorithm development and validation. Frontiers in Artificial Intelligence, DOI: 10.3389/frai.2024.1454945, vol. 7. Online publication date: 15-Aug-2024.
  • (2024) Construction of English corpus oral instant translation model based on internet of things and deep learning of information security. Journal of Computational Methods in Sciences and Engineering, DOI: 10.3233/JCM-247183, 24(3):1507-1522. Online publication date: 17-Jun-2024.
  • (2024) Evaluating Terminology Translation in Machine Translation Systems via Metamorphic Testing. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, DOI: 10.1145/3691620.3695069, pp. 758-769. Online publication date: 27-Oct-2024.
  • (2024) MS-CNN Algorithm in Upgrading Intelligent Translation Virtual Simulation Platform. 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), DOI: 10.1109/NMITCON62075.2024.10698843, pp. 1-5. Online publication date: 9-Aug-2024.
  • (2024) Document-Level Machine Translation with Effective Batch-Level Context Representation. 2024 International Joint Conference on Neural Networks (IJCNN), DOI: 10.1109/IJCNN60899.2024.10651489, pp. 1-8. Online publication date: 30-Jun-2024.
  • (2024) Machine Automatic Translation Evaluation Based on Big Data Algorithms. 2024 Second International Conference on Data Science and Information System (ICDSIS), DOI: 10.1109/ICDSIS61070.2024.10594232, pp. 1-5. Online publication date: 17-May-2024.
  • (2024) Mobile Assisted Language Translation Intelligent Recommendation System Based on Natural Language Processing. 2024 International Conference on Distributed Computing and Optimization Techniques (ICDCOT), DOI: 10.1109/ICDCOT61034.2024.10515759, pp. 1-5. Online publication date: 15-Mar-2024.
  • (2024) Construction of Foreign Language Translation Recognition System Model Based on Artificial Intelligence Algorithms. 2024 Third International Conference on Distributed Computing and Electrical Circuits and Electronics (ICDCECE), DOI: 10.1109/ICDCECE60827.2024.10548204, pp. 1-6. Online publication date: 26-Apr-2024.
