Abstract
Deep neural networks (DNNs) have achieved great success in several research areas, such as information retrieval, image processing, and speech recognition. In the field of machine translation, neural machine translation (NMT) has overtaken statistical machine translation (SMT), which was the dominant technology for a long time. This recent approach, which consists of two sub-networks called an encoder and a decoder, has achieved state-of-the-art performance on different benchmarks and for several language pairs. The growing interest of researchers in NMT is due to its simplicity compared with SMT, which consists of several components tuned separately. This paper describes the evolution of NMT. The different attention mechanism architectures and the purpose of each are discussed. The paper also presents toolkits developed specifically for research on and production of NMT systems. Finally, the superiority of NMT over SMT is discussed, as well as the problems that NMT still faces.
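
As a concrete illustration of the encoder-decoder architecture summarized above, the listing below sketches a minimal attention-based model in PyTorch. It is not code from any of the surveyed systems; all class names, dimensions, and hyperparameters are illustrative assumptions, and a real system would add sequence-level batching, padding masks, and beam-search decoding.

# A minimal sketch of an attention-based encoder-decoder NMT model.
# All names, sizes, and hyperparameters are illustrative assumptions,
# not the paper's own code.
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Bidirectional GRU encoder: source token ids -> per-token annotations."""

    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)

    def forward(self, src):                      # src: (batch, src_len)
        annotations, _ = self.rnn(self.embed(src))
        return annotations                       # (batch, src_len, 2 * hid_dim)


class Attention(nn.Module):
    """Additive attention: scores every source annotation against the decoder state."""

    def __init__(self, enc_dim, dec_dim, attn_dim=256):
        super().__init__()
        self.proj = nn.Linear(enc_dim + dec_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state, annotations):
        # dec_state: (batch, dec_dim); annotations: (batch, src_len, enc_dim)
        src_len = annotations.size(1)
        expanded = dec_state.unsqueeze(1).expand(-1, src_len, -1)
        energy = torch.tanh(self.proj(torch.cat([expanded, annotations], dim=-1)))
        weights = torch.softmax(self.score(energy).squeeze(-1), dim=-1)
        context = torch.bmm(weights.unsqueeze(1), annotations).squeeze(1)
        return context, weights                  # context: (batch, enc_dim)


class Decoder(nn.Module):
    """GRU decoder: each step consumes the previous target token plus a context vector."""

    def __init__(self, vocab_size, emb_dim=256, enc_dim=1024, dec_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.attention = Attention(enc_dim, dec_dim)
        self.rnn = nn.GRUCell(emb_dim + enc_dim, dec_dim)
        self.out = nn.Linear(dec_dim, vocab_size)

    def forward(self, prev_token, dec_state, annotations):
        context, _ = self.attention(dec_state, annotations)
        rnn_input = torch.cat([self.embed(prev_token), context], dim=-1)
        dec_state = self.rnn(rnn_input, dec_state)
        return self.out(dec_state), dec_state    # logits over the target vocabulary

At each decoding step, the Attention module produces a weighted average of all source annotations (the context vector); this is the mechanism that lets the decoder focus on different source words as it emits each target word, and it is the component the architectural variants surveyed in the paper primarily differ on.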


