
Neural machine translation: past, present, and future

  • Review
  • Published:
Neural Computing and Applications

Abstract

Deep neural networks (DNNs) have achieved great success in several research areas such as information retrieval, image processing, and speech recognition. In the field of machine translation, neural machine translation (NMT) has overtaken statistical machine translation (SMT), which had been the dominant technology for a long time. The recent machine translation approach, which consists of two sub-networks, an encoder and a decoder, has achieved state-of-the-art performance on different benchmarks and for several language pairs. The increasing interest of researchers in NMT is due to its simplicity compared to SMT, which consists of several components tuned separately. This paper describes the evolution of NMT. The different attention mechanism architectures are discussed, along with the purpose of each. The paper also presents some toolkits developed specifically for the research and production of NMT systems. The superiority of NMT over SMT is discussed, as well as the problems facing NMT.
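To make the encoder-decoder description above concrete, the following is a minimal, illustrative sketch of such a model in PyTorch. It is not the paper's implementation: the class names (SimpleEncoder, SimpleDecoder), layer sizes, and the toy usage at the end are assumptions chosen for brevity, and the attention mechanisms surveyed in the paper are omitted.

```python
# Minimal sketch of an encoder-decoder NMT model (assumed PyTorch implementation;
# names and hyperparameters are illustrative, not taken from the paper).
import torch
import torch.nn as nn

class SimpleEncoder(nn.Module):
    """Encodes a source sentence into hidden states."""
    def __init__(self, vocab_size, emb_size=64, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_size)
        self.rnn = nn.GRU(emb_size, hidden_size, batch_first=True)

    def forward(self, src_ids):                      # src_ids: (batch, src_len)
        outputs, last_hidden = self.rnn(self.embed(src_ids))
        return outputs, last_hidden                   # outputs: (batch, src_len, hidden)

class SimpleDecoder(nn.Module):
    """Generates target-vocabulary logits, conditioned on the encoder state."""
    def __init__(self, vocab_size, emb_size=64, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_size)
        self.rnn = nn.GRU(emb_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tgt_ids, hidden):               # tgt_ids: (batch, tgt_len)
        outputs, hidden = self.rnn(self.embed(tgt_ids), hidden)
        return self.out(outputs), hidden               # logits: (batch, tgt_len, vocab)

# Toy usage: encode two source sentences and decode target logits.
enc, dec = SimpleEncoder(vocab_size=100), SimpleDecoder(vocab_size=120)
src = torch.randint(0, 100, (2, 7))                   # source token ids
tgt = torch.randint(0, 120, (2, 5))                   # shifted target token ids
_, enc_hidden = enc(src)
logits, _ = dec(tgt, enc_hidden)                       # shape (2, 5, 120)
```

In practice the decoder would attend over all encoder outputs rather than rely only on the final hidden state; that is the role of the attention architectures the paper reviews.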



Author information


Corresponding author

Correspondence to Shereen A. Mohamed.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Mohamed, S.A., Elsayed, A.A., Hassan, Y.F. et al. Neural machine translation: past, present, and future. Neural Comput & Applic 33, 15919–15931 (2021). https://doi.org/10.1007/s00521-021-06268-0


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00521-021-06268-0

Keywords