A Set of Recommendations for Assessing Human-Machine Parity in Language Translation

Läubli, Samuel; Castilho, Sheila; Neubig, Graham; Sennrich, Rico; Shen, Qinlan; Toral, Antonio

doi:10.1613/jair.1.11371

arXiv:2004.01694 (cs)

[Submitted on 3 Apr 2020]

Title: A Set of Recommendations for Assessing Human-Machine Parity in Language Translation

Authors: Samuel Läubli, Sheila Castilho, Graham Neubig, Rico Sennrich, Qinlan Shen, Antonio Toral

Abstract: The quality of machine translation has increased remarkably over the past years, to the degree that it was found to be indistinguishable from professional human translation in a number of empirical investigations. We reassess Hassan et al.'s 2018 investigation into Chinese-to-English news translation, showing that the finding of human-machine parity was owed to weaknesses in the evaluation design, which is currently considered best practice in the field. We show that the professional human translations contained significantly fewer errors, and that perceived quality in human evaluation depends on the choice of raters, the availability of linguistic context, and the creation of reference translations. Our results call for revisiting current best practices to assess strong machine translation systems in general and human-machine parity in particular, for which we offer a set of recommendations based on our empirical findings.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2004.01694 [cs.CL]
	(or arXiv:2004.01694v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2004.01694
Journal reference:	Journal of Artificial Intelligence Research 67 (2020) 653-672
Related DOI:	https://doi.org/10.1613/jair.1.11371

Submission history

From: Samuel Läubli
[v1] Fri, 3 Apr 2020 17:49:56 UTC (35 KB)
