Re-translation versus Streaming for Simultaneous Translation

Arivazhagan, Naveen; Cherry, Colin; Macherey, Wolfgang; Foster, George

arXiv:2004.03643v1 (cs)

[Submitted on 7 Apr 2020 (this version), latest version 29 Jun 2020 (v3)]

Abstract: There has been great progress in improving streaming machine translation, a simultaneous paradigm where the system appends to a growing hypothesis as more source content becomes available. We study a related problem in which revisions to the hypothesis beyond strictly appending words are permitted. This is suitable for applications such as live captioning of an audio feed. In this setting, we compare custom streaming approaches to re-translation, a straightforward strategy where each new source token triggers a distinct translation from scratch. We find re-translation to be as good as or better than state-of-the-art streaming systems, even when operating under constraints that allow very few revisions. We attribute much of this success to a previously proposed data-augmentation technique that adds prefix-pairs to the training data, which alongside wait-k inference forms a strong baseline for streaming translation. We also highlight re-translation's ability to wrap arbitrarily powerful MT systems with an experiment showing large improvements from an upgrade to its base model.
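
The ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and every name in it is hypothetical. `translate` stands in for any full-sentence MT system; `erasure` is one simple way to count how much of a displayed hypothesis a revision deletes; `prefix_pairs` assumes that prefix-pairs are built by truncating the target in proportion to the source prefix; and `wait_k_reads` follows the usual wait-k schedule, in which the t-th target token is produced after reading k + t - 1 source tokens.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = List[str]


def retranslate(source: Sequence[str],
                translate: Callable[[Tokens], Tokens]) -> List[Tokens]:
    """Re-translate from scratch each time one more source token arrives.

    `translate` is a hypothetical placeholder for any MT system.
    Returns one full hypothesis per source prefix.
    """
    return [translate(list(source[:i])) for i in range(1, len(source) + 1)]


def erasure(prev: Sequence[str], curr: Sequence[str]) -> int:
    """Number of tokens of `prev` that must be deleted to display `curr`."""
    common = 0
    for p, c in zip(prev, curr):
        if p != c:
            break
        common += 1
    return len(prev) - common


def prefix_pairs(source: Sequence[str],
                 target: Sequence[str]) -> List[Tuple[Tokens, Tokens]]:
    """Build (source prefix, target prefix) training pairs.

    Assumption: the target is truncated proportionally to the source prefix.
    """
    pairs = []
    for i in range(1, len(source) + 1):
        j = (i * len(target)) // len(source)  # proportional target length
        pairs.append((list(source[:i]), list(target[:j])))
    return pairs


def wait_k_reads(k: int, t: int, source_len: int) -> int:
    """Source tokens read before emitting target token t (1-indexed)."""
    return min(k + t - 1, source_len)


if __name__ == "__main__":
    # Toy "MT system" that upper-cases tokens, for demonstration only.
    toy = lambda toks: [tok.upper() for tok in toks]
    hyps = retranslate(["guten", "morgen"], toy)
    revisions = sum(erasure(a, b) for a, b in zip(hyps, hyps[1:]))
    print(hyps, revisions)
```

Because `retranslate` only calls `translate` as a black box, swapping in a stronger base model needs no other change, which is the property the abstract highlights.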

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2004.03643 [cs.CL]
	(or arXiv:2004.03643v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2004.03643

Submission history

From: Colin Cherry
[v1] Tue, 7 Apr 2020 18:27:32 UTC (206 KB)
[v2] Tue, 14 Apr 2020 17:06:21 UTC (206 KB)
[v3] Mon, 29 Jun 2020 23:36:13 UTC (206 KB)
