Document-Level Language Models for Machine Translation

Petrick, Frithjof; Herold, Christian; Petrushkov, Pavel; Khadivi, Shahram; Ney, Hermann

arXiv:2310.12303 (cs)

[Submitted on 18 Oct 2023]

Abstract: Despite the known limitations, most machine translation systems today still operate at the sentence level. One reason for this is that most parallel training data is aligned only at the sentence level, with no document-level meta information available. In this work, we set out to build context-aware translation systems utilizing document-level monolingual data instead. This can be achieved by combining any existing sentence-level translation model with a document-level language model. We improve existing approaches by leveraging recent advancements in model combination. Additionally, we propose novel weighting techniques that make the system combination more flexible and significantly reduce computational overhead. In a comprehensive evaluation on four diverse translation tasks, we show that our extensions improve document-targeted scores substantially and are also computationally more efficient. However, we also find that in most scenarios, back-translation gives even better results, at the cost of having to re-train the translation system. Finally, we explore language model fusion in light of recent advancements in large language models. Our findings suggest that there might be strong potential in utilizing large language models via model combination.
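
The abstract does not spell out the combination formula, so the following is only a generic illustration of the idea of combining a sentence-level translation model with a document-level language model. It is a minimal sketch of a log-linear (shallow-fusion style) combination of next-token scores, and it is not the authors' exact method or their proposed weighting techniques. The function name `fused_log_probs`, the parameter `lm_weight`, and all probabilities are hypothetical and chosen only for illustration.

```python
import math


def fused_log_probs(tm_log_probs, lm_log_probs, lm_weight=0.3):
    """Log-linearly combine next-token scores and renormalize.

    tm_log_probs: log-probabilities from a sentence-level translation model.
    lm_log_probs: log-probabilities from a document-level language model,
                  which has seen the preceding target-side sentences.
    lm_weight:    hypothetical interpolation weight for the language model.
    """
    combined = {
        tok: tm_log_probs[tok] + lm_weight * lm_log_probs[tok]
        for tok in tm_log_probs
    }
    # Renormalize so the fused scores form a distribution again.
    log_norm = math.log(sum(math.exp(score) for score in combined.values()))
    return {tok: score - log_norm for tok, score in combined.items()}


# Toy next-token distributions over a three-word vocabulary (invented numbers).
# The translation model alone cannot resolve the pronoun; the document-level
# language model, given earlier sentences, is assumed to prefer "she".
tm = {"he": math.log(0.5), "she": math.log(0.4), "it": math.log(0.1)}
doc_lm = {"he": math.log(0.1), "she": math.log(0.8), "it": math.log(0.1)}

fused = fused_log_probs(tm, doc_lm, lm_weight=1.0)
best_without_context = max(tm, key=tm.get)
best_with_context = max(fused, key=fused.get)
print(best_without_context, best_with_context)
```

In this toy setting the fused score shifts the choice toward the token favored by document context. In a real system the weight would be tuned on held-out data, and the combination would be applied at every decoding step of beam search.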

Comments:	accepted at WMT 2023
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2310.12303 [cs.CL]
	(or arXiv:2310.12303v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.12303

Submission history

From: Christian Herold
[v1] Wed, 18 Oct 2023 20:10:07 UTC (73 KB)
