Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation
Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, Philipp Koehn
Abstract
To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component’s contribution to, and capacity for, domain adaptation. We find that freezing any single component during continued training has minimal impact on performance, and that performance is surprisingly good when a single component is adapted while holding the rest of the model fixed. We also find that continued training does not move the model very far from the out-of-domain model, compared to a sensitivity analysis metric, suggesting that the out-of-domain model can provide a good generic initialization for the new domain.
- Anthology ID: W18-6313
- Volume: Proceedings of the Third Conference on Machine Translation: Research Papers
- Month: October
- Year: 2018
- Address: Brussels, Belgium
- Editors: Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
- Venue: WMT
- SIG: SIGMT
- Publisher: Association for Computational Linguistics
- Pages: 124–132
- URL: https://aclanthology.org/W18-6313
- DOI: 10.18653/v1/W18-6313
- Cite (ACL): Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, and Philipp Koehn. 2018. Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 124–132, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal): Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation (Thompson et al., WMT 2018)
- PDF: https://aclanthology.org/W18-6313.pdf
- Code: awslabs/sockeye
- Data: OpenSubtitles
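The freezing experiments described in the abstract amount to excluding one component's parameters from the in-domain optimizer during continued training. A minimal sketch of that idea, using a toy PyTorch encoder-decoder (the paper's actual experiments used awslabs/sockeye; the model, sizes, and names below are illustrative assumptions, not the authors' setup):

```python
import torch
import torch.nn as nn

# Toy seq2seq stand-in: the paper's components are the encoder, the decoder,
# and the source/target embedding spaces. Here we freeze just the encoder.
class ToySeq2Seq(nn.Module):
    def __init__(self, vocab=16, dim=8):
        super().__init__()
        self.src_emb = nn.Embedding(vocab, dim)
        self.tgt_emb = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, src, tgt):
        _, h = self.encoder(self.src_emb(src))   # encode source
        dec, _ = self.decoder(self.tgt_emb(tgt), h)  # decode from final state
        return self.out(dec)

def freeze(module):
    """Exclude a component from continued training."""
    for p in module.parameters():
        p.requires_grad_(False)

model = ToySeq2Seq()
freeze(model.encoder)  # adapt everything except the encoder

# Only trainable parameters go to the continued-training optimizer.
optim = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.1
)

# One "in-domain" update on random stand-in data.
src = torch.randint(0, 16, (2, 5))
tgt = torch.randint(0, 16, (2, 5))
loss = nn.functional.cross_entropy(
    model(src, tgt).flatten(0, 1), tgt.flatten()
)
loss.backward()
optim.step()  # encoder weights stay fixed; the rest adapt
```

After the step, the encoder's weights are unchanged while the other components move, which is exactly the per-component ablation the paper runs (once per component, and also the inverse: adapt only one component).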
Export citation (BibTeX)
@inproceedings{thompson-etal-2018-freezing,
    title = "Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation",
    author = "Thompson, Brian and Khayrallah, Huda and Anastasopoulos, Antonios and McCarthy, Arya D. and Duh, Kevin and Marvin, Rebecca and McNamee, Paul and Gwinnup, Jeremy and Anderson, Tim and Koehn, Philipp",
    editor = "Bojar, Ond{\v{r}}ej and Chatterjee, Rajen and Federmann, Christian and Fishel, Mark and Graham, Yvette and Haddow, Barry and Huck, Matthias and Yepes, Antonio Jimeno and Koehn, Philipp and Monz, Christof and Negri, Matteo and N{\'e}v{\'e}ol, Aur{\'e}lie and Neves, Mariana and Post, Matt and Specia, Lucia and Turchi, Marco and Verspoor, Karin",
    booktitle = "Proceedings of the Third Conference on Machine Translation: Research Papers",
    month = oct,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W18-6313",
    doi = "10.18653/v1/W18-6313",
    pages = "124--132",
}
Markdown (Informal)
[Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation](https://aclanthology.org/W18-6313) (Thompson et al., WMT 2018)