Proceedings of the 4th Workshop on Asian Translation (WAT2017)
Toshiaki Nakazawa | Isao Goto

Overview of the 4th Workshop on Asian Translation
Toshiaki Nakazawa | Shohei Higashiyama | Chenchen Ding | Hideya Mino | Isao Goto | Hideto Kazawa | Yusuke Oda | Graham Neubig | Sadao Kurohashi
This paper presents the results of the shared tasks from the 4th Workshop on Asian Translation (WAT2017), including J↔E and J↔C scientific paper translation subtasks, C↔J, K↔J, and E↔J patent translation subtasks, H↔E mixed domain subtasks, J↔E newswire subtasks, and J↔E recipe subtasks. For WAT2017, 12 institutions participated in the shared tasks. About 300 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.

Controlling Target Features in Neural Machine Translation via Prefix Constraints
Shunsuke Takeno | Masaaki Nagata | Kazuhide Yamamoto
We propose prefix constraints, a novel method to enforce constraints on target sentences in neural machine translation. It places a sequence of special tokens at the beginning of the target sentence (target prefix), while side constraints place a special token at the end of the source sentence (source suffix). Prefix constraints can be predicted from the source sentence jointly with the target sentence, while side constraints (Sennrich et al., 2016) must be provided by the user or predicted by some other method. In both methods, special tokens are designed to encode arbitrary target-side features or metatextual information. We show that prefix constraints are more flexible than side constraints and can be used to control the behavior of neural machine translation in terms of output length, bidirectional decoding, domain adaptation, and unaligned target word generation.
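The difference between the two token placements can be illustrated with a minimal sketch. The length-bucket feature, the bucket boundaries, and the token names such as `<len_le_10>` are assumptions made for illustration only and are not taken from the paper.

```python
def length_bucket(n_tokens, boundaries=(10, 20, 30)):
    """Map a target length to a coarse bucket token (boundaries are illustrative)."""
    for b in boundaries:
        if n_tokens <= b:
            return f"<len_le_{b}>"
    return f"<len_gt_{boundaries[-1]}>"


def add_prefix_constraint(target_tokens):
    """Prefix constraint: the special token is placed at the start of the target."""
    return [length_bucket(len(target_tokens))] + target_tokens


def add_side_constraint(source_tokens, target_tokens):
    """Side constraint: the special token is appended to the end of the source."""
    return source_tokens + [length_bucket(len(target_tokens))]


src = "kore wa pen desu".split()
tgt = "this is a pen".split()

prefixed_target = add_prefix_constraint(tgt)       # training target with prefix
suffixed_source = add_side_constraint(src, tgt)    # training source with suffix
```

Because the prefix is part of the target sequence, a decoder trained this way can either generate the token itself or be forced to start from a user-chosen one; the source-suffix variant always needs the token to be supplied with the input.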

Improving Japanese-to-English Neural Machine Translation by Paraphrasing the Target Language
Yuuki Sekizawa | Tomoyuki Kajiwara | Mamoru Komachi
Neural machine translation (NMT) produces sentences that are more fluent than those produced by statistical machine translation (SMT). However, NMT has a very high computational cost because of the high dimensionality of the output layer. Generally, NMT restricts the size of vocabulary, which results in infrequent words being treated as out-of-vocabulary (OOV) and degrades the performance of the translation. In evaluation, we achieved a statistically significant BLEU score improvement of 0.55-0.77 over the baselines including the state-of-the-art method.

Improving Low-Resource Neural Machine Translation with Filtered Pseudo-Parallel Corpus
Aizhan Imankulova | Takayuki Sato | Mamoru Komachi
Large-scale parallel corpora are indispensable to train highly accurate machine translators. However, manually constructed large-scale parallel corpora are not freely available in many language pairs. In previous studies, training data have been expanded using a pseudo-parallel corpus obtained using machine translation of the monolingual corpus in the target language. However, in low-resource language pairs in which only low-accuracy machine translation systems can be used, translation quality is reduced when a pseudo-parallel corpus is used naively. To improve machine translation performance with low-resource language pairs, we propose a method to expand the training data effectively via filtering the pseudo-parallel corpus using a quality estimation based on back-translation. In experiments with three language pairs using small, medium, and large parallel corpora, language pairs with less training data filtered out more sentence pairs and improved BLEU scores more significantly.
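A filtering step of this kind might look like the following sketch. It assumes that quality is estimated by translating each pseudo-source sentence back into the target language and comparing the result with the original target sentence; the unigram-overlap F1 metric, the 0.5 threshold, and the `round_trip` callable are stand-ins chosen for illustration, not the paper's actual settings.

```python
from collections import Counter


def overlap_f1(hyp, ref):
    """Unigram-overlap F1 between two token lists (stand-in similarity metric)."""
    if not hyp or not ref:
        return 0.0
    common = sum((Counter(hyp) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision = common / len(hyp)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


def filter_pseudo_parallel(pairs, round_trip, threshold=0.5):
    """Keep (pseudo_source, target) pairs whose round-trip output resembles the target.

    pairs:      list of (pseudo_source_tokens, target_tokens)
    round_trip: hypothetical callable that translates a pseudo-source
                sentence back into the target language
    """
    kept = []
    for pseudo_src, tgt in pairs:
        if overlap_f1(round_trip(pseudo_src), tgt) >= threshold:
            kept.append((pseudo_src, tgt))
    return kept


# Toy lookup table standing in for a real MT system.
toy_round_trip = {
    ("a", "b"): ["the", "cat", "sleeps"],
    ("c", "d"): ["totally", "different", "words"],
}
pairs = [
    (["a", "b"], ["the", "cat", "sleeps"]),
    (["c", "d"], ["a", "dog", "runs"]),
]
kept = filter_pseudo_parallel(pairs, lambda s: toy_round_trip[tuple(s)])
```

In practice the similarity function would more plausibly be a sentence-level MT metric, and the threshold would be tuned on held-out data.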

Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing
Atsushi Fujita | Eiichiro Sumita
Aiming at facilitating the research on quality estimation (QE) and automatic post-editing (APE) of machine translation (MT) outputs, especially for those among Asian languages, we have created new datasets for Japanese to English, Chinese, and Korean translations. As the source text, actual utterances in Japanese were extracted from the log data of our speech translation service. MT outputs were then given by phrase-based statistical MT systems. Finally, human evaluators were employed to grade the quality of MT outputs and to post-edit them. This paper describes the characteristics of the created datasets and reports on our benchmarking experiments on word-level QE, sentence-level QE, and APE conducted using the created datasets.

NTT Neural Machine Translation Systems at WAT 2017
Makoto Morishita | Jun Suzuki | Masaaki Nagata
This year, we participated in four translation subtasks at WAT 2017. Our model structure is quite simple, but we used it with well-tuned hyper-parameters, leading to a significant improvement compared to the previous state-of-the-art system. We also tried to make use of the unreliable part of the provided parallel corpus by back-translating it to create a synthetic corpus. Our submitted system achieved new state-of-the-art performance in terms of both BLEU score and human evaluation.

XMU Neural Machine Translation Systems for WAT 2017
Boli Wang | Zhixing Tan | Jinming Hu | Yidong Chen | Xiaodong Shi
This paper describes the Neural Machine Translation systems of Xiamen University for the shared translation tasks of WAT 2017. Our systems are based on the Encoder-Decoder framework with attention. We participated in three subtasks. We experimented with subword segmentation, synthetic training data, and model ensembling. Experiments show that all these methods can give substantial improvements.

A Bag of Useful Tricks for Practical Neural Machine Translation: Embedding Layer Initialization and Large Batch Size
Masato Neishi | Jin Sakuma | Satoshi Tohda | Shonosuke Ishiwatari | Naoki Yoshinaga | Masashi Toyoda
In this paper, we describe the team UT-IIS’s system and results for the WAT 2017 translation tasks. We further investigated several tricks, including a novel technique for initializing embedding layers using only the parallel corpus, which increased the BLEU score by 1.28, found a practical large batch size of 256, and gained insights regarding hyperparameter settings. Ultimately, our system obtained a better result than the state-of-the-art system of WAT 2016. Our code is available at https://github.com/nem6ishi/wat17.

Patent NMT integrated with Large Vocabulary Phrase Translation by SMT at WAT 2017
Zi Long | Ryuichiro Kimura | Takehito Utsuro | Tomoharu Mitsuhashi | Mikio Yamamoto
Neural machine translation (NMT) cannot handle a larger vocabulary because the training complexity and decoding complexity proportionally increase with the number of target words. This problem becomes even more serious when translating patent documents, which contain many technical terms that are observed infrequently. Long et al. (2017) proposed to select phrases that contain out-of-vocabulary words using the statistical approach of branching entropy. The selected phrases are then replaced with tokens during training and post-translated by the phrase translation table of SMT. In this paper, we apply the method proposed by Long et al. (2017) to the WAT 2017 Japanese-Chinese and Japanese-English patent datasets. Evaluation on Japanese-to-Chinese, Chinese-to-Japanese, Japanese-to-English, and English-to-Japanese patent sentence translation demonstrated the effectiveness of phrases selected with branching entropy: the NMT model of Long et al. (2017) achieves a substantial improvement over a baseline NMT model without that technique.
The system architecture, experimental settings, and experimental results of the EHR team for the WAT2017 tasks are described. We participated in three tasks: JPCen-ja, JPCzh-ja, and JPCko-ja. Although the basic architecture of our system is NMT, we apply a reranking technique that uses SMT results. One of the major drawbacks of NMT is under-translation and over-translation, whereas SMT rarely produces such translations. Reranking the n-best NMT outputs against the SMT output can therefore be expected to discard them. With this technique, we improved the BLEU score from 46.03 to 47.08 on the JPCzh-ja task.
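One way such reranking could work is sketched below. The scoring rule (a length-normalized NMT log-probability plus a weighted unigram-overlap F1 against the SMT output), the weight, and the example sentences are all assumptions for illustration; the abstract does not specify the actual reranking function.

```python
from collections import Counter


def unigram_f1(hyp, ref):
    """Unigram-overlap F1 between a hypothesis and a reference token list."""
    common = sum((Counter(hyp) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(hyp), common / len(ref)
    return 2 * precision * recall / (precision + recall)


def rerank(nbest, smt_output, weight=1.0):
    """Pick the n-best NMT hypothesis with the highest combined score.

    nbest: list of (tokens, nmt_log_prob); log-probs are assumed to be
           length-normalized so that hypotheses are comparable.
    """
    def combined(item):
        tokens, log_prob = item
        return log_prob + weight * unigram_f1(tokens, smt_output)

    return max(nbest, key=combined)[0]


smt = "the device has a sensor and a display".split()
nbest = [
    ("the device has a sensor".split(), -0.9),                 # under-translation
    ("the device has a sensor and a display".split(), -1.1),
    ("the device has a sensor and a display display display".split(), -1.3),  # over-translation
]
best = rerank(nbest, smt)
```

With `weight=0.0` the NMT score alone would select the truncated first hypothesis; the overlap term penalizes both the missing and the repeated content.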

Ensemble and Reranking: Using Multiple Models in the NICT-2 Neural Machine Translation System at WAT2017
Kenji Imamura | Eiichiro Sumita
In this paper, we describe the NICT-2 neural machine translation system evaluated at WAT2017. This system uses multiple models as an ensemble and combines models with opposite decoding directions by reranking (called bi-directional reranking). In our experimental results on small data sets, the translation quality improved when the number of models was increased to 32 in total and did not saturate. In the experiments on large data sets, improvements of 1.59-3.32 BLEU points were achieved when six-model ensembles were combined by the bi-directional reranking.

A Simple and Strong Baseline: NAIST-NICT Neural Machine Translation System for WAT2017 English-Japanese Translation Task
Yusuke Oda | Katsuhito Sudoh | Satoshi Nakamura | Masao Utiyama | Eiichiro Sumita
This paper describes the NAIST-NICT machine translation system for the WAT2017 English-Japanese Scientific Paper Translation Task. The system consists of a language-independent tokenizer and an attentional encoder-decoder style neural machine translation model. According to the official results, our system achieves higher translation accuracy than any system submitted to previous campaigns, despite its simple model architecture.

Comparison of SMT and NMT trained with large Patent Corpora: Japio at WAT2017
Satoshi Kinoshita | Tadaaki Oshio | Tomoharu Mitsuhashi
Japio participates in patent subtasks (JPC-EJ/JE/CJ/KJ) with phrase-based statistical machine translation (SMT) and neural machine translation (NMT) systems which are trained with its own patent corpora in addition to the subtask corpora provided by organizers of WAT2017. In EJ and CJ subtasks, SMT and NMT systems whose sizes of training corpora are about 50 million and 10 million sentence pairs respectively achieved comparable scores for automatic evaluations, but NMT systems were superior to SMT systems for both official and in-house human evaluations.

Kyoto University Participation to WAT 2017
Fabien Cromieres | Raj Dabre | Toshiaki Nakazawa | Sadao Kurohashi
We describe here our approaches and results on the WAT 2017 shared translation tasks. Following our good results with Neural Machine Translation in the previous shared task, we continue this approach this year, with incremental improvements in models and training methods. We focused on the ASPEC dataset and could improve the state-of-the-art results for Chinese-to-Japanese and Japanese-to-Chinese translations.

CUNI NMT System for WAT 2017 Translation Tasks
Tom Kocmi | Dušan Variš | Ondřej Bojar
The paper presents this year’s CUNI submissions to the WAT 2017 Translation Task, focusing on Japanese-English translation, namely the scientific papers, patents, and newswire subtasks. We compare two neural network architectures, the standard sequence-to-sequence with attention (Seq2Seq) and an architecture using a convolutional sentence encoder (FBConv2Seq), both implemented in the NMT framework Neural Monkey, which we currently participate in developing. We also compare various types of preprocessing of the source Japanese sentences and their impact on the overall results. Furthermore, we include the results of our experiments with out-of-domain data obtained by combining the corpora provided for each subtask.

Tokyo Metropolitan University Neural Machine Translation System for WAT 2017
Yukio Matsumura | Mamoru Komachi
In this paper, we describe our neural machine translation (NMT) system, which is based on attention-based NMT and uses long short-term memory (LSTM) units as the RNN. We implemented beam search and ensemble decoding in the NMT system. The system was tested on the 4th Workshop on Asian Translation (WAT 2017) shared tasks. We participated in the scientific paper subtasks and attempted Japanese-English, English-Japanese, and Japanese-Chinese translation tasks. The experimental results showed that beam search and ensemble decoding can effectively improve the translation quality.

Comparing Recurrent and Convolutional Architectures for English-Hindi Neural Machine Translation
Sandhya Singh | Ritesh Panjwani | Anoop Kunchukuttan | Pushpak Bhattacharyya
In this paper, we empirically compare two encoder-decoder neural machine translation architectures, the convolutional sequence-to-sequence model (ConvS2S) and the recurrent sequence-to-sequence model (RNNS2S), for the English-Hindi language pair as part of IIT Bombay’s submission to the WAT2017 shared task. We report results for both the English-Hindi and Hindi-English directions.