Multilingual Whispers: Generating Paraphrases with Translation

Christian Federmann; Oussama Elachqar; Chris Quirk

doi:10.18653/v1/D19-5503

Abstract

Naturally occurring paraphrase data, such as multiple news stories about the same event, is a useful but rare resource. This paper compares translation-based paraphrase gathering using human, automatic, or hybrid techniques to monolingual paraphrasing by experts and non-experts. We gather translations, paraphrases, and empirical human quality assessments of these approaches. Neural machine translation techniques, especially when pivoting through related languages, provide a relatively robust source of paraphrases with diversity comparable to expert human paraphrases. Surprisingly, human translators do not reliably outperform neural systems. The resulting data release will not only be a useful test set, but will also allow additional explorations in translation and paraphrase quality assessments and relationships.

Anthology ID: D19-5503
Volume: Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue: WNUT
Publisher: Association for Computational Linguistics
Pages: 17–26
URL: https://aclanthology.org/D19-5503
DOI: 10.18653/v1/D19-5503
Cite (ACL): Christian Federmann, Oussama Elachqar, and Chris Quirk. 2019. Multilingual Whispers: Generating Paraphrases with Translation. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 17–26, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): Multilingual Whispers: Generating Paraphrases with Translation (Federmann et al., WNUT 2019)
PDF: https://aclanthology.org/D19-5503.pdf
