Abstract
Developing task-oriented bots requires diverse sets of annotated user utterances to learn mappings between natural language utterances and user intents. Automated paraphrase generation offers a cost-effective and scalable way to produce varied training samples by creating different versions of the same utterance. However, the sequence-to-sequence models commonly used for automated paraphrasing often produce errors such as repetition and grammatical mistakes, and identifying these errors, particularly in transformer architectures, remains a challenge. In this paper, we propose a taxonomy of errors encountered in transformer-based paraphrase generation models, derived from a comprehensive error analysis of transformer-generated paraphrases. Leveraging this taxonomy, we introduce the Transformer-based Paraphrasing Model Errors dataset, consisting of 5880 annotated paraphrases labeled with error types and explanations. Additionally, we develop a novel multilabel paraphrase annotation model by fine-tuning a BERT model for the error-annotation task. Evaluation against human annotations demonstrates significant agreement, with the model showing robust performance in predicting error labels, even for unseen paraphrases.
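For concreteness, below is a minimal sketch of how such a multilabel error annotator could be set up on top of a BERT encoder, assuming the HuggingFace transformers library. The label set, checkpoint, and decision threshold are illustrative assumptions, not the paper's exact configuration, and the classification head would first need to be fine-tuned on the annotated paraphrase dataset before its predictions mean anything.

```python
# Sketch of a multilabel paraphrase-error annotator (assumptions: HuggingFace
# transformers, bert-base-uncased, and a hypothetical subset of error labels).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

ERROR_LABELS = ["repetition", "grammar_error", "semantic_drift"]  # illustrative, not the paper's taxonomy

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(ERROR_LABELS),
    problem_type="multi_label_classification",  # per-label sigmoid + BCE loss
)

def annotate(utterance: str, paraphrase: str, threshold: float = 0.5) -> list[str]:
    """Return every error label whose predicted probability clears the threshold."""
    inputs = tokenizer(utterance, paraphrase, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0)  # labels are scored independently
    return [label for label, p in zip(ERROR_LABELS, probs) if p >= threshold]

# Example: a paraphrase with an obvious repetition error (the untrained head
# above would need fine-tuning on the annotated dataset first).
print(annotate("book a flight to Paris", "book book a flight flight to Paris"))
```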
Notes
- 1.
- 2.
- 3.
- 4. The prompt we used can be found via the supplementary material link supplied.
- 5.
- 6. OR refers to the state of Oregon.
- 7. Paraphrase generation is a multi-step (word-by-word) prediction task, where a small error at an early time step may lead to poor predictions for the rest of the sentence, as the error compounds over subsequent token predictions [8]; a toy sketch follows these notes.
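The sketch below illustrates this word-by-word loop, assuming a generic HuggingFace seq2seq checkpoint (t5-small is a stand-in, not one of the paper's models): each greedy pick is fed back as decoder context, so a wrong token at an early step conditions every later prediction and is never revisited.

```python
# Toy illustration of compounding errors in autoregressive decoding.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def greedy_decode(text: str, max_new_tokens: int = 20) -> str:
    enc = tokenizer(text, return_tensors="pt")
    # The decoder grows one token at a time from the start token.
    out_ids = torch.tensor([[model.config.decoder_start_token_id]])
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(input_ids=enc.input_ids, decoder_input_ids=out_ids).logits
        next_id = logits[0, -1].argmax()  # greedy choice at this time step
        # The pick is appended to the context, so all later steps depend on it.
        out_ids = torch.cat([out_ids, next_id.view(1, 1)], dim=-1)
        if next_id.item() == model.config.eos_token_id:
            break  # the loop never reconsiders earlier choices
    return tokenizer.decode(out_ids[0], skip_special_tokens=True)
```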
References
Alikaniotis, D., Raheja, V.: The unreasonable effectiveness of transformer language models in grammatical error correction. In: BEA@ACL (2019)
Bannard, C., Callison-Burch, C.: Paraphrasing with bilingual parallel corpora. In: ACL'05, pp. 597–604 (2005). https://aclanthology.org/P05-1074
Berro, A., Fard, M.A.Y.Z., et al.: An extensible and reusable pipeline for automated utterance paraphrases. In: PVLDB (2021)
Brown, T.B., et al.: Language models are few-shot learners. In: NeurIPS (2020)
Bui, T.C., Le, V.D., To, H.T., Cha, S.K.: Generative pre-training for paraphrase generation by representing and predicting spans in exemplars. In: 2021 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 83–90. IEEE (2021)
Cegin, J., Simko, J., Brusilovsky, P.: ChatGPT to replace crowdsourcing of paraphrases for intent classification: Higher diversity and comparable model robustness (2023). arXiv preprint arXiv:2305.12947
Celikyilmaz, A., Clark, E., Gao, J.: Evaluation of text generation: A survey (2020)
Chen, D., Dolan, W.B.: Collecting highly parallel data for paraphrase evaluation. In: ACL-HLT, pp. 190–200 (2011). https://aclanthology.org/P11-1020
Chklovski, T.: Collecting paraphrase corpora from volunteer contributors. In: Proceedings of the 3rd International Conference on Knowledge Capture, pp. 115–120 (2005)
Dopierre, T., Gravier, C., Logerais, W.: ProtAugment: unsupervised diverse short-texts paraphrasing for intent detection meta-learning. In: ACL-IJCNLP (2021). https://aclanthology.org/2021.acl-long.191
Dou, Y., Forbes, M., et al.: Is GPT-3 text indistinguishable from human text? Scarecrow: a framework for scrutinizing machine text. In: ACL, pp. 7250–7274 (2022)
Ethayarajh, K.: How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings. In: EMNLP-IJCNLP (2019)
Freitag, M., Foster, G., et al.: Experts, errors, and context: a large-scale study of human evaluation for machine translation. Trans. Assoc. Comput. Linguist. 9, 1460–1474 (2021). https://aclanthology.org/2021.tacl-1.87
Fujita, A.: Automatic generation of syntactically well-formed and semantically appropriate paraphrases. Ph.D. thesis, Nara Institute of Science and Technology (2005). https://api.semanticscholar.org/CorpusID:16348044
Fujita, A., Furihata, K., Inui, K., Matsumoto, Y., Takeuchi, K.: Paraphrasing of Japanese light-verb constructions based on lexical conceptual structure (2004)
Goyal, T., Durrett, G.: Neural syntactic preordering for controlled paraphrase generation. In: ACL, pp. 238–252 (2020)
Hegde, C., Patil, S.: Unsupervised paraphrase generation using pre-trained language models (2020)
Huang, S., Wu, Y., Wei, F., Luan, Z.: Dictionary-guided editing networks for paraphrase generation. In: AAAI, vol. 33, pp. 6546–6553 (2019)
Huang, T.H., Chen, Y.N., Bigham, J.P.: Real-time on-demand crowd-powered entity extraction (2017). https://arxiv.org/abs/1704.03627
Iyyer, M., Wieting, J., Gimpel, K., Zettlemoyer, L.: Adversarial example generation with syntactically controlled paraphrase networks. In: NAACL-HLT, pp. 1875–1885 (2018)
Jiang, Y., Kummerfeld, J.K., Lasecki, W.S.: Understanding task design trade-offs in crowdsourced paraphrase collection. In: ACL 55th Annual Meeting, pp. 103–109. Vancouver, Canada (Jul 2017)
Koponen, M.: Assessing machine translation quality with error analysis (2010)
Larson, S., Cheung, A., Mahendran, A., et al.: Inconsistencies in crowdsourced slot-filling annotations: a typology and identification methods. In: COLING (2020). https://aclanthology.org/2020.coling-main.442
Li, Z., Jiang, X., Shang, L., Li, H.: Paraphrase generation with deep reinforcement learning. In: EMNLP (2018). https://aclanthology.org/D18-1421
Madnani, N., Dorr, B.J.: Generating phrasal and sentential paraphrases: A survey of data-driven methods. Comput. Linguist. 36(3), 341–387 (2010). https://aclanthology.org/J10-3003
Mallinson, J., Sennrich, R., Lapata, M.: Paraphrasing revisited with neural machine translation. In: EACL (2017). https://aclanthology.org/E17-1083
Metzler, D., Hovy, E., Zhang, C.: An empirical evaluation of data-driven paraphrase generation techniques. In: ACL 49th Annual Meeting, pp. 546–551. Portland, Oregon, USA (2011)
Negri, M., Mehdad, Y., Marchetti, A., Giampiccolo, D., Bentivogli, L.: Chinese whispers: Cooperative paraphrase acquisition. In: LREC’12, pp. 2659–2665. Istanbul, Turkey (2012)
Nilforoshan, H., Wang, J., Wu, E.: PreCog: Improving crowdsourced data quality before acquisition (2017). arXiv preprint arXiv:1704.02384
Popović, M.: On nature and causes of observed MT errors. In: MT Summit XVIII (2021)
Prakash, A., et al.: Neural paraphrase generation with stacked residual LSTM networks. In: COLING (2016)
Raffel, C., Shazeer, N., Roberts, A., Lee, K., et al.: Exploring the limits of transfer learning with a unified text-to-text transformer. In: JMLR (2020)
Ramírez, J., Berro, A., Baez, M., Benatallah, B., Casati, F.: Crowdsourcing diverse paraphrases for training task-oriented bots (2021)
Reimers, N., Gurevych, I.: Sentence-BERT: Sentence embeddings using siamese BERT-networks. In: EMNLP-IJCNLP (2019). https://aclanthology.org/D19-1410
Ribeiro, M.T., Wu, T., Guestrin, C., Singh, S.: Beyond accuracy: behavioral testing of NLP models with checklist. In: ACL, pp. 4902–4912 (2020). https://aclanthology.org/2020.acl-main.442
Su, Y., Awadallah, A.H., Khabsa, M., Pantel, P., Gamon, M., Encarnacion, M.: Building natural language interfaces to web APIs (2017)
Sun, X., Liu, J., Lyu, Y., et al.: Answer-focused and position-aware neural question generation. In: EMNLP (2018). https://aclanthology.org/D18-1427
Thompson, B., Post, M.: Automatic machine translation evaluation in many languages via zero-shot paraphrasing. In: EMNLP (2020)
Thomson, C., Reiter, E.: A gold standard methodology for evaluating accuracy in data-to-text systems. In: INLG (2020). https://aclanthology.org/2020.inlg-1.22
van Miltenburg, E., Clinciu, M., et al.: Underreporting of errors in NLG output, and what to do about it. In: INLG (2021). https://aclanthology.org/2021.inlg-1.14
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems (2017)
Witteveen, S., Andrews, M.: Paraphrasing with large language models (2019)
Yaghoub-Zadeh-Fard, M., Benatallah, B., et al.: Dynamic word recommendation to obtain diverse crowdsourced paraphrases of user utterances. In: IUI (2020)
Yaghoub-Zadeh-Fard, M.A., Benatallah, B., et al.: User utterance acquisition for training task-oriented bots: A review of challenges, techniques and opportunities (2020)
Yaghoub-Zadeh-Fard, M.A., Benatallah, B., et al.: A study of incorrect paraphrases in crowdsourced user utterances. In: NAACL (2019). https://aclanthology.org/N19-1026
Yaghoubzadehfard, M.: Scalable and Quality-Aware Training Data Acquisition for Conversational Cognitive Services. Ph.D. thesis, UNSW Sydney (2021)
Zamanirad, S.: Superimposition of natural language conversations over software enabled services. Ph.D. thesis, University of New South Wales, Sydney, Australia (2019)
Zeng, D., Zhang, H., Xiang, L., Wang, J., Ji, G.: User-oriented paraphrase generation with keywords controlled network. IEEE Access 7, 80542–80551 (2019)
Zhou, J., Bhat, S.: Paraphrase generation: a survey of the state of the art. In: EMNLP (2021). https://aclanthology.org/2021.emnlp-main.414
Acknowledgments
We acknowledge the financial support provided by the PICASSO Idex Lyon scholarship, which supported the research conducted by Auday Berro as part of his Ph.D. studies.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Berro, A., Benatallah, B., Gaci, Y., Benabdeslem, K. (2024). Error Types in Transformer-Based Paraphrasing Models: A Taxonomy, Paraphrase Annotation Model and Dataset. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14941. Springer, Cham. https://doi.org/10.1007/978-3-031-70341-6_20
DOI: https://doi.org/10.1007/978-3-031-70341-6_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70340-9
Online ISBN: 978-3-031-70341-6