
Error Types in Transformer-Based Paraphrasing Models: A Taxonomy, Paraphrase Annotation Model and Dataset

  • Conference paper
  • In: Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD 2024)

Abstract

Developing task-oriented bots requires diverse sets of annotated user utterances to learn mappings between natural language utterances and user intents. Automated paraphrase generation offers a cost-effective and scalable approach to producing varied training samples by creating different versions of the same utterance. However, the sequence-to-sequence models used for automated paraphrasing often suffer from errors such as repetition and grammatical mistakes, and identifying these errors, particularly in transformer architectures, remains a challenge. In this paper, we propose a taxonomy of errors encountered in transformer-based paraphrase generation models, based on a comprehensive error analysis of transformer-generated paraphrases. Leveraging this taxonomy, we introduce the Transformer-based Paraphrasing Model Errors dataset, consisting of 5880 annotated paraphrases labeled with error types and explanations. Additionally, we develop a novel multilabel paraphrase annotation model by fine-tuning a BERT model for the error annotation task. Evaluation against human annotations demonstrates significant agreement, with the model showing robust performance in predicting error labels, even for unseen paraphrases.
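
To make the multilabel annotation setup concrete, here is a minimal sketch of a BERT-based error annotator using the Hugging Face transformers library. The label names, checkpoint, threshold, and example paraphrase pair are illustrative assumptions, not the paper's actual configuration; in practice the fine-tuned checkpoint released with the dataset would be loaded instead of the base model.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical error labels; the taxonomy proposed in the paper is richer than this.
ERROR_LABELS = ["repetition", "grammatical_error", "semantic_drift", "incomplete"]

# "bert-base-uncased" stands in for the fine-tuned annotation checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(ERROR_LABELS),
    problem_type="multi_label_classification",  # sigmoid per label + BCE loss during fine-tuning
)
model.eval()

def annotate(utterance: str, paraphrase: str, threshold: float = 0.5) -> list[str]:
    """Return every error label whose predicted probability exceeds the threshold."""
    inputs = tokenizer(utterance, paraphrase, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0)
    return [label for label, p in zip(ERROR_LABELS, probs) if p >= threshold]

# With an untrained classification head the scores are meaningless; after fine-tuning on the
# annotated paraphrases, a call like this would flag the repeated span in the second argument.
print(annotate("Book a table for two", "Book a table a table for two"))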


Notes

  1. https://github.com/AudayBerro/TPME/tree/master.

  2. https://github.com/sonos/nlu-benchmark.

  3. https://github.com/GEM-benchmark/NL-Augmenter.

  4. The prompt we used is provided in the linked supplementary material.

  5. https://upset.app/ and https://asntech.shinyapps.io/intervene/.

  6. OR refers to the state of Oregon.

  7. Paraphrase generation is a multi-step (word-by-word) prediction task, where a small error at an early time-step may lead to poor predictions for the rest of the sentence, as the error is compounded over subsequent token predictions [8]; see the sketch below.
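
The compounding effect comes from the autoregressive decoding loop itself: each generated token is appended to the decoder context and conditions every later prediction. The following sketch makes that loop explicit; it assumes a generic Hugging Face seq2seq checkpoint (t5-small, chosen purely for illustration and not trained as a paraphraser) and a hypothetical input prompt.

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Illustrative checkpoint and prompt; any encoder-decoder paraphraser decodes the same way.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
model.eval()

encoder_inputs = tokenizer("paraphrase: Book a flight to Lyon", return_tensors="pt")
decoded = torch.tensor([[model.config.decoder_start_token_id]])  # decoder starts from a single start token

with torch.no_grad():
    for _ in range(20):
        logits = model(**encoder_inputs, decoder_input_ids=decoded).logits
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy choice at this time-step
        decoded = torch.cat([decoded, next_token], dim=-1)          # the choice becomes fixed context
        if next_token.item() == tokenizer.eos_token_id:
            break

# A wrong token chosen early stays in `decoded`, so every subsequent step is conditioned on the mistake.
print(tokenizer.decode(decoded[0], skip_special_tokens=True))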

References

  1. Alikaniotis, D., Raheja, V.: The unreasonable effectiveness of transformer language models in grammatical error correction. In: BEA@ACL (2019)

  2. Bannard, C., Callison-Burch, C.: Paraphrasing with bilingual parallel corpora. In: ACL, pp. 597–604 (2005). https://aclanthology.org/P05-1074

  3. Berro, A., Fard, M.A.Y.Z., et al.: An extensible and reusable pipeline for automated utterance paraphrases. In: PVLDB (2021)

  4. Brown, T.B., et al.: Language models are few-shot learners. In: NeurIPS (2020)

  5. Bui, T.C., Le, V.D., To, H.T., Cha, S.K.: Generative pre-training for paraphrase generation by representing and predicting spans in exemplars. In: IEEE BigComp, pp. 83–90 (2021)

  6. Cegin, J., Simko, J., Brusilovsky, P.: ChatGPT to replace crowdsourcing of paraphrases for intent classification: higher diversity and comparable model robustness. arXiv preprint arXiv:2305.12947 (2023)

  7. Celikyilmaz, A., Clark, E., Gao, J.: Evaluation of text generation: a survey (2020)

  8. Chen, D., Dolan, W.B.: Collecting highly parallel data for paraphrase evaluation. In: ACL-HLT, pp. 190–200 (2011). https://aclanthology.org/P11-1020

  9. Chklovski, T.: Collecting paraphrase corpora from volunteer contributors. In: Proceedings of the 3rd International Conference on Knowledge Capture, pp. 115–120 (2005)

  10. Dopierre, T., Gravier, C., Logerais, W.: ProtAugment: unsupervised diverse short-texts paraphrasing for intent detection meta-learning. In: ACL-IJCNLP (2021). https://aclanthology.org/2021.acl-long.191

  11. Dou, Y., Forbes, M., et al.: Is GPT-3 text indistinguishable from human text? Scarecrow: a framework for scrutinizing machine text. In: ACL, pp. 7250–7274 (2022)

  12. Ethayarajh, K.: How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings. In: EMNLP-IJCNLP (2019)

  13. Freitag, M., Foster, G., et al.: Experts, errors, and context: a large-scale study of human evaluation for machine translation. Trans. Assoc. Comput. Linguist. 9, 1460–1474 (2021). https://aclanthology.org/2021.tacl-1.87

  14. Fujita, A.: Automatic generation of syntactically well-formed and semantically appropriate paraphrases. Ph.D. thesis, Nara Institute of Science and Technology (2005). https://api.semanticscholar.org/CorpusID:16348044

  15. Fujita, A., Furihata, K., Inui, K., Matsumoto, Y., Takeuchi, K.: Paraphrasing of Japanese light-verb constructions based on lexical conceptual structure (2004)

  16. Goyal, T., Durrett, G.: Neural syntactic preordering for controlled paraphrase generation, pp. 238–252 (2020)

  17. Hegde, C., Patil, S.: Unsupervised paraphrase generation using pre-trained language models (2020)

  18. Huang, S., Wu, Y., Wei, F., Luan, Z.: Dictionary-guided editing networks for paraphrase generation, 33, 6546–6553 (2019)

  19. Huang, T.H., Chen, Y.N., Bigham, J.P.: Real-time on-demand crowd-powered entity extraction (2017). https://arxiv.org/abs/1704.03627

  20. Iyyer, M., Wieting, J., Gimpel, K., Zettlemoyer, L.: Adversarial example generation with syntactically controlled paraphrase networks, pp. 1875–1885 (2018)

  21. Jiang, Y., Kummerfeld, J.K., Lasecki, W.S.: Understanding task design trade-offs in crowdsourced paraphrase collection. In: ACL 55th Annual Meeting, pp. 103–109. Vancouver, Canada (2017)

  22. Koponen, M.: Assessing machine translation quality with error analysis (2010)

  23. Larson, S., Cheung, A., Mahendran, A., et al.: Inconsistencies in crowdsourced slot-filling annotations: a typology and identification methods. In: COLING (2020). https://aclanthology.org/2020.coling-main.442

  24. Li, Z., Jiang, X., Shang, L., Li, H.: Paraphrase generation with deep reinforcement learning. In: EMNLP (2018). https://aclanthology.org/D18-1421

  25. Madnani, N., Dorr, B.J.: Generating phrasal and sentential paraphrases: a survey of data-driven methods. Comput. Linguist. (2010). https://aclanthology.org/J10-3003

  26. Mallinson, J., Sennrich, R., Lapata, M.: Paraphrasing revisited with neural machine translation. In: EACL (2017). https://aclanthology.org/E17-1083

  27. Metzler, D., Hovy, E., Zhang, C.: An empirical evaluation of data-driven paraphrase generation techniques. In: ACL 49th Annual Meeting, pp. 546–551. Portland, Oregon, USA (2011)

  28. Negri, M., Mehdad, Y., Marchetti, A., Giampiccolo, D., Bentivogli, L.: Chinese whispers: cooperative paraphrase acquisition. In: LREC, pp. 2659–2665. Istanbul, Turkey (2012)

  29. Nilforoshan, H., Wang, J., Wu, E.: PreCog: improving crowdsourced data quality before acquisition. arXiv preprint arXiv:1704.02384 (2017)

  30. Popović, M.: On nature and causes of observed MT errors. In: MT Summit XVIII (2021)

  31. Prakash, A., et al.: Neural paraphrase generation with stacked residual LSTM networks. In: COLING (2016)

  32. Raffel, C., Shazeer, N., Roberts, A., Lee, K., et al.: Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. (2020)

  33. Ramírez, J., Berro, A., Baez, M., Benatallah, B., Casati, F.: Crowdsourcing diverse paraphrases for training task-oriented bots (2021)

  34. Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: EMNLP (2019). https://aclanthology.org/D19-1410

  35. Ribeiro, M.T., Wu, T., Guestrin, C., Singh, S.: Beyond accuracy: behavioral testing of NLP models with CheckList. In: ACL, pp. 4902–4912 (2020). https://aclanthology.org/2020.acl-main.442

  36. Su, Y., Awadallah, A.H., Khabsa, M., Pantel, P., Gamon, M., Encarnacion, M.: Building natural language interfaces to web APIs (2017)

  37. Sun, X., Liu, J., Lyu, Y., et al.: Answer-focused and position-aware neural question generation. In: EMNLP (2018). https://aclanthology.org/D18-1427

  38. Thompson, B., Post, M.: Automatic machine translation evaluation in many languages via zero-shot paraphrasing. In: EMNLP (2020)

  39. Thomson, C., Reiter, E.: A gold standard methodology for evaluating accuracy in data-to-text systems. In: INLG (2020). https://aclanthology.org/2020.inlg-1.22

  40. van Miltenburg, E., Clinciu, M., et al.: Underreporting of errors in NLG output, and what to do about it. In: INLG (2021). https://aclanthology.org/2021.inlg-1.14

  41. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., et al.: Attention is all you need. In: NeurIPS (2017)

  42. Witteveen, S., Andrews, M.: Paraphrasing with large language models (2019)

  43. Yaghoub-Zadeh-Fard, M., Benatallah, B., et al.: Dynamic word recommendation to obtain diverse crowdsourced paraphrases of user utterances. In: IUI (2020)

  44. Yaghoub-Zadeh-Fard, M.A., Benatallah, B., et al.: User utterance acquisition for training task-oriented bots: a review of challenges, techniques and opportunities (2020)

  45. Yaghoub-Zadeh-Fard, M.A., Benatallah, B., et al.: A study of incorrect paraphrases in crowdsourced user utterances. In: NAACL (2019). https://aclanthology.org/N19-1026

  46. Yaghoubzadehfard, M.: Scalable and quality-aware training data acquisition for conversational cognitive services. Ph.D. thesis, UNSW Sydney (2021)

  47. Zamanirad, S.: Superimposition of natural language conversations over software enabled services. Ph.D. thesis, University of New South Wales, Sydney, Australia (2019)

  48. Zeng, D., Zhang, H., Xiang, L., Wang, J., Ji, G.: User-oriented paraphrase generation with keywords controlled network. IEEE Access 7, 80542–80551 (2019)

  49. Zhou, J., Bhat, S.: Paraphrase generation: a survey of the state of the art. In: EMNLP (2021). https://aclanthology.org/2021.emnlp-main.414


Acknowledgments

We acknowledge the financial support provided by the PICASSO Idex Lyon scholarship, which supported the research conducted by Auday Berro as part of his Ph.D. studies.

Author information


Corresponding author

Correspondence to Auday Berro.


Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Berro, A., Benatallah, B., Gaci, Y., Benabdeslem, K. (2024). Error Types in Transformer-Based Paraphrasing Models: A Taxonomy, Paraphrase Annotation Model and Dataset. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol. 14941. Springer, Cham. https://doi.org/10.1007/978-3-031-70341-6_20


  • DOI: https://doi.org/10.1007/978-3-031-70341-6_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70340-9

  • Online ISBN: 978-3-031-70341-6

  • eBook Packages: Computer Science, Computer Science (R0)
