COSYWA: Enhancing Semantic Integrity in Watermarking Natural Language Generation

  • Conference paper
  • In: Natural Language Processing and Chinese Computing (NLPCC 2023)

Abstract

With the increasing use of natural language generation (NLG) models, there is a growing need to differentiate between machine-generated text and human-written text. One promising approach is watermarking, which can help identify machine-generated text and protect against risks such as spam emails and academic dishonesty. However, existing watermarking methods can significantly alter the semantic meaning of the text, creating a need for techniques that maintain semantic integrity. In this paper, we propose a novel watermarking method called COntextual SYnonym WAtermarking (COSYWA) that embeds watermarks in text using a Masked Language Model (MLM) without significantly impairing its semantics. Specifically, we use post-processing to embed watermarks in the output of an NLG model: we generate a context-based synonym set with an MLM to embed the watermark information, and we use statistical hypothesis testing to detect the presence of the watermark. Our experimental results show that COSYWA substantially improves the watermarked text's ability to preserve its original meaning while effectively embedding a watermark, making it a promising approach for protecting against misinformation in NLG.
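The abstract outlines a three-step pipeline: an MLM proposes context-based synonym candidates for positions in the generated text, the watermark is embedded by steering substitutions toward a secret keyed subset of those candidates, and detection runs a one-sided hypothesis test on how often words fall in that subset. The sketch below illustrates this general recipe with a HuggingFace fill-mask pipeline; the hash-based "green list" partition, the candidate score cutoff, and the z-score threshold are illustrative assumptions, not the authors' published design.

```python
# Minimal sketch of MLM-based contextual-synonym watermarking with
# hypothesis-test detection. Illustrative only: the keyed green-list
# partition, candidate filtering, and thresholds are assumptions, not
# the COSYWA authors' released implementation.
import hashlib
import math

from transformers import pipeline  # pip install transformers

unmasker = pipeline("fill-mask", model="bert-base-uncased")
SECRET_KEY = "cosywa-demo-key"  # hypothetical watermark key


def is_green(word: str) -> bool:
    """Keyed pseudo-random bipartition of the vocabulary."""
    digest = hashlib.sha256((SECRET_KEY + word.lower()).encode()).digest()
    return digest[0] % 2 == 0


def candidate_synonyms(words, i, top_k=10):
    """Context-based substitute set for position i, proposed by the MLM."""
    masked = " ".join(words[:i] + ["[MASK]"] + words[i + 1:])
    preds = unmasker(masked, top_k=top_k)
    # Keep only in-context plausible substitutes (crude score cutoff).
    return [p["token_str"] for p in preds if p["score"] > 0.01]


def embed_watermark(text: str) -> str:
    """Replace words with green-list contextual substitutes where possible."""
    words = text.split()
    for i, w in enumerate(words):
        if is_green(w):
            continue  # word already carries the watermark bit
        for cand in candidate_synonyms(words, i):
            if cand.isalpha() and is_green(cand):
                words[i] = cand
                break
    return " ".join(words)


def detect_watermark(text: str, z_threshold: float = 4.0) -> bool:
    """One-proportion z-test: under H0 (no watermark), each word lands in
    the green list with probability 1/2, so a large z-score signals a
    watermark."""
    words = [w for w in text.split() if w.isalpha()]
    n = len(words)
    if n == 0:
        return False
    greens = sum(is_green(w) for w in words)
    z = (greens - 0.5 * n) / math.sqrt(0.25 * n)
    return z > z_threshold


if __name__ == "__main__":
    original = "the quick brown fox jumped over the lazy dog near the river bank"
    marked = embed_watermark(original)
    print(marked)
    print("watermarked?", detect_watermark(marked))
```

Under the null hypothesis that unwatermarked text lands in the keyed subset about half the time, the z-score grows with the number of successful substitutions, so longer passages are detected more reliably. A faithful implementation would additionally verify that each substitute genuinely preserves the sentence's meaning (for example, via a semantic similarity check) before accepting it, which is the semantic-integrity constraint the paper emphasizes.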



Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 62006138), the Key Support Project of NSFC-Liaoning Joint Foundation (No. U1908216), and the Major Scientific Research Project of the State Language Commission in the 13th Five-Year Plan (No. WT135-38). We thank all anonymous reviewers for their valuable suggestions on this work.

Author information


Corresponding author

Correspondence to Xiaodong Shi.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Fang, J., Tan, Z., Shi, X. (2023). COSYWA: Enhancing Semantic Integrity in Watermarking Natural Language Generation. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol 14302. Springer, Cham. https://doi.org/10.1007/978-3-031-44693-1_55

  • DOI: https://doi.org/10.1007/978-3-031-44693-1_55

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44692-4

  • Online ISBN: 978-3-031-44693-1

  • eBook Packages: Computer Science, Computer Science (R0)
