DOI: 10.1145/3626772.3657779
Research Article · Open Access

DAC: Quantized Optimal Transport Reward-based Reinforcement Learning Approach to Detoxify Query Auto-Completion

Published: 11 July 2024

Abstract

Modern Query Auto-Completion (QAC) systems leverage natural language generation (NLG) with large language models (LLMs) to achieve remarkable performance. However, these systems are prone to generating biased and toxic completions due to inherent learning biases. Existing detoxification approaches exhibit two key limitations: (1) They primarily focus on mitigating toxicity in grammatically well-formed long sentences but struggle to adapt to the QAC task, where queries are short and structurally different (they often contain spelling errors, do not follow grammatical rules, and have relatively flexible word order). (2) They view detoxification through a binary lens, where all text labeled toxic is undesirable and all text labeled non-toxic is desirable. To address these limitations, we propose DAC, an intuitive and efficient reinforcement learning-based model to detoxify QAC. With DAC, we introduce the additional perspective of a third query class, addressable toxicity. Such queries can encompass implicit toxicity, subjective toxicity, or non-toxic queries containing toxic words. We incorporate this three-class view of query behavior into the proposed model through quantized optimal transport, so it learns the distinctions and generates truly non-toxic completions. We evaluate toxicity levels in the completions generated by DAC on two real-world QAC datasets (Bing and AOL) using two classifiers: a publicly available generic classifier (Detoxify) and a search-query-specific classifier that we develop (TClassify). We find that DAC consistently outperforms all existing baselines on the Bing dataset and achieves competitive performance on the AOL dataset for query detoxification, providing high quality and low toxicity. We make the code publicly available.
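The page gives no implementation details beyond the abstract, so the following is a minimal, non-authoritative sketch of how an optimal-transport-based reward over the three query classes could look. It computes an entropy-regularized (Sinkhorn) optimal transport cost between a completion's token embeddings and a small per-class codebook of prototype vectors; representing each class by a finite codebook is only an approximation of the "quantized" idea. The function name, the codebooks, and the reward shaping are all hypothetical assumptions, not the authors' DAC implementation.

```python
import numpy as np

def sinkhorn_distance(x, y, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport cost between two point clouds
    with uniform weights, computed via Sinkhorn scaling iterations."""
    # Pairwise squared-Euclidean cost, normalized for numerical stability.
    cost = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    cost = cost / cost.max()
    K = np.exp(-cost / eps)                    # Gibbs kernel
    a = np.full(len(x), 1.0 / len(x))          # uniform source marginal
    b = np.full(len(y), 1.0 / len(y))          # uniform target marginal
    u = np.ones_like(a)
    for _ in range(n_iters):                   # alternate marginal scalings
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]         # approximate transport plan
    return float(np.sum(plan * cost))          # expected transport cost

# Hypothetical setup: token embeddings of one generated completion and a
# small "quantized" codebook of prototype vectors per query class.
rng = np.random.default_rng(0)
completion = rng.normal(size=(8, 32))          # 8 tokens, 32-dim embeddings
codebooks = {cls: rng.normal(size=(16, 32))
             for cls in ("non_toxic", "addressable", "toxic")}

dist = {cls: sinkhorn_distance(completion, cb) for cls, cb in codebooks.items()}
# One plausible shaping: reward completions close to the non-toxic codebook
# and far from the toxic and addressable ones.
reward = dist["toxic"] + dist["addressable"] - dist["non_toxic"]
```

In a reinforcement learning fine-tuning loop (e.g., PPO-style policy optimization), such a scalar could serve as the per-completion reward signal; DAC's actual reward design may differ substantially from this sketch.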


Published In

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2024, 3164 pages
ISBN: 9798400704314
DOI: 10.1145/3626772
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. auto-completion
  2. large language models
  3. optimal transport
  4. query detoxification
  5. reinforcement learning

Qualifiers

  • Research-article

Conference

SIGIR 2024

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
