DOI: 10.1145/3485447.3512152

Knowledge-aware Conversational Preference Elicitation with Bandit Feedback

Published: 25 April 2022
  • Abstract

    Conversational recommender systems (CRSs) have recently been proposed to mitigate the cold-start problem of traditional recommender systems. By introducing conversational key-terms, existing conversational recommenders can reduce the need for extensive exploration and elicit user preferences faster and more accurately. However, recommenders that leverage key-terms rely heavily on the availability and quality of those key-terms, and their performance can degrade significantly when the key-terms are incomplete or poorly labeled. This usually happens when new items are continually added to the system, since acquiring well-labeled key-terms requires costly human effort. Moreover, existing CRS methods treat the feedback on different conversational key-terms separately, without considering the underlying relations between the key-terms. As a result, learning is sample-inefficient, especially when there is a large number of candidate conversational key-terms.
    In this paper, we propose a knowledge-aware conversational preference elicitation framework and a bandit-based algorithm, GraphConUCB. To elicit preferences efficiently when items have incompletely labeled key-terms, our algorithm leverages the underlying relations between the key-terms, guided by a knowledge graph. Being knowledge-aware, the algorithm propagates user preferences via a pseudo graph feedback module, which also accelerates exploration in the large action space of key-terms and improves conversational sample efficiency. To select the most informative key-terms for conversations, we further devise a graph-based optimal design module that leverages the graph structure. We provide a theoretical analysis of the regret upper bound for GraphConUCB. Extensive experiments show that our algorithm effectively handles items with incompletely labeled key-terms and significantly improves over state-of-the-art baselines.
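The two ideas named in the abstract, choosing an informative key-term and spreading its feedback to related key-terms, can be illustrated with a rough sketch. This is not GraphConUCB: the abstract gives no formulas, so everything below is an assumption made for illustration. The sketch assumes a linear preference model, picks the key-term with the largest uncertainty width (a common greedy stand-in for optimal design), and applies a down-weighted copy of the observed feedback to graph neighbours. The names `select_key_term`, `update_with_pseudo_feedback`, and the `discount` weight are hypothetical.

```python
import numpy as np


def select_key_term(features, M_inv):
    """Return the index of the key-term with the largest width x^T M^{-1} x.

    Greedy uncertainty sampling, used here as an assumed stand-in for an
    optimal-design selection rule.
    """
    widths = np.einsum("ij,jk,ik->i", features, M_inv, features)
    return int(np.argmax(widths))


def update_with_pseudo_feedback(M, b, features, adjacency, chosen, reward, discount=0.5):
    """Weighted ridge-regression update with assumed pseudo feedback.

    The chosen key-term gets a full-weight update. Each graph neighbour gets
    the same reward with weight `discount`, which assumes that related
    key-terms elicit similar feedback (a simplification that can bias the
    estimate when the assumption fails).
    """
    x = features[chosen]
    M = M + np.outer(x, x)
    b = b + reward * x
    for j in np.flatnonzero(adjacency[chosen]):
        xj = features[j]
        M = M + discount * np.outer(xj, xj)
        b = b + discount * reward * xj
    return M, b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(5, 3))                  # hypothetical key-term features
    adjacency = np.eye(5, k=1) + np.eye(5, k=-1)        # hypothetical path graph
    theta_true = np.array([0.5, -0.2, 0.1])             # hypothetical user preference
    M, b = np.eye(3), np.zeros(3)                       # ridge prior
    for _ in range(20):
        k = select_key_term(features, np.linalg.inv(M))
        reward = float(features[k] @ theta_true + 0.1 * rng.normal())
        M, b = update_with_pseudo_feedback(M, b, features, adjacency, k, reward)
    print("estimated preference:", np.linalg.solve(M, b))
```

With `discount=0` the update reduces to an ordinary ridge-regression bandit update, which makes it easy to compare learning with and without the assumed propagation step.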




              Published In

              WWW '22: Proceedings of the ACM Web Conference 2022
              April 2022
              3764 pages
              ISBN:9781450390965
              DOI:10.1145/3485447


              Publisher

              Association for Computing Machinery

              New York, NY, United States


              Author Tags

              1. conversational recommender
              2. online learning

              Qualifiers

              • Research-article
              • Research
              • Refereed limited

              Conference

              WWW '22: The ACM Web Conference 2022
              April 25 - 29, 2022
              Virtual Event, Lyon, France

              Acceptance Rates

              Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


              Article Metrics

              • Downloads (last 12 months): 119
              • Downloads (last 6 weeks): 8
              Reflects downloads up to 27 Jul 2024


              Cited By

              • (2024) Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 796-806. DOI: 10.1145/3626772.3657815. Online publication date: 10-Jul-2024
              • (2024) Toward joint utilization of absolute and relative bandit feedback for conversational recommendation. User Modeling and User-Adapted Interaction. DOI: 10.1007/s11257-023-09388-5. Online publication date: 27-Jan-2024
              • (2023) Confident Action Decision via Hierarchical Policy Learning for Conversational Recommendation. Proceedings of the ACM Web Conference 2023, 1386-1395. DOI: 10.1145/3543507.3583536. Online publication date: 30-Apr-2023
              • (2023) Meta Policy Learning for Cold-Start Conversational Recommendation. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 222-230. DOI: 10.1145/3539597.3570443. Online publication date: 27-Feb-2023
              • (2023) Conversational Contextual Bandits: The Generalized Linear Case. 2023 8th International Conference on Computer and Communication Systems (ICCCS), 1-8. DOI: 10.1109/ICCCS57501.2023.10150958. Online publication date: 21-Apr-2023
              • (2023) Clustering of conversational bandits with posterior sampling for user preference learning and elicitation. User Modeling and User-Adapted Interaction 33:5, 1065-1112. DOI: 10.1007/s11257-023-09358-x. Online publication date: 6-Mar-2023
              • (2022) A Multi-Armed Bandit Recommender Algorithm Based on Conversation and KNN. Proceedings of the 2022 5th International Conference on Algorithms, Computing and Artificial Intelligence, 1-6. DOI: 10.1145/3579654.3579714. Online publication date: 23-Dec-2022
              • (2022) Bandit Learning in Many-to-One Matching Markets. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2088-2097. DOI: 10.1145/3511808.3557248. Online publication date: 17-Oct-2022
