research-article

Knowledge-aware Conversational Preference Elicitation with Bandit Feedback

Authors:

Canzhe Zhao,

Tong Yu,

Zhihui Xie,

Shuai LiAuthors Info & Claims

WWW '22: Proceedings of the ACM Web Conference 2022

Pages 483 - 492

https://doi.org/10.1145/3485447.3512152

Published: 25 April 2022 Publication History

Get Access

Abstract

Conversational recommender systems (CRSs) have been proposed recently to mitigate the cold-start problem suffered by the traditional recommender systems. By introducing conversational key-terms, existing conversational recommenders can effectively reduce the need for extensive exploration and elicit the user preferences faster and more accurately. However, existing conversational recommenders leveraging key-terms heavily rely on the availability and quality of the key-terms, and their performances might degrade significantly when the key-terms are incomplete or not well labeled, which usually happens when there are new items being consistently incorporated into the systems and involving lots of human efforts to acquire well-labeled key-terms is costly. Besides, existing CRS methods leverage the feedback to different conversational key-terms separately, without considering the underlying relations between the key-terms. In this case, the learning of the conversational recommenders is sample inefficient, especially when there is a large number of candidate conversational key-terms.

In this paper, we propose a knowledge-aware conversational preference elicitation framework and a bandit-based algorithm GraphConUCB. To achieve efficient preference elicitation given items with incompletely labeled key-terms, our algorithm leverage the underlying relations between the key-terms, guided by the knowledge graph. Being knowledge-aware, our algorithm propagates the user preferences via a pseudo graph feedback module, which also accelerates the exploration in the large action space of key-terms and improves the conversational sample efficiency. To select the most informative conversational key-terms in the graphs to conduct conversations, we further devise a graph-based optimal design module which leverages the graph structure. We provide the theoretical analysis of the regret upper bound for GraphConUCB. With extensive experiments, we show that our algorithm can effectively handle the items with incompletely labeled key-terms, and improves over the state-of-the-art baselines significantly.

References

[1]

Yasin Abbasi-Yadkori, Dávid Pál, and Csaba Szepesvári. 2011. Improved Algorithms for Linear Stochastic Bandits. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, Granada, Spain. 2312–2320.

Abstract

References

Cited By

Index Terms

Recommendations

Comparison-based Conversational Recommender System with Relative Bandit Feedback

Clustering of Conversational Bandits for User Preference Learning and Elicitation

Hierarchical Conversational Preference Elicitation with Bandit Feedback

Comments

Information

Published In

Sponsors

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Get Access

Login options

Full Access

View options

PDF

eReader

HTML Format

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations