Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3488560.3498471acmconferencesArticle/Chapter ViewAbstractPublication PageswsdmConference Proceedingsconference-collections
research-article

Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning

Published: 15 February 2022 Publication History

Abstract

Since the inception of Recommender Systems (RS), the accuracy of the recommendations in terms of relevance has been the golden criterion for evaluating the quality of RS algorithms. However, by focusing on item relevance, one pays a significant price in terms of other important metrics: users get stuck in a "filter bubble" and their array of options is significantly reduced, hence degrading the quality of the user experience and leading to churn. Recommendation, and in particular session-based/sequential recommendation, is a complex task with multiple - and often conflicting objectives - that existing state-of-the-art approaches fail to address. In this work, we take on the aforementioned challenge and introduce Scalarized Multi-Objective Reinforcement Learning (SMORL) for the RS setting, a novel Reinforcement Learning (RL) framework that can effectively address multi-objective recommendation tasks. The proposed SMORL agent augments standard recommendation models with additional RL layers that enforce it to simultaneously satisfy three principal objectives: accuracy, diversity, and novelty of recommendations. We integrate this framework with four state-of-the-art session-based recommendation models and compare it with a single-objective RL agent that only focuses on accuracy. Our experimental results on two real-world datasets reveal a substantial increase in aggregate diversity, a moderate increase in accuracy, reduced repetitiveness of recommendations, and demonstrate the importance of reinforcing diversity and novelty as complementary objectives.

Supplementary Material

MP4 File (wsdmfp517-stamenkovic.mp4)
Presentation for 'Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning' by Dusan Stamenkovic

References

[1]
Ashton Anderson, Lucas Maystre, Ian Anderson, Rishabh Mehrotra, and Mounia Lalmas. 2020. Algorithmic effects on the diversity of consumption on spotify. In Proceedings of The Web Conference 2020. 2155--2165.
[2]
Azin Ashkan, Branislav Kveton, Shlomo Berkovsky, and Zheng Wen. 2015. Optimal Greedy Diversity for Recommendation. In IJCAI, Vol. 15. 1742--1748.
[3]
Aishwariya Budhrani, Akashkumar Patel, and Shivam Ribadiya. 2020. Music2Vec: Music Genre Classification and Recommendation System. In 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) . IEEE, 1406--1411.
[4]
Laming Chen, Guoxin Zhang, and Hanning Zhou. 2017. Fast greedy map inference for determinantal point process to improve recommendation diversity. arXiv preprint arXiv:1709.05135 (2017).
[5]
Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, and Ed H Chi. 2019 a. Top-k off-policy correction for a REINFORCE recommender system. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining . 456--464.
[6]
Xinshi Chen, Shuang Li, Hui Li, Shaohua Jiang, Yuan Qi, and Le Song. 2019 b. Generative adversarial user model for reinforcement learning based recommendation system. In International Conference on Machine Learning . PMLR, 1052--1061.
[7]
Peizhe Cheng, Shuaiqiang Wang, Jun Ma, Jiankai Sun, and Hui Xiong. 2017. Learning to recommend accurate and diverse items. In Proceedings of the 26th international conference on World Wide Web. 183--192.
[8]
Kyunghyun Cho, Bart Van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259 (2014).
[9]
Daniel M Fleder and Kartik Hosanagar. 2007. Recommender systems and their impact on sales diversity. In Proceedings of the 8th ACM conference on Electronic commerce. 192--199.
[10]
Christian Hansen, Rishabh Mehrotra, Casper Hansen, Brian Brost, Lucas Maystre, and Mounia Lalmas. 2021. Shifting Consumption towards Diverse Content on Music Streaming Platforms. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining . 238--246.
[11]
Hado V Hasselt. 2010. Double Q-learning. In Advances in neural information processing systems. 2613--2621.
[12]
Jonathan L Herlocker, Joseph A Konstan, Loren G Terveen, and John T Riedl. 2004. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems (TOIS), Vol. 22, 1 (2004), 5--53.
[13]
Balázs Hidasi and Alexandros Karatzoglou. 2018. Recurrent neural networks with top-k gains for session-based recommendations. In Proceedings of the 27th ACM international conference on information and knowledge management. 843--852.
[14]
Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015).
[15]
Rong Hu and Pearl Pu. 2011. Helping Users Perceive Recommendation Diversity.
[16]
Yujing Hu, Qing Da, Anxiang Zeng, Yang Yu, and Yinghui Xu. 2018. Reinforcement learning to rank in e-commerce search engine: Formalization, analysis, and application. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining . 368--377.
[17]
Sheena S Iyengar and Mark R Lepper. 1999. Rethinking the value of choice: a cultural perspective on intrinsic motivation. Journal of personality and social psychology, Vol. 76, 3 (1999), 349.
[18]
Kalervo J"arvelin and Jaana Kek"al"ainen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems (TOIS), Vol. 20, 4 (2002), 422--446.
[19]
Wang-Cheng Kang and Julian McAuley. 2018. Self-attentive sequential recommendation. In 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 197--206.
[20]
Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[21]
Neal Lathia, Stephen Hailes, Licia Capra, and Xavier Amatriain. 2010. Temporal diversity in recommender systems. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval . 210--217.
[22]
Frédéric Lavancier, Jesper Møller, and Ege Rubak. 2015. Determinantal point process models and statistical inference. Journal of the Royal Statistical Society: Series B: Statistical Methodology (2015), 853--877.
[23]
Ye Ma, Lu Zong, Yikang Yang, and Jionglong Su. 2019. News2vec: News network embedding with subnode information. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) . 4845--4854.
[24]
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
[25]
Nikola Milojkovic, Diego Antognini, Giancarlo Bergamin, Boi Faltings, and Claudiu Musat. 2019. Multi-gradient descent for multi-objective recommender systems. arXiv preprint arXiv:2001.00846 (2019).
[26]
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013).
[27]
Hossam Mossalam, Yannis M Assael, Diederik M Roijers, and Shimon Whiteson. 2016. Multi-objective deep reinforcement learning. arXiv preprint arXiv:1610.02707 (2016).
[28]
Eli Pariser. 2011. The filter bubble: What the Internet is hiding from you .Penguin UK.
[29]
Lijing Qin and Xiaoyan Zhu. 2013. Promoting diversity in recommendation by entropy regularizer. In Twenty-Third International Joint Conference on Artificial Intelligence. Citeseer.
[30]
Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2012. BPR: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618 (2012).
[31]
Marco Tulio Ribeiro, Anisio Lacerda, Adriano Veloso, and Nivio Ziviani. 2012. Pareto-efficient hybridization for multi-objective recommender systems. In Proceedings of the sixth ACM conference on Recommender systems. 19--26.
[32]
Barry Schwartz. 2004. The paradox of choice: Why less is more. New York: Ecco (2004).
[33]
Chaofeng Sha, Xiaowei Wu, and Junyu Niu. 2016. A Framework for Recommending Relevant and Diverse Items. In IJCAI, Vol. 16. 3868--3874.
[34]
Wenjie Shang, Yang Yu, Qingyang Li, Zhiwei Qin, Yiping Meng, and Jieping Ye. 2019. Environment reconstruction with hidden confounders for reinforcement learning based recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining . 566--576.
[35]
Cass R Sunstein and Edna Ullmann-Margalit. 1999. Second-order decisions. Ethics, Vol. 110, 1 (1999), 5--31.
[36]
Jiaxi Tang and Ke Wang. 2018. Personalized top-n sequential recommendation via convolutional sequence embedding. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining . 565--573.
[37]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems. 5998--6008.
[38]
Romain Warlop, Jérémie Mary, and Mike Gartrell. 2019. Tensorized determinantal point processes for recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining . 1605--1615.
[39]
C Ch White, CC III WHITE, and KIM KW. 1980. Solution procedures for vector criterion Markov decision processes. (1980).
[40]
Mark Wilhelm, Ajith Ramanathan, Alexander Bonomo, Sagar Jain, Ed H Chi, and Jennifer Gillenwater. 2018. Practical diversified recommendations on youtube with determinantal point processes. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. 2165--2173.
[41]
Ronald J Williams. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, Vol. 8, 3--4 (1992), 229--256.
[42]
Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, and Joemon M Jose. 2020. Self-Supervised Reinforcement Learning forRecommender Systems. arXiv preprint arXiv:2006.05779 (2020).
[43]
Fajie Yuan, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M Jose, and Xiangnan He. 2019. A simple convolutional generative network for next item recommendation. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining . 582--590.
[44]
Yuan Cao Zhang, Diarmuid Ó Séaghdha, Daniele Quercia, and Tamas Jambor. 2012. Auralist: introducing serendipity into music recommendation. In Proceedings of the fifth ACM international conference on Web search and data mining . 13--22.
[45]
Xiangyu Zhao, Liang Zhang, Zhuoye Ding, Long Xia, Jiliang Tang, and Dawei Yin. 2018. Recommendations with negative feedback via pairwise deep reinforcement learning. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1040--1048.
[46]
Guanjie Zheng, Fuzheng Zhang, Zihan Zheng, Yang Xiang, Nicholas Jing Yuan, Xing Xie, and Zhenhui Li. 2018. DRN: A deep reinforcement learning framework for news recommendation. In Proceedings of the 2018 World Wide Web Conference. 167--176.
[47]
Lixin Zou, Long Xia, Zhuoye Ding, Jiaxing Song, Weidong Liu, and Dawei Yin. 2019. Reinforcement learning to optimize long-term user engagement in recommender systems. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2810--2818.

Cited By

View all
  • (2024)Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement LearningProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671618(5669-5679)Online publication date: 25-Aug-2024
  • (2024)Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action ModelingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657767(375-385)Online publication date: 10-Jul-2024
  • (2024)Treatment Effect Estimation for User Interest Exploration on Recommender SystemsProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657736(1861-1871)Online publication date: 10-Jul-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
February 2022
1690 pages
ISBN:9781450391320
DOI:10.1145/3488560
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 February 2022

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. diversity
  2. multi-objective reinforcement learning
  3. novelty
  4. recommendation
  5. reinforcement learning

Qualifiers

  • Research-article

Funding Sources

  • The National Key R&D Program of China
  • Natural Science Foundation of China

Conference

WSDM '22

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)195
  • Downloads (Last 6 weeks)19
Reflects downloads up to 12 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement LearningProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671618(5669-5679)Online publication date: 25-Aug-2024
  • (2024)Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action ModelingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657767(375-385)Online publication date: 10-Jul-2024
  • (2024)Treatment Effect Estimation for User Interest Exploration on Recommender SystemsProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657736(1861-1871)Online publication date: 10-Jul-2024
  • (2024)LabelCraft: Empowering Short Video Recommendations with Automated Label CraftingProceedings of the 17th ACM International Conference on Web Search and Data Mining10.1145/3616855.3635816(28-37)Online publication date: 4-Mar-2024
  • (2024)Full-stage Diversified Recommendation: Large-scale Online Experiments in Short-video PlatformProceedings of the ACM Web Conference 202410.1145/3589334.3648144(4565-4574)Online publication date: 13-May-2024
  • (2024)Result Diversification in Search and Recommendation: A SurveyIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.338226236:10(5354-5373)Online publication date: Oct-2024
  • (2024)A reinforcement learning recommender system using bi-clustering and Markov Decision ProcessExpert Systems with Applications: An International Journal10.1016/j.eswa.2023.121541237:PBOnline publication date: 1-Feb-2024
  • (2024)Improving recommendation diversity without retraining from scratchInternational Journal of Data Science and Analytics10.1007/s41060-024-00518-9Online publication date: 10-Mar-2024
  • (2024)Explainable recommender system directed by reconstructed explanatory factors and multi‐modal matrix factorizationConcurrency and Computation: Practice and Experience10.1002/cpe.820836:21Online publication date: 19-Jun-2024
  • (2023)Deep Learning Models for Serendipity Recommendations: A Survey and New PerspectivesACM Computing Surveys10.1145/360514556:1(1-26)Online publication date: 26-Aug-2023
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media