DOI: 10.1145/3488560.3498471

Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning

Published: 15 February 2022

Abstract

    Since the inception of Recommender Systems (RS), the accuracy of recommendations in terms of relevance has been the gold standard for evaluating the quality of RS algorithms. However, by focusing on item relevance, one pays a significant price in terms of other important metrics: users get stuck in a "filter bubble" and their array of options is significantly reduced, which degrades the quality of the user experience and leads to churn. Recommendation, and in particular session-based/sequential recommendation, is a complex task with multiple, often conflicting, objectives that existing state-of-the-art approaches fail to address. In this work, we take on this challenge and introduce Scalarized Multi-Objective Reinforcement Learning (SMORL) for the RS setting, a novel Reinforcement Learning (RL) framework that can effectively address multi-objective recommendation tasks. The proposed SMORL agent augments standard recommendation models with additional RL layers that force them to simultaneously satisfy three principal objectives: accuracy, diversity, and novelty of recommendations. We integrate this framework with four state-of-the-art session-based recommendation models and compare it with a single-objective RL agent that focuses only on accuracy. Our experimental results on two real-world datasets reveal a substantial increase in aggregate diversity, a moderate increase in accuracy, and reduced repetitiveness of recommendations, and they demonstrate the importance of reinforcing diversity and novelty as complementary objectives.
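
As a rough illustration of what "scalarized" means in this setting, the sketch below collapses a vector of per-objective value estimates (accuracy, diversity, novelty) into a single score with a weighted sum and then picks the highest-scoring item. It is only a minimal sketch, not the SMORL implementation: the abstract does not specify the scalarization function, and the names `scalarize` and `greedy_action`, the weight vectors, and the toy numbers are all assumptions made for this example.

```python
from typing import Sequence

# Hypothetical objective order assumed throughout this sketch.
OBJECTIVES = ("accuracy", "diversity", "novelty")


def scalarize(values: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse one value per objective into a single score (weighted sum)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def greedy_action(q_values: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> int:
    """Return the index of the item with the largest scalarized value.

    q_values[i] holds the per-objective value estimates for candidate item i.
    """
    scores = [scalarize(q, weights) for q in q_values]
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up value estimates for three candidate items (assumption, not data
# from the paper). Columns follow OBJECTIVES.
q_values = [
    [0.9, 0.1, 0.0],  # item 0: relevant, but neither diverse nor novel
    [0.6, 0.7, 0.5],  # item 1: balanced across the three objectives
    [0.2, 0.9, 0.9],  # item 2: diverse and novel, but less relevant
]

accuracy_only = [1.0, 0.0, 0.0]  # single-objective baseline
balanced = [1.0, 0.5, 0.5]       # hypothetical multi-objective weights

print(greedy_action(q_values, accuracy_only))
print(greedy_action(q_values, balanced))
```

With the accuracy-only weights the sketch selects item 0; putting some weight on diversity and novelty shifts the choice to item 1, which is the kind of trade-off a scalarized multi-objective agent is meant to make.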

    Supplementary Material

    MP4 File (wsdmfp517-stamenkovic.mp4)
    Presentation for 'Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning' by Dusan Stamenkovic

    Information & Contributors

    Information

    Published In

    WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
    February 2022
    1690 pages
    ISBN: 9781450391320
    DOI: 10.1145/3488560
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. diversity
    2. multi-objective reinforcement learning
    3. novelty
    4. recommendation
    5. reinforcement learning

    Qualifiers

    • Research-article

    Funding Sources

    • The National Key R&D Program of China
    • Natural Science Foundation of China

    Conference

    WSDM '22

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%

    Article Metrics

    • Downloads (Last 12 months): 217
    • Downloads (Last 6 weeks): 13

    Cited By

    • (2024) Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action Modeling. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 375-385. DOI: 10.1145/3626772.3657767. Online publication date: 10-Jul-2024
    • (2024) Treatment Effect Estimation for User Interest Exploration on Recommender Systems. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1861-1871. DOI: 10.1145/3626772.3657736. Online publication date: 10-Jul-2024
    • (2024) LabelCraft: Empowering Short Video Recommendations with Automated Label Crafting. Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 28-37. DOI: 10.1145/3616855.3635816. Online publication date: 4-Mar-2024
    • (2024) A reinforcement learning recommender system using bi-clustering and Markov Decision Process. Expert Systems with Applications: An International Journal, 237:PB. DOI: 10.1016/j.eswa.2023.121541. Online publication date: 1-Feb-2024
    • (2024) Improving recommendation diversity without retraining from scratch. International Journal of Data Science and Analytics. DOI: 10.1007/s41060-024-00518-9. Online publication date: 10-Mar-2024
    • (2024) Explainable recommender system directed by reconstructed explanatory factors and multi-modal matrix factorization. Concurrency and Computation: Practice and Experience. DOI: 10.1002/cpe.8208. Online publication date: 19-Jun-2024
    • (2023) Deep Learning Models for Serendipity Recommendations: A Survey and New Perspectives. ACM Computing Surveys, 56:1, 1-26. DOI: 10.1145/3605145. Online publication date: 26-Aug-2023
    • (2023) Broadening the Scope: Evaluating the Potential of Recommender Systems beyond prioritizing Accuracy. Proceedings of the 17th ACM Conference on Recommender Systems, 1139-1145. DOI: 10.1145/3604915.3610649. Online publication date: 14-Sep-2023
    • (2023) Reproducibility of Multi-Objective Reinforcement Learning Recommendation: Interplay between Effectiveness and Beyond-Accuracy Perspectives. Proceedings of the 17th ACM Conference on Recommender Systems, 467-478. DOI: 10.1145/3604915.3609493. Online publication date: 14-Sep-2023
    • (2023) A Systematic Study on Reproducibility of Reinforcement Learning in Recommendation Systems. ACM Transactions on Recommender Systems, 1:3, 1-23. DOI: 10.1145/3596519. Online publication date: 14-Jul-2023
