DOI: 10.1145/3523227.3551485
Short paper · Open access

Multiobjective Evaluation of Reinforcement Learning Based Recommender Systems

Published: 13 September 2022
Abstract

    The MovieLens dataset has become a default choice for recommender systems evaluation. In this paper we analyze the best strategies of a Reinforcement Learning agent on the MovieLens (1M) dataset, studying the balance between precision and diversity of recommendations. We found that trivial strategies are able to maximize ranking quality criteria but are useless for users of the recommendation system due to the lack of diversity in the final predictions. Our proposed method stimulates the agent to explore the environment using the stochasticity of Ornstein-Uhlenbeck processes. Experiments show that optimizing the drift coefficient of the Ornstein-Uhlenbeck process improves the diversity of recommendations while maintaining high nDCG and HR scores. To the best of our knowledge, the analysis of agent strategies in recommendation environments has not been studied extensively in previous works.
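    The exploration mechanism the abstract describes builds on Ornstein-Uhlenbeck noise as popularized by DDPG-style continuous-control agents: temporally correlated noise dx = theta * (mu - x) dt + sigma * dW added to the actor's action. A minimal sketch follows; the class name, parameter names, and default values are illustrative, not taken from the paper. The drift coefficient theta is the quantity the authors report optimizing.

    ```python
    import numpy as np

    class OrnsteinUhlenbeckNoise:
        """Ornstein-Uhlenbeck process: dx = theta * (mu - x) * dt + sigma * dW.

        Samples temporally correlated noise, commonly added to a DDPG
        actor's continuous action to encourage exploration.
        """

        def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=0):
            self.mu = mu * np.ones(size)
            self.theta = theta  # drift coefficient: pull strength back toward mu
            self.sigma = sigma  # diffusion coefficient: scale of the Wiener term
            self.dt = dt
            self.rng = np.random.default_rng(seed)
            self.reset()

        def reset(self):
            # Restart the process at its long-run mean (e.g. per episode).
            self.x = self.mu.copy()

        def sample(self):
            # Euler-Maruyama step of the OU stochastic differential equation.
            dx = (self.theta * (self.mu - self.x) * self.dt
                  + self.sigma * np.sqrt(self.dt)
                  * self.rng.standard_normal(self.x.shape))
            self.x = self.x + dx
            return self.x
    ```

    In a DDPG-style training loop, `noise.sample()` would be added to the actor's action before clipping to the action bounds. A larger theta pulls the noise back toward mu faster, producing less correlated wandering; a smaller theta lets exploration drift further, which is the diversity/precision trade-off the paper studies.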

    Supplementary Material

    MP4 File (recsys22-160_video.mp4)


    Cited By

    • (2023) Interpreting Decision Process in Offline Reinforcement Learning for Interactive Recommendation Systems. Neural Information Processing, 270-286. https://doi.org/10.1007/978-981-99-8138-0_22. Online publication date: 26-Nov-2023.
    • UISA: User Information Separating Architecture for Commodity Recommendation Policy with Deep Reinforcement Learning. ACM Transactions on Recommender Systems. https://doi.org/10.1145/3654806.


    Published In

    RecSys '22: Proceedings of the 16th ACM Conference on Recommender Systems
    September 2022
    743 pages
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Deep Deterministic Policy Gradient (DDPG)
    2. Deep Reinforcement Learning
    3. Ornstein-Uhlenbeck processes
    4. Recommendation Systems
    5. noise injection

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate 254 of 1,295 submissions, 20%
