DOI: 10.1145/3523227.3551485
Short paper · Open access

Multiobjective Evaluation of Reinforcement Learning Based Recommender Systems

Published: 13 September 2022
Abstract

    The MovieLens dataset has become a default choice for recommender systems evaluation. In this paper we analyze the best strategies of a Reinforcement Learning agent on the MovieLens (1M) dataset, studying the balance between precision and diversity of recommendations. We found that trivial strategies are able to maximize ranking quality criteria but are useless for users of the recommendation system due to the lack of diversity in the final predictions. Our proposed method stimulates the agent to explore the environment using the stochasticity of Ornstein-Uhlenbeck processes. Experiments show that optimizing the drift coefficient of the Ornstein-Uhlenbeck process improves the diversity of recommendations while maintaining high nDCG and HR scores. To the best of our knowledge, the analysis of agent strategies in recommendation environments has not been studied extensively in previous works.
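    The exploration mechanism the abstract describes builds on Ornstein-Uhlenbeck noise as popularized by DDPG-style continuous-control agents: temporally correlated noise dx = theta * (mu - x) dt + sigma * dW added to the actor's action. A minimal sketch follows; the class name, parameter names, and default values are illustrative, not taken from the paper. The drift coefficient theta is the quantity the authors report optimizing.

    ```python
    import numpy as np

    class OrnsteinUhlenbeckNoise:
        """Ornstein-Uhlenbeck process: dx = theta * (mu - x) * dt + sigma * dW.

        Samples temporally correlated noise, commonly added to a DDPG
        actor's continuous action to encourage exploration.
        """

        def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=0):
            self.mu = mu * np.ones(size)
            self.theta = theta  # drift coefficient: pull strength back toward mu
            self.sigma = sigma  # diffusion coefficient: scale of the Wiener term
            self.dt = dt
            self.rng = np.random.default_rng(seed)
            self.reset()

        def reset(self):
            # Restart the process at its long-run mean (e.g. per episode).
            self.x = self.mu.copy()

        def sample(self):
            # Euler-Maruyama step of the OU stochastic differential equation.
            dx = (self.theta * (self.mu - self.x) * self.dt
                  + self.sigma * np.sqrt(self.dt)
                  * self.rng.standard_normal(self.x.shape))
            self.x = self.x + dx
            return self.x
    ```

    In a DDPG-style training loop, `noise.sample()` would be added to the actor's action before clipping to the action bounds. A larger theta pulls the noise back toward mu faster, producing less correlated wandering; a smaller theta lets exploration drift further, which is the diversity/precision trade-off the paper studies.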

    Supplementary Material

    MP4 File (recsys22-160_video.mp4)


    Cited By

    • (2023) Interpreting Decision Process in Offline Reinforcement Learning for Interactive Recommendation Systems. Neural Information Processing, 270-286. https://doi.org/10.1007/978-981-99-8138-0_22. Online publication date: 26-Nov-2023.
    • UISA: User Information Separating Architecture for Commodity Recommendation Policy with Deep Reinforcement Learning. ACM Transactions on Recommender Systems. https://doi.org/10.1145/3654806.


    Published In

    RecSys '22: Proceedings of the 16th ACM Conference on Recommender Systems
    September 2022
    743 pages
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Deep Deterministic Policy Gradient (DDPG)
    2. Deep Reinforcement Learning
    3. Ornstein-Uhlenbeck processes
    4. Recommendation Systems
    5. noise injection

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate 254 of 1,295 submissions, 20%
