DOI: 10.1145/3523227.3547409
CONSEQUENCES — Causality, Counterfactuals and Sequential Decision-Making for Recommender Systems

Published: 13 September 2022

Abstract

Recommender systems are increasingly modelled as repeated decision-making processes: deciding which items, or which ranking of items, to recommend to a given user. Each decision to recommend or rank an item significantly affects immediate and future user responses, long-term satisfaction or engagement with the system, and possibly valuable exposure for the item provider. This interactive, interventionist view of the recommender uncovers a plethora of unanswered research questions, as it complicates the offline evaluation and learning procedures typically adopted in the field. We need an understanding of causal inference to reason about the (possibly unintended) consequences of the recommender, and a notion of counterfactuals to answer common “what if”-type questions in learning and evaluation. Advances at the intersection of these fields can foster progress in effective, efficient and fair learning and evaluation from logged data. These topics have been emerging in the Recommender Systems community for some time, but we firmly believe in the value of a dedicated forum in which to learn and exchange ideas. We welcome contributions from both academia and industry, and bring together a growing community of researchers and practitioners interested in sequential decision-making, offline evaluation, batch policy learning, fairness in online platforms, and related tasks such as A/B testing.
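To make the “what if” question concrete: off-policy (counterfactual) evaluation asks how a new recommendation policy would have performed on logged interactions collected under a different policy. A standard starting point is the inverse propensity scoring (IPS) estimator. The sketch below is purely illustrative (the function names and toy data are our own, not from the workshop), assuming the logging policy's action propensities were recorded:

```python
# Minimal sketch of inverse propensity scoring (IPS) for off-policy
# evaluation. Illustrative only; names and data are hypothetical.

def ips_estimate(logs, target_policy):
    """Estimate the value of `target_policy` from bandit feedback
    collected under a (different) logging policy.

    logs: iterable of (context, action, propensity, reward) tuples,
          where `propensity` is the logging policy's probability of
          choosing `action` in `context`.
    target_policy: function (context, action) -> probability that the
          new policy would choose `action` in `context`.
    """
    total = 0.0
    n = 0
    for context, action, propensity, reward in logs:
        # Re-weight each observed reward by how much more (or less)
        # likely the target policy is to take the logged action.
        total += (target_policy(context, action) / propensity) * reward
        n += 1
    return total / n


# Toy usage: a uniform logging policy over two actions, and a target
# policy that deterministically recommends action 0.
logs = [
    ("user_1", 0, 0.5, 1.0),  # logged action 0 received reward 1
    ("user_2", 1, 0.5, 0.0),  # logged action 1 received reward 0
    ("user_3", 0, 0.5, 1.0),
]
greedy = lambda context, action: 1.0 if action == 0 else 0.0
print(ips_estimate(logs, greedy))  # unbiased estimate of greedy's value
```

In expectation this recovers the target policy's true value, but the importance weights can blow up its variance when the two policies diverge, which is exactly what the doubly robust, self-normalized, and shrinkage estimators discussed in this research area aim to mitigate.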


Cited By

  • (2024) Ranking the causal impact of recommendations under collider bias in k-spots recommender systems. ACM Transactions on Recommender Systems 2(2), 1–29. DOI: 10.1145/3643139
  • (2024) Δ-OPE: Off-Policy Estimation with Pairs of Policies. In Proceedings of the 18th ACM Conference on Recommender Systems, 878–883. DOI: 10.1145/3640457.3688162
  • (2024) Multi-Objective Recommendation via Multivariate Policy Learning. In Proceedings of the 18th ACM Conference on Recommender Systems, 712–721. DOI: 10.1145/3640457.3688132
  • (2024) Optimal Baseline Corrections for Off-Policy Contextual Bandits. In Proceedings of the 18th ACM Conference on Recommender Systems, 722–732. DOI: 10.1145/3640457.3688105
  • (2024) CONSEQUENCES — The 3rd Workshop on Causality, Counterfactuals and Sequential Decision-Making for Recommender Systems. In Proceedings of the 18th ACM Conference on Recommender Systems, 1206–1209. DOI: 10.1145/3640457.3687095
  • (2024) Ad-load Balancing via Off-policy Learning in a Content Marketplace. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 586–595. DOI: 10.1145/3616855.3635846
  • (2023) CONSEQUENCES — The 2nd Workshop on Causality, Counterfactuals and Sequential Decision-Making for Recommender Systems. In Proceedings of the 17th ACM Conference on Recommender Systems, 1223–1226. DOI: 10.1145/3604915.3608749
  • (2023) Causal Recommendation: Progresses and Future Directions. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3432–3435. DOI: 10.1145/3539618.3594245

          Published In

          RecSys '22: Proceedings of the 16th ACM Conference on Recommender Systems
          September 2022
          743 pages
          Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Author Tags

          1. counterfactuals
          2. fairness in rankings
          3. off-policy evaluation and learning
          4. recommender systems

          Qualifiers

          • Introduction
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate 254 of 1,295 submissions, 20%
