DOI: 10.1145/3523227.3547409
CONSEQUENCES — Causality, Counterfactuals and Sequential Decision-Making for Recommender Systems

Published: 13 September 2022

Abstract

Recommender systems are increasingly modelled as repeated decision-making processes: deciding which items, or which ranking of items, to recommend to a given user. Each decision to recommend or rank an item significantly affects immediate and future user responses, long-term satisfaction or engagement with the system, and possibly valuable exposure for the item provider. This interactive, interventionist view of the recommender uncovers a plethora of unanswered research questions, as it complicates the offline evaluation and learning procedures typically adopted in the field. We need an understanding of causal inference to reason about the (possibly unintended) consequences of the recommender, and a notion of counterfactuals to answer common “what if”-type questions in learning and evaluation. Advances at the intersection of these fields can foster progress in effective, efficient and fair learning and evaluation from logged data. These topics have been emerging in the Recommender Systems community for some time, but we firmly believe in the value of a dedicated forum in which to learn and exchange ideas. We welcome contributions from both academia and industry, and bring together a growing community of researchers and practitioners interested in sequential decision-making, offline evaluation, batch policy learning, fairness in online platforms, and related tasks such as A/B testing.
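To make the “what if” question concrete: off-policy (counterfactual) evaluation asks how a new recommendation policy would have performed on logged interactions collected under a different policy. A standard starting point is the inverse propensity scoring (IPS) estimator. The sketch below is purely illustrative (the function names and toy data are our own, not from the workshop), assuming the logging policy's action propensities were recorded:

```python
# Minimal sketch of inverse propensity scoring (IPS) for off-policy
# evaluation. Illustrative only; names and data are hypothetical.

def ips_estimate(logs, target_policy):
    """Estimate the value of `target_policy` from bandit feedback
    collected under a (different) logging policy.

    logs: iterable of (context, action, propensity, reward) tuples,
          where `propensity` is the logging policy's probability of
          choosing `action` in `context`.
    target_policy: function (context, action) -> probability that the
          new policy would choose `action` in `context`.
    """
    total = 0.0
    n = 0
    for context, action, propensity, reward in logs:
        # Re-weight each observed reward by how much more (or less)
        # likely the target policy is to take the logged action.
        total += (target_policy(context, action) / propensity) * reward
        n += 1
    return total / n


# Toy usage: a uniform logging policy over two actions, and a target
# policy that deterministically recommends action 0.
logs = [
    ("user_1", 0, 0.5, 1.0),  # logged action 0 received reward 1
    ("user_2", 1, 0.5, 0.0),  # logged action 1 received reward 0
    ("user_3", 0, 0.5, 1.0),
]
greedy = lambda context, action: 1.0 if action == 0 else 0.0
print(ips_estimate(logs, greedy))  # unbiased estimate of greedy's value
```

In expectation this recovers the target policy's true value, but the importance weights can blow up its variance when the two policies diverge, which is exactly what the doubly robust, self-normalized, and shrinkage estimators discussed in this research area aim to mitigate.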


Cited By

  • (2024) Ranking the causal impact of recommendations under collider bias in k-spots recommender systems. ACM Transactions on Recommender Systems 2(2), 1–29. DOI: 10.1145/3643139
  • (2024) Δ-OPE: Off-Policy Estimation with Pairs of Policies. In Proceedings of the 18th ACM Conference on Recommender Systems, 878–883. DOI: 10.1145/3640457.3688162
  • (2024) Multi-Objective Recommendation via Multivariate Policy Learning. In Proceedings of the 18th ACM Conference on Recommender Systems, 712–721. DOI: 10.1145/3640457.3688132
  • (2024) Optimal Baseline Corrections for Off-Policy Contextual Bandits. In Proceedings of the 18th ACM Conference on Recommender Systems, 722–732. DOI: 10.1145/3640457.3688105
  • (2024) CONSEQUENCES — The 3rd Workshop on Causality, Counterfactuals and Sequential Decision-Making for Recommender Systems. In Proceedings of the 18th ACM Conference on Recommender Systems, 1206–1209. DOI: 10.1145/3640457.3687095
  • (2024) Ad-load Balancing via Off-policy Learning in a Content Marketplace. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 586–595. DOI: 10.1145/3616855.3635846
  • (2023) CONSEQUENCES — The 2nd Workshop on Causality, Counterfactuals and Sequential Decision-Making for Recommender Systems. In Proceedings of the 17th ACM Conference on Recommender Systems, 1223–1226. DOI: 10.1145/3604915.3608749
  • (2023) Causal Recommendation: Progresses and Future Directions. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3432–3435. DOI: 10.1145/3539618.3594245

          Published In

          RecSys '22: Proceedings of the 16th ACM Conference on Recommender Systems
          September 2022
          743 pages
          Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Author Tags

          1. counterfactuals
          2. fairness in rankings
          3. off-policy evaluation and learning
          4. recommender systems

          Qualifiers

          • Introduction
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate 254 of 1,295 submissions, 20%
