research-article

Bounding System-Induced Biases in Recommender Systems with a Randomized Dataset

Authors: Pengxiang Cheng, Xiaolian Zhang, Zhong Ming

ACM Transactions on Information Systems, Volume 41, Issue 4

Article No.: 108, Pages 1 - 26

https://doi.org/10.1145/3582002

Published: 08 April 2023

Abstract

Debiased recommendation with a randomized dataset has shown promising results in mitigating system-induced biases. However, compared with the better-studied routes that do not use a randomized dataset, it still lacks theoretical insights and an ideal optimization objective function. To bridge this gap, we study the debiasing problem from a new perspective and propose to directly minimize an upper bound of an ideal objective function, which offers a potentially better solution to system-induced biases. First, we formulate a new ideal optimization objective function with a randomized dataset. Second, according to the prior constraints that an adopted loss function may satisfy, we derive two different upper bounds of the objective function: a generalization error bound with the triangle inequality and a generalization error bound with separability. Third, we show that most existing related methods can be regarded as insufficient optimizations of these two upper bounds. Fourth, we propose a novel method called debiasing approximate upper bound (DUB) with a randomized dataset, which achieves a more sufficient optimization of these upper bounds. Finally, we conduct extensive experiments on a public dataset and a real product dataset to verify the effectiveness of DUB.
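
To make the premise of the abstract concrete, the following is a minimal sketch of why a randomized dataset is useful as a reference for an ideal objective. It is not the DUB method and does not reproduce any bound from the article. It assumes, for illustration only, that the "ideal" risk is the average loss over all user-item pairs, that the deployed system exposes liked items far more often than disliked ones, and that the randomized dataset exposes every pair with the same small probability. All names (`simulate`, `mean_loss`), the exposure probabilities, and the constant predictor are hypothetical.

```python
import random


def squared_loss(prediction, label):
    """Squared error between a predicted score and a binary label."""
    return (prediction - label) ** 2


def mean_loss(pairs, labels, prediction):
    """Average squared loss of a constant predictor over the given pairs."""
    return sum(squared_loss(prediction, labels[p]) for p in pairs) / len(pairs)


def simulate(num_users=200, num_items=50, seed=0):
    """Compare risk estimates from a biased log and a randomized log.

    Assumptions (illustrative, not from the article):
    - 30% of user-item pairs carry a positive label;
    - the deployed system logs positives with prob. 0.5, negatives with 0.05;
    - the randomized policy logs every pair with prob. 0.05.
    """
    rng = random.Random(seed)
    all_pairs = [(u, i) for u in range(num_users) for i in range(num_items)]
    labels = {p: 1 if rng.random() < 0.3 else 0 for p in all_pairs}

    # System-induced bias: exposure depends on the (unknown) label.
    biased_log = [
        p for p in all_pairs if rng.random() < (0.5 if labels[p] else 0.05)
    ]
    # Randomized dataset: exposure is independent of the label.
    randomized_log = [p for p in all_pairs if rng.random() < 0.05]

    prediction = 0.2  # a fixed, deliberately simple predictor
    return {
        "ideal": mean_loss(all_pairs, labels, prediction),
        "biased": mean_loss(biased_log, labels, prediction),
        "randomized": mean_loss(randomized_log, labels, prediction),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>10}: {value:.3f}")
```

Under these assumptions, the risk measured on the randomized log should land close to the full-matrix risk, while the risk measured on the biased log is pulled toward the loss on positive pairs. The randomized log is also much smaller than the biased one, which is one reason methods in this area combine both sources rather than training on randomized data alone.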

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems

Published In

ACM Transactions on Information Systems Volume 41, Issue 4

October 2023

958 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/3587261

Editor:
Min Zhang
Tsinghua University, China

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 April 2023

Online AM: 24 January 2023

Accepted: 09 January 2023

Revised: 18 September 2022

Received: 06 February 2022

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
