DOI: 10.1145/3511808.3557624
short-paper

KuaiRand: An Unbiased Sequential Recommendation Dataset with Randomly Exposed Videos

Published: 17 October 2022
    Abstract

    Recommender systems deployed in real-world applications can have inherent exposure bias, which leads to biased logged data that plagues researchers. A fundamental way to address this thorny problem is to collect users' interactions on randomly exposed items, i.e., the missing-at-random data. A few works have asked certain users to rate or select randomly recommended items, e.g., Yahoo!, Coat, and OpenBandit. However, these datasets are either too small in size or lack key information, such as unique user IDs or the features of users/items. In this work, we present KuaiRand, an unbiased sequential recommendation dataset containing millions of intervened interactions on randomly exposed videos, collected from the video-sharing mobile app Kuaishou. Different from existing datasets, KuaiRand records 12 kinds of user feedback signals (e.g., click, like, and view time) on randomly exposed videos inserted in the recommendation feeds over two weeks. To facilitate model learning, we further collect rich features of users and items as well as users' behavior history. By releasing this dataset, we enable research on advanced debiasing in large-scale recommendation scenarios for the first time. Also, with its distinctive features, KuaiRand can support various other research directions such as interactive recommendation, long sequential behavior modeling, and multi-task learning. The dataset is available at https://kuairand.com.
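To illustrate why randomly exposed items matter, here is a minimal simulation sketch. It is not taken from the paper and does not use the KuaiRand data: the function `simulate`, the synthetic click probabilities, and the squared-probability logging policy are all assumptions made for illustration. It compares the average click rate observed under a biased exposure policy with the one observed under uniform random exposure.

```python
import random


def simulate(n_items=50, n_impressions=20000, seed=0):
    """Compare click-rate estimates under biased and uniform exposure.

    All quantities are synthetic; nothing here comes from a real dataset.
    """
    rng = random.Random(seed)
    # Hypothetical ground-truth click probability of each item.
    click_prob = [rng.uniform(0.05, 0.6) for _ in range(n_items)]
    true_mean = sum(click_prob) / n_items

    def mean_click_rate(weights):
        # Expose items according to `weights`, then average the observed clicks.
        shown = rng.choices(range(n_items), weights=weights, k=n_impressions)
        clicks = sum(rng.random() < click_prob[i] for i in shown)
        return clicks / n_impressions

    # Assumed biased logging policy: attractive items are exposed far more often.
    biased = mean_click_rate([p ** 2 for p in click_prob])
    # Random exposure: every item is equally likely to be shown.
    uniform = mean_click_rate([1.0] * n_items)
    return true_mean, biased, uniform


if __name__ == "__main__":
    true_mean, biased, uniform = simulate()
    print(f"true mean click rate:      {true_mean:.3f}")
    print(f"biased-exposure estimate:  {biased:.3f}")
    print(f"uniform-exposure estimate: {uniform:.3f}")
```

In this toy setting the biased log overstates the average click rate because it over-samples attractive items, while the uniformly exposed log stays close to the true mean. That is the property a randomly exposed dataset is meant to provide.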

    References

    [1]
    Léon Bottou, Jonas Peters, Joaquin Quiñonero-Candela, Denis X Charles, D Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard, and Ed Snelson. 2013. Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising. Journal of Machine Learning Research, Vol. 14, 11 (2013).
    [2]
    Qingpeng Cai, Ruohan Zhan, Chi Zhang, Jie Zheng, Guangwei Ding, Pinghua Gong, Dong Zheng, and Peng Jiang. 2022. Constrained Reinforcement Learning for Short Video Recommendation. arXiv preprint arXiv:2205.13248 (2022).
    [3]
    Jiawei Chen, Hande Dong, Xiang Wang, Fuli Feng, Meng Wang, and Xiangnan He. 2020. Bias and Debias in Recommender System: A Survey and Future Directions. arXiv preprint arXiv:2010.03240 (2020).
    [4]
    Chongming Gao, Wenqiang Lei, Jiawei Chen, Shiqi Wang, Xiangnan He, Shijun Li, Biao Li, Yuan Zhang, and Peng Jiang. 2022a. CIRS: Bursting Filter Bubbles by Counterfactual Interactive Recommender System. arXiv preprint arXiv:2204.01266 (2022).
    [5]
    Chongming Gao, Wenqiang Lei, Xiangnan He, Maarten de Rijke, and Tat-Seng Chua. 2021. Advances and Challenges in Conversational Recommender Systems: A Survey. AI Open, Vol. 2 (2021), 100--126. https://doi.org/10.1016/j.aiopen.2021.06.002
    [6]
    Chongming Gao, Shijun Li, Wenqiang Lei, Jiawei Chen, Biao Li, Peng Jiang, Xiangnan He, Jiaxin Mao, and Tat-Seng Chua. 2022b. KuaiRec: A Fully-observed Dataset and Insights for Evaluating Recommender Systems. arXiv preprint arXiv:2202.10842 (2022).
    [7]
    Alexandre Gilotte, Clément Calauzènes, Thomas Nedelec, Alexandre Abraham, and Simon Dollé. 2018. Offline A/B Testing for Recommender Systems. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (Marina Del Rey, CA, USA). Association for Computing Machinery, New York, NY, USA, 198--206. https://doi.org/10.1145/3159652.3159687
    [8]
    Eugene Ie, Chih-wei Hsu, Martin Mladenov, Vihan Jain, Sanmit Narvekar, Jing Wang, Rui Wu, and Craig Boutilier. 2019. Recsim: A Configurable Simulation Platform for Recommender Systems. arXiv preprint arXiv:1909.04847 (2019).
    [9]
    Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated Gain-Based Evaluation of IR Techniques. ACM Transactions on Information Systems (TOIS), Vol. 20, 4 (Oct. 2002), 422--446. https://doi.org/10.1145/582415.582418
    [10]
    Wenqiang Lei, Chongming Gao, and Maarten de Rijke. 2021. RecSys 2021 Tutorial on Conversational Recommendation: Formulation, Methods, and Evaluation. In Fifteenth ACM Conference on Recommender Systems (Amsterdam, Netherlands) (RecSys '21). Association for Computing Machinery, New York, NY, USA, 842--844. https://doi.org/10.1145/3460231.3473325
    [11]
    Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. 2010. A Contextual-Bandit Approach to Personalized News Article Recommendation. In Proceedings of the 19th International Conference on World Wide Web (Raleigh, North Carolina, USA). Association for Computing Machinery, New York, NY, USA, 661--670. https://doi.org/10.1145/1772690.1772758
    [12]
    Xiao Lin, Hongjie Chen, Changhua Pei, Fei Sun, Xuanji Xiao, Hanxiao Sun, Yongfeng Zhang, Wenwu Ou, and Peng Jiang. 2019. A Pareto-efficient Algorithm for Multiple Objective Optimization in E-commerce Recommendation. In Proceedings of the 13th ACM Conference on recommender systems (RecSys '19). 20--28.
    [13]
    Dugang Liu, Pengxiang Cheng, Zhenhua Dong, Xiuqiang He, Weike Pan, and Zhong Ming. 2020. A General Knowledge Distillation Framework for Counterfactual Recommendation via Uniform Data. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (Virtual Event, China). Association for Computing Machinery, New York, NY, USA, 831--840. https://doi.org/10.1145/3397271.3401083
    [14]
    David C. Liu, Stephanie Rogers, Raymond Shiau, Dmitry Kislyuk, Kevin C. Ma, Zhigang Zhong, Jenny Liu, and Yushi Jing. 2017. Related Pins at Pinterest: The Evolution of a Real-World Recommender System. In Proceedings of the 26th International Conference on World Wide Web Companion (Perth, Australia) (WWW '17 Companion). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 583--592. https://doi.org/10.1145/3041021.3054202
    [15]
    Xiao Ma, Liqin Zhao, Guan Huang, Zhi Wang, Zelin Hu, Xiaoqiang Zhu, and Kun Gai. 2018. Entire Space Multi-task Model: An Effective Approach for Estimating Post-click Conversion Rate. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR '18). 1137--1140.
    [16]
    Benjamin M. Marlin and Richard S. Zemel. 2009. Collaborative Prediction and Ranking with Non-Random Missing Data. In Proceedings of the Third ACM Conference on Recommender Systems (New York, New York, USA). Association for Computing Machinery, New York, NY, USA, 5--12. https://doi.org/10.1145/1639714.1639717
    [17]
    Qi Pi, Weijie Bian, Guorui Zhou, Xiaoqiang Zhu, and Kun Gai. 2019. Practice on Long Sequential User Behavior Modeling for Click-Through Rate Prediction. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (Anchorage, AK, USA) (KDD '19). Association for Computing Machinery, New York, NY, USA, 2671--2679. https://doi.org/10.1145/3292500.3330666
    [18]
    Qi Pi, Guorui Zhou, Yujing Zhang, Zhe Wang, Lejian Ren, Ying Fan, Xiaoqiang Zhu, and Kun Gai. 2020. Search-Based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (Virtual Event, Ireland) (CIKM '20). Association for Computing Machinery, New York, NY, USA, 2685--2692. https://doi.org/10.1145/3340531.3412744
    [19]
    Yuta Saito, Shunsuke Aihara, Megumi Matsutani, and Yusuke Narita. 2021. Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). https://openreview.net/forum?id=tyn3MYS_uDT
    [20]
    Yuta Saito, Suguru Yaginuma, Yuta Nishino, Hayato Sakata, and Kazuhide Nakata. 2020. Unbiased Recommender Learning from Missing-Not-At-Random Implicit Feedback. In Proceedings of the 13th International Conference on Web Search and Data Mining (Houston, TX, USA). Association for Computing Machinery, New York, NY, USA, 501--509. https://doi.org/10.1145/3336191.3371783
    [21]
    Tobias Schnabel, Paul N. Bennett, Susan T. Dumais, and Thorsten Joachims. 2018. Short-Term Satisfaction and Long-Term Coverage: Understanding How Users Tolerate Algorithmic Exploration. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (Marina Del Rey, CA, USA). Association for Computing Machinery, New York, NY, USA, 513--521. https://doi.org/10.1145/3159652.3159700
    [22]
    Tobias Schnabel, Adith Swaminathan, Ashudeep Singh, Navin Chandak, and Thorsten Joachims. 2016. Recommendations as Treatments: Debiasing Learning and Evaluation. In Proceedings of the 33rd International Conference on International Conference on Machine Learning (New York, NY, USA). JMLR.org, 1670--1679.
    [23]
    Adith Swaminathan and Thorsten Joachims. 2015. Counterfactual Risk Minimization: Learning from Logged Bandit Feedback. In International Conference on Machine Learning. PMLR, 814--823.
    [24]
    CIW TEAM. 2022. Short video app Kuaishou e-commerce GMV up 48% in Q1 2022. https://www.chinainternetwatch.com/31784/kuaishou-quarterly/. Accessed June 1, 2022.
    [25]
    Hongwei Wang, Fuzheng Zhang, Miao Zhao, Wenjie Li, Xing Xie, and Minyi Guo. 2019. Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation. In The World Wide Web Conference (San Francisco, CA, USA) (WWW '19). Association for Computing Machinery, New York, NY, USA, 2000--2010. https://doi.org/10.1145/3308558.3313411
    [26]
    Shiqi Wang, Chongming Gao, Min Gao, Junliang Yu, Zongwei Wang, and Hongzhi Yin. 2022. Who Are the Best Adopters? User Selection Model for Free Trial Item Promotion. arXiv preprint arXiv:2202.09508 (2022).
    [27]
    Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, and Joemon M. Jose. 2020. Self-Supervised Reinforcement Learning for Recommender Systems. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, New York, NY, USA, 931--940. https://doi.org/10.1145/3397271.3401147
    [28]
    Chenxiao Yang, Junwei Pan, Xiaofeng Gao, Tingyu Jiang, Dapeng Liu, and Guihai Chen. 2022. Cross-Task Knowledge Distillation in Multi-Task Recommendation. AAAI '22 (2022).
    [29]
    Longqi Yang, Yin Cui, Yuan Xuan, Chenyang Wang, Serge Belongie, and Deborah Estrin. 2018. Unbiased Offline Recommender Evaluation for Missing-Not-at-Random Implicit Feedback. In Proceedings of the 12th ACM Conference on Recommender Systems (Vancouver, British Columbia, Canada). Association for Computing Machinery, New York, NY, USA, 279--287. https://doi.org/10.1145/3240323.3240355
    [30]
    Shuo Zhang and Krisztian Balog. 2020. Evaluating Conversational Recommender Systems via User Simulation. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York, NY, USA, 1512--1520. https://doi.org/10.1145/3394486.3403202
    [31]
    Guorui Zhou, Na Mou, Ying Fan, Qi Pi, Weijie Bian, Chang Zhou, Xiaoqiang Zhu, and Kun Gai. 2019. Deep Interest Evolution Network for Click-through Rate Prediction. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence (Honolulu, Hawaii, USA) (AAAI '19). AAAI Press, Article 729, 8 pages. https://doi.org/10.1609/aaai.v33i01.33015941
    [32]
    Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, and Kun Gai. 2018. Deep Interest Network for Click-Through Rate Prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (London, United Kingdom). Association for Computing Machinery, New York, NY, USA, 1059--1068. https://doi.org/10.1145/3219819.3219823



      Published In

      CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
      October 2022
      5274 pages
      ISBN:9781450392365
      DOI:10.1145/3511808
      General Chairs: Mohammad Al Hasan, Li Xiong
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. datasets
      2. long sequence
      3. random exposure
      4. recommendation

      Qualifiers

      • Short-paper

      Funding Sources

      • The National Natural Science Foundation of China

      Conference

      CIKM '22

      Acceptance Rates

      CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 140
      • Downloads (Last 6 weeks): 12

      Cited By

      • (2024) Deep Causal Reasoning for Recommendations. ACM Transactions on Intelligent Systems and Technology 15:4 (1-25). DOI: 10.1145/3653985. Online publication date: 18-Jun-2024
      • (2024) Mitigating Exposure Bias in Recommender Systems – A Comparative Analysis of Discrete Choice Models. ACM Transactions on Recommender Systems. DOI: 10.1145/3641291. Online publication date: 27-Jan-2024
      • (2024) SiTunes: A Situational Music Recommendation Dataset with Physiological and Psychological Signals. Proceedings of the 2024 Conference on Human Information Interaction and Retrieval (417-421). DOI: 10.1145/3627508.3638343. Online publication date: 10-Mar-2024
      • (2024) EEG-SVRec: An EEG Dataset with User Multidimensional Affective Engagement Labels in Short Video Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (698-708). DOI: 10.1145/3626772.3657890. Online publication date: 10-Jul-2024
      • (2024) EasyRL4Rec: An Easy-to-use Library for Reinforcement Learning Based Recommender Systems. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (977-987). DOI: 10.1145/3626772.3657868. Online publication date: 10-Jul-2024
      • (2024) SIGformer: Sign-aware Graph Transformer for Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (1274-1284). DOI: 10.1145/3626772.3657747. Online publication date: 10-Jul-2024
      • (2024) IncMSR: An Incremental Learning Approach for Multi-Scenario Recommendation. Proceedings of the 17th ACM International Conference on Web Search and Data Mining (939-948). DOI: 10.1145/3616855.3635828. Online publication date: 4-Mar-2024
      • (2024) Collaboration and Transition: Distilling Item Transitions into Multi-Query Self-Attention for Sequential Recommendation. Proceedings of the 17th ACM International Conference on Web Search and Data Mining (1003-1011). DOI: 10.1145/3616855.3635787. Online publication date: 4-Mar-2024
      • (2024) CDCM: ChatGPT-Aided Diversity-Aware Causal Model for Interactive Recommendation. IEEE Transactions on Multimedia 26 (6488-6500). DOI: 10.1109/TMM.2024.3352397. Online publication date: 2024
      • (2024) Modeling multi-behavior sequence via HyperGRU contrastive network for micro-video recommendation. Knowledge-Based Systems 295 (111841). DOI: 10.1016/j.knosys.2024.111841. Online publication date: Jul-2024
