
The 2021 RecSys Challenge Dataset: Fairness is not optional

Authors:

Luca Belli,

Alykhan Tejani*,

Frank Portman*,

Alexandre Lung-Yut-Fong*,

Wenzhe Shi

RecSysChallenge '21: Proceedings of the Recommender Systems Challenge 2021

Pages 1 - 6

https://doi.org/10.1145/3487572.3487573

Published: 22 November 2021

Abstract

Following the success of the RecSys 2020 Challenge, we describe a new, larger dataset released in conjunction with the ACM RecSys Challenge 2021. This year’s dataset is not only bigger (~1B data points, a 5-fold increase), but for the first time it also takes fairness aspects of the challenge into consideration. Unlike many static datasets, it was kept in sync with the Twitter platform, which took considerable effort: if a user deleted their content, that content was promptly removed from the dataset too. In this paper, we introduce the dataset and challenge, highlighting some of the issues that arise when creating recommender systems at Twitter scale.


Published In

RecSysChallenge '21: Proceedings of the Recommender Systems Challenge 2021

October 2021

43 pages

ISBN:9781450386937

DOI:10.1145/3487572

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

RecSysChallenge 2021

RecSysChallenge 2021: Proceedings of the Recommender Systems Challenge 2021

October 1, 2021

Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 11 of 15 submissions, 73%

