The MovieLens Datasets: History and Context

Authors: F. Maxwell Harper and Joseph A. Konstan

ACM Transactions on Interactive Intelligent Systems (TiiS), Volume 5, Issue 4

Article No.: 19, Pages 1 - 19

https://doi.org/10.1145/2827872

Published: 22 December 2015

Abstract

The MovieLens datasets are widely used in education, research, and industry. They are downloaded hundreds of thousands of times each year, reflecting their use in popular press programming books, traditional and online courses, and software. These datasets are a product of member activity in the MovieLens movie recommendation system, an active research platform that has hosted many experiments since its launch in 1997. This article documents the history of MovieLens and the MovieLens datasets. We include a discussion of lessons learned from running a long-standing, live research platform from the perspective of a research organization. We document best practices and limitations of using the MovieLens datasets in new research.

Published In

ACM Transactions on Interactive Intelligent Systems Volume 5, Issue 4

Regular Articles and Special issue on New Directions in Eye Gaze for Interactive Intelligent Systems (Part 1 of 2)

January 2016

118 pages

ISSN:2160-6455

EISSN:2160-6463

DOI:10.1145/2866565

Editors:
Anthony Jameson, German Research Center for Artificial Intelligence (DFKI), Germany
Krzysztof Gajos, Harvard University, U.S.A.

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 December 2015

Accepted: 01 October 2015

Revised: 01 October 2015

Received: 01 July 2015

Published in TIIS Volume 5, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Science Foundation
Google
CFK Productions
Net Perceptions
University of Minnesota's Undergraduate Research Opportunities Program
