
The MovieLens Datasets: History and Context

Published: 22 December 2015

Abstract

    The MovieLens datasets are widely used in education, research, and industry. They are downloaded hundreds of thousands of times each year, reflecting their use in popular press programming books, traditional and online courses, and software. These datasets are a product of member activity in the MovieLens movie recommendation system, an active research platform that has hosted many experiments since its launch in 1997. This article documents the history of MovieLens and the MovieLens datasets. We include a discussion of lessons learned from running a long-standing, live research platform from the perspective of a research organization. We document best practices and limitations of using the MovieLens datasets in new research.






    Information & Contributors


    Published In

    ACM Transactions on Interactive Intelligent Systems  Volume 5, Issue 4
    Regular Articles and Special issue on New Directions in Eye Gaze for Interactive Intelligent Systems (Part 1 of 2)
    January 2016
    118 pages
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 December 2015
    Accepted: 01 October 2015
    Revised: 01 October 2015
    Received: 01 July 2015
    Published in TIIS Volume 5, Issue 4



    Author Tags

    1. Datasets
    2. MovieLens
    3. ratings
    4. recommendations


    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Science Foundation
    • Google
    • CFK Productions
    • Net Perceptions
    • University of Minnesota's Undergraduate Research Opportunities Program


    Other Metrics

    Bibliometrics & Citations


    Article Metrics

    • Downloads (Last 12 months): 1,128
    • Downloads (Last 6 weeks): 119



    Cited By

    • (2024) On the Potential and Limitations of Proxy Voting: Delegation with Incomplete Votes. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems. DOI: 10.5555/3635637.3662851 (49-57). Online publication date: 6-May-2024
    • (2024) Generalized neural collaborative filtering. Korean Journal of Applied Statistics 37:3 (311-322). DOI: 10.5351/KJAS.2024.37.3.311. Online publication date: 30-Jun-2024
    • (2024) A Snapshot Survey of Data Acquisition Forms in Multi-Attribute Decision-Making Studies. Big Data Quantification for Complex Decision-Making. DOI: 10.4018/979-8-3693-1582-8.ch009 (219-246). Online publication date: 31-May-2024
    • (2024) Narrative Threads and Cinematic Connections Using Intelligent Systems to Enhance Movie Recommendations with Market Basket Analysis and Advanced Algorithms. Data-Driven Business Intelligence Systems for Socio-Technical Organizations. DOI: 10.4018/979-8-3693-1210-0.ch013 (319-364). Online publication date: 23-Feb-2024
    • (2024) Enhancing Personalized Recommendations: A Study on the Efficacy of Multi-Task Learning and Feature Integration. Information 15:6 (312). DOI: 10.3390/info15060312. Online publication date: 27-May-2024
    • (2024) IUAutoTimeSVD++: A Hybrid Temporal Recommender System Integrating Item and User Features Using a Contractive Autoencoder. Information 15:4 (204). DOI: 10.3390/info15040204. Online publication date: 5-Apr-2024
    • (2024) Harnessing Test-Oriented Knowledge Graphs for Enhanced Test Function Recommendation. Electronics 13:8 (1547). DOI: 10.3390/electronics13081547. Online publication date: 18-Apr-2024
    • (2024) Multi-Channel Hypergraph Collaborative Filtering with Attribute Inference. Electronics 13:5 (903). DOI: 10.3390/electronics13050903. Online publication date: 27-Feb-2024
    • (2024) Outlier Detection and Prediction in Evolving Communities. Applied Sciences 14:6 (2356). DOI: 10.3390/app14062356. Online publication date: 11-Mar-2024
    • (2024) Integration of Deep Reinforcement Learning with Collaborative Filtering for Movie Recommendation Systems. Applied Sciences 14:3 (1155). DOI: 10.3390/app14031155. Online publication date: 30-Jan-2024
