MobileRec: A Large Scale Dataset for Mobile Apps Recommendation

Published: 18 July 2023


Recommender systems have become ubiquitous in our digital lives, from recommending products on e-commerce websites to suggesting movies and music on streaming platforms. Existing recommendation datasets, such as Amazon Product Reviews and MovieLens, have greatly facilitated the research and development of recommender systems in their respective domains. Although the number of mobile users and applications (aka apps) has increased exponentially over the past decade, research in mobile app recommender systems, unlike recommendation for products, movies, and news, has been significantly constrained, primarily due to the lack of high-quality benchmark datasets. To facilitate research on app recommendation systems, we introduce a large-scale dataset called MobileRec. We constructed MobileRec from users' activity on the Google Play Store. MobileRec contains 19.3 million user interactions (i.e., user reviews on apps) with over 10K unique apps across 48 categories. MobileRec records the sequential activity of a total of 0.7 million distinct users. Each of these users has interacted with no fewer than five distinct apps, in contrast to previous datasets on mobile apps, which recorded only a single interaction per user. Furthermore, MobileRec presents users' ratings as well as sentiments on installed apps, and each app contains rich metadata such as app name, category, description, and overall rating, among others. We demonstrate that MobileRec can serve as an excellent testbed for app recommendation through a comparative study of several state-of-the-art recommendation approaches. The MobileRec dataset is available at https://huggingface.co/datasets/recmeapp/mobilerec.

MobileRec is a large-scale mobile app recommendation dataset collected from user reviews on the Google Play Store. It contains 19.3 million user reviews, which serve as user-item interactions, involving 0.7 million unique users and more than 10K items (mobile apps). Each user has a minimum of 5 interactions. The dataset was collected in three steps: package name (unique identifier) collection, metadata collection, and automated downloading of user reviews, which involved dynamic page scrolling and click events. User information is anonymized with a 16-character alphanumeric UID. The dataset also provides each review's timestamp in various formats, along with a rich set of features, e.g., review text, app metadata, the app's user rating, and helpfulness votes. Overall, MobileRec opens new avenues for research in mobile app recommender systems.
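
To illustrate how the sequential, minimum-five-interaction structure described above might be used in practice, the following minimal Python sketch groups review records into time-ordered per-user app sequences and drops users with fewer than five distinct apps. The function name `build_user_sequences`, the `(uid, app, timestamp)` record layout, and the toy identifiers are all assumptions made for illustration; they are not the actual MobileRec schema or the authors' preprocessing code.

```python
from collections import defaultdict


def build_user_sequences(interactions, min_apps=5):
    """Group (uid, app, timestamp) records into time-ordered per-user sequences.

    Users who interacted with fewer than `min_apps` distinct apps are dropped.
    The record layout is a hypothetical stand-in, not the MobileRec schema.
    """
    by_user = defaultdict(list)
    for uid, app, ts in interactions:
        by_user[uid].append((ts, app))

    sequences = {}
    for uid, events in by_user.items():
        # Count distinct apps, not raw reviews, before keeping the user.
        if len({app for _, app in events}) < min_apps:
            continue
        events.sort()  # chronological order; ties fall back to app name
        sequences[uid] = [app for _, app in events]
    return sequences


# Toy records with made-up identifiers (not taken from MobileRec).
toy = [("u1", f"app{i}", 10 - i) for i in range(5)]
toy += [("u2", "app0", 1), ("u2", "app1", 2)]
sequences = build_user_sequences(toy)
```

In this toy input, `u1` has five distinct apps and is kept with its apps ordered by timestamp, while `u2` has only two and is filtered out. Sequences of this shape are the usual input to sequential recommendation models.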


