Differentiating Regularization Weights -- A Simple Mechanism to Alleviate Cold Start in Recommender Systems

Published: 09 January 2019

Abstract

Matrix factorization (MF) and its extensions have been studied extensively in the recommender systems community over the last decade. Essentially, MF searches for low-rank matrices that (1) best approximate the known rating scores and (2) keep a low Frobenius norm to prevent overfitting. Since the two objectives conflict with each other, the common practice is to assign relative importance weights to them as hyper-parameters. The two low-rank matrices returned by MF are often interpreted as the latent factors of users and the latent factors of items that affect a user's rating of an item. As a result, the loss function typically applies one regularization weight λp to the norms of the latent factors of all users and another regularization weight λq to the norms of the latent factors of all items. We argue that such a methodology probably over-simplifies the scenario. Instead, we should probably impose weaker constraints on the latent factors of the items or users that reveal more information, and stronger constraints on the others. In this article, we systematically study this topic. We applied the proposed methodology to three baseline models -- SVD, SVD++, and NMF -- and found that this simple technique improves the prediction accuracy of all of them on several public datasets. Perhaps more importantly, this technique better predicts the ratings of long-tail items, i.e., items that were rated, viewed, or purchased by few users. This suggests that this approach may partially remedy the cold-start issue.
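The contrast described above can be written out as a pair of loss functions. This is only a sketch: the abstract names λp and λq but gives no formulas, so the symbols r_ui (observed rating), p_u and q_i (user and item latent factors), K (the set of observed ratings), and the per-user and per-item multipliers w_u and w_i are notation assumed here for illustration.

```latex
% Conventional MF loss: one weight shared by all users, one shared by all items
L = \sum_{(u,i) \in \mathcal{K}} \left( r_{ui} - p_u^\top q_i \right)^2
    + \lambda_p \sum_{u} \lVert p_u \rVert_2^2
    + \lambda_q \sum_{i} \lVert q_i \rVert_2^2

% Differentiated variant (assumed notation): each user and item gets its own multiplier
L' = \sum_{(u,i) \in \mathcal{K}} \left( r_{ui} - p_u^\top q_i \right)^2
    + \lambda_p \sum_{u} w_u \lVert p_u \rVert_2^2
    + \lambda_q \sum_{i} w_i \lVert q_i \rVert_2^2
```

Under this reading, w_u and w_i would be smaller for users and items with more observed ratings, and setting every multiplier to 1 recovers the conventional loss.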
The proposed method is very general and can be easily applied to various recommendation models, such as Factorization Machines, Field-aware Factorization Machines, Factorizing Personalized Markov Chains, Prod2Vec, Behavior2Vec, and so on. For reproducibility, we release a Python package that integrates the proposed regularization technique with the SVD, SVD++, and NMF models. The package can be accessed at https://github.com/ncu-dart/rdf.
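To make the idea concrete, here is a minimal NumPy sketch of SGD matrix factorization with per-user and per-item regularization weights. It is not the released package and does not reproduce its API. The function name `fit_mf`, its parameters, and in particular the 1/sqrt(count) weighting rule are assumptions chosen for illustration; the abstract only states that users and items revealing more information should be constrained less.

```python
import numpy as np


def fit_mf(ratings, n_users, n_items, k=8, lr=0.01, lam_p=0.1, lam_q=0.1,
           epochs=50, differentiate=True, seed=0):
    """SGD matrix factorization on (user, item, rating) triples.

    Hypothetical helper: when differentiate=True, each user's and item's
    regularization weight is scaled down as its rating count grows.
    """
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))  # user latent factors
    Q = rng.normal(scale=0.1, size=(n_items, k))  # item latent factors

    # Count how many ratings each user and item contributes.
    user_cnt = np.bincount([u for u, _, _ in ratings], minlength=n_users)
    item_cnt = np.bincount([i for _, i, _ in ratings], minlength=n_items)

    if differentiate:
        # ASSUMPTION: 1/sqrt(count) is one plausible decreasing function,
        # not necessarily the weighting used in the article.
        w_u = 1.0 / np.sqrt(np.maximum(user_cnt, 1))
        w_i = 1.0 / np.sqrt(np.maximum(item_cnt, 1))
    else:
        # Conventional setting: the same weight for every user and item.
        w_u = np.ones(n_users)
        w_i = np.ones(n_items)

    for _ in range(epochs):
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - P[u] @ Q[i]
            p_old = P[u].copy()  # use the pre-update user vector for Q's step
            P[u] += lr * (err * Q[i] - lam_p * w_u[u] * P[u])
            Q[i] += lr * (err * p_old - lam_q * w_i[i] * Q[i])
    return P, Q, w_u, w_i


# Toy data: user 0 rated four items, user 1 rated one (a sparse user).
toy = [(0, 0, 5.0), (0, 1, 4.0), (0, 2, 3.0), (0, 3, 4.0), (1, 0, 2.0)]
P, Q, w_u, w_i = fit_mf(toy, n_users=2, n_items=4)
print("user regularization multipliers:", w_u)
```

In this sketch the sparse user keeps the full weight λp while the active user's weight is halved, which is the direction the abstract argues for. Any other decreasing function of the rating count could be swapped in.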

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 13, Issue 1
February 2019
340 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3301280
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 January 2019
Accepted: 01 October 2018
Revised: 01 August 2018
Received: 01 June 2018

Author Tags

  1. Recommender systems
  2. SVD
  3. SVD++
  4. cold start
  5. collaborative filtering
  6. long tail
  7. matrix factorization

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Industrial Technology Research Institute
  • Ministry of Science and Technology
  • CHANGING.AI

Article Metrics

  • Downloads (last 12 months): 30
  • Downloads (last 6 weeks): 2
Reflects downloads up to 01 Sep 2024

Cited By

  • (2024) Clustering-Based Frequent Pattern Mining Framework for Solving Cold-Start Problem in Recommender Systems. IEEE Access 12 (13678-13698). DOI: 10.1109/ACCESS.2024.3355057. Online publication date: 2024
  • (2024) Twittener: Improving News Experience with Sentiment Analysis and Trend Recommendation. Social Computing and Social Media (417-433). DOI: 10.1007/978-3-031-61281-7_30. Online publication date: 1-Jun-2024
  • (2023) Towards addressing item cold-start problem in collaborative filtering by embedding agglomerative clustering and FP-growth into the recommendation system. Computer Science and Information Systems 20:4 (1343-1366). DOI: 10.2298/CSIS221116052K. Online publication date: 2023
  • (2023) Bootstrapped Personalized Popularity for Cold Start Recommender Systems. Proceedings of the 17th ACM Conference on Recommender Systems (715-722). DOI: 10.1145/3604915.3608820. Online publication date: 14-Sep-2023
  • (2023) AutoOpt: Automatic Hyperparameter Scheduling and Optimization for Deep Click-through Rate Prediction. Proceedings of the 17th ACM Conference on Recommender Systems (183-194). DOI: 10.1145/3604915.3608800. Online publication date: 14-Sep-2023
  • (2023) Efficient Retrieval of the Top-k Most Relevant Event-Partner Pairs. IEEE Transactions on Knowledge and Data Engineering 35:3 (2529-2543). DOI: 10.1109/TKDE.2021.3118552. Online publication date: 1-Mar-2023
  • (2023) Detecting Inaccurate Sensors on a Large-Scale Sensor Network Using Centralized and Localized Graph Neural Networks. IEEE Sensors Journal 23:15 (16446-16455). DOI: 10.1109/JSEN.2023.3287270. Online publication date: 1-Aug-2023
  • (2023) User Cold Start Problem in Recommendation Systems: A Systematic Review. IEEE Access 11 (136958-136977). DOI: 10.1109/ACCESS.2023.3338705. Online publication date: 2023
  • (2023) Collaborative filtering with sequential implicit feedback via learning users’ preferences over item-sets. Information Sciences 621 (136-155). DOI: 10.1016/j.ins.2022.11.064. Online publication date: Apr-2023
  • (2023) Dynamic global feature extraction and importance‐correlation selection for the prediction of concentrate copper grade and recovery rate. The Canadian Journal of Chemical Engineering 101:5 (2598-2610). DOI: 10.1002/cjce.24759. Online publication date: 9-Jan-2023
