A Fast Parallel Stochastic Gradient Method for Matrix Factorization in Shared Memory Systems

Authors:

Wei-Sheng Chin,

Chih-Jen Lin

ACM Transactions on Intelligent Systems and Technology (TIST), Volume 6, Issue 1

Article No.: 2, Pages 1 - 24

https://doi.org/10.1145/2668133

Published: 11 March 2015

Abstract

Matrix factorization is known to be an effective method for recommender systems that are given only the ratings from users to items. Currently, the stochastic gradient (SG) method is one of the most popular algorithms for matrix factorization. However, as a sequential approach, SG is difficult to parallelize for handling web-scale problems. In this article, we develop a fast parallel SG method, FPSG, for shared memory systems. By dramatically reducing the cache-miss rate and carefully addressing the load balance of threads, FPSG is more efficient than state-of-the-art parallel algorithms for matrix factorization.
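For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the standard sequential SG update for regularized matrix factorization. It is not the FPSG method itself. The function names, the learning rate `lr`, the regularization weight `reg`, the latent dimension `k`, and the toy ratings are all illustrative assumptions, not values taken from the article.

```python
import math
import random


def sgd_matrix_factorization(ratings, n_users, n_items, k=2,
                             lr=0.05, reg=0.02, epochs=500, seed=0):
    """Sequential SG for matrix factorization (illustrative sketch).

    ratings: list of (user, item, rating) triples for the observed entries.
    Returns factor matrices P (n_users x k) and Q (n_items x k) such that
    the dot product of P[u] and Q[v] approximates the rating of item v by user u.
    """
    rng = random.Random(seed)
    # Small positive random initialization (an assumption of this sketch).
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the observed ratings in random order
        for u, v, r in samples:
            p_u, q_v = P[u], Q[v]
            err = r - sum(a * b for a, b in zip(p_u, q_v))
            for f in range(k):
                p_old, q_old = p_u[f], q_v[f]
                # Each step reads and writes one row of P and one row of Q.
                p_u[f] += lr * (err * q_old - reg * p_old)
                q_v[f] += lr * (err * p_old - reg * q_old)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error over the observed ratings."""
    se = 0.0
    for u, v, r in ratings:
        pred = sum(a * b for a, b in zip(P[u], Q[v]))
        se += (r - pred) ** 2
    return math.sqrt(se / len(ratings))


if __name__ == "__main__":
    # Hypothetical toy data: 3 users, 3 items, 6 observed ratings.
    toy = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
    P, Q = sgd_matrix_factorization(toy, n_users=3, n_items=3)
    print("training RMSE:", rmse(toy, P, Q))
```

Because every update touches the factor rows of one user and one item, two updates that share a user or an item cannot safely run at the same time without coordination, which is one way to see why the sequential method is hard to parallelize.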

Index Terms

1. Mathematics of computing
  1. Mathematical software

Published In

ACM Transactions on Intelligent Systems and Technology Volume 6, Issue 1

April 2015

255 pages

ISSN:2157-6904

EISSN:2157-6912

DOI:10.1145/2745393

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 March 2015

Accepted: 01 September 2014

Revised: 01 July 2014

Received: 01 April 2014

Published in TIST Volume 6, Issue 1

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Taiwan University
National Science Council of Taiwan
MediaTek Fellowship

Bibliometrics

Total Citations: 68
Total Downloads: 824
Downloads (Last 12 months): 10
Downloads (Last 6 weeks): 0

Reflects downloads up to 12 Feb 2025

Citations

Cited By

Luo X, Chen J, Yuan Y, Wang Z. (2024). Pseudo Gradient-Adjusted Particle Swarm Optimization for Accurate Adaptive Latent Factor Analysis. IEEE Transactions on Systems, Man, and Cybernetics: Systems 54:4 (2213-2226). Online publication date: Apr-2024.
https://doi.org/10.1109/TSMC.2023.3340919
Elahi F, Fazlali M, Malazi H, Elahi M. (2024). Parallel Fractional Stochastic Gradient Descent With Adaptive Learning for Recommender Systems. IEEE Transactions on Parallel and Distributed Systems 35:3 (470-483). Online publication date: 1-Mar-2024.
https://dl.acm.org/doi/10.1109/TPDS.2022.3185212
Qin W, Luo X. (2024). Asynchronous Parallel Fuzzy Stochastic Gradient Descent for High-Dimensional Incomplete Data Representation. IEEE Transactions on Fuzzy Systems 32:2 (445-459). Online publication date: Feb-2024.
https://doi.org/10.1109/TFUZZ.2023.3300370
Qin W, Luo X, Zhou M. (2024). Adaptively-Accelerated Parallel Stochastic Gradient Descent for High-Dimensional and Incomplete Data Representation Learning. IEEE Transactions on Big Data 10:1 (92-107). Online publication date: Feb-2024.
https://doi.org/10.1109/TBDATA.2023.3326304
Qin W, Luo X, Li S, Zhou M. (2024). Parallel Adaptive Stochastic Gradient Descent Algorithms for Latent Factor Analysis of High-Dimensional and Incomplete Industrial Data. IEEE Transactions on Automation Science and Engineering 21:3 (2716-2729). Online publication date: Jul-2024.
https://doi.org/10.1109/TASE.2023.3267609
Büyükkaya K, Karsavuran M, Aykanat C. (2024). Stochastic Gradient Descent for matrix completion. Knowledge-Based Systems 283:C. Online publication date: 11-Jan-2024.
https://dl.acm.org/doi/10.1016/j.knosys.2023.111176
Suo J, Han D, Zhao H. (2023). A multiple head selection joint entity-relation extraction model. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 45:4 (5647-5657). Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.3233/JIFS-231766
Abubaker N, Karsavuran M, Aykanat C. (2023). Scaling Stratified Stochastic Gradient Descent for Distributed Matrix Completion. IEEE Transactions on Knowledge and Data Engineering 35:10 (10603-10615). Online publication date: 1-Oct-2023.
https://doi.org/10.1109/TKDE.2023.3253791
Yang Y, Chui T. (2023). Profiling and Pairing Catchments and Hydrological Models With Latent Factor Model. Water Resources Research 59:6. Online publication date: 26-May-2023.
https://doi.org/10.1029/2022WR033684
Khan Z, Chaudhary N, Khan T, Farooq U, Pinto C, Raja M. (2023). Enhanced fractional prediction scheme for effective matrix factorization in chaotic feedback recommender systems. Chaos, Solitons & Fractals 176 (114109). Online publication date: Nov-2023.
https://doi.org/10.1016/j.chaos.2023.114109
