
Spectral Regularization Algorithms for Learning Large Incomplete Matrices

Published: 01 August 2010

Abstract

We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm SOFT-IMPUTE iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity of order linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices; for example, SOFT-IMPUTE takes a few hours to compute low-rank approximations of a 10^6 × 10^6 incomplete matrix with 10^7 observed entries, and fits a rank-95 approximation to the full Netflix training set in 3.3 hours. Our methods achieve good training and test errors and exhibit superior timings when compared to other competitive state-of-the-art techniques.
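The abstract describes the core iteration: fill in the missing entries, compute an SVD, soft-threshold the singular values by the regularization parameter, and repeat with the refilled matrix until it stops changing. The sketch below illustrates that fixed-point loop on a small dense matrix with NumPy. It is a toy version only: the function name, argument names, and stopping rule are my own, and the paper's actual implementation exploits sparse-plus-low-rank structure to compute truncated SVDs in time linear in the matrix dimensions, which a dense `np.linalg.svd` call does not.

```python
import numpy as np

def soft_impute(X, mask, lam, n_iters=500, tol=1e-4):
    """Toy SOFT-IMPUTE-style iteration.

    X    : array with arbitrary values in the unobserved slots
    mask : boolean array, True where X is observed
    lam  : soft-threshold applied to the singular values
    """
    Z = np.where(mask, X, 0.0)  # initialize missing entries with zeros
    for _ in range(n_iters):
        # Observed entries come from X, missing ones from the current estimate
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s_thr = np.maximum(s - lam, 0.0)     # soft-threshold the spectrum
        Z_new = (U * s_thr) @ Vt             # low-rank reconstruction
        # Stop when the iterates have effectively converged
        if np.linalg.norm(Z_new - Z) <= tol * max(np.linalg.norm(Z), 1.0):
            return Z_new
        Z = Z_new
    return Z
```

Because the solution changes smoothly in the regularization parameter, one would compute a path of solutions by calling this on a decreasing grid of `lam` values, passing each solution in as the starting point for the next (the warm starts mentioned above).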




Published In

The Journal of Machine Learning Research, Volume 11 (2010), 3637 pages
ISSN: 1532-4435, EISSN: 1533-7928

Publisher

JMLR.org
