DOI: 10.5555/3121525.3121550
Article

A unified view of matrix factorization models

Published: 15 September 2008

Abstract

We present a unified view of matrix factorization that frames the differences among popular methods, such as NMF, Weighted SVD, E-PCA, MMMF, pLSI, pLSI-pHITS, Bregman co-clustering, and many others, in terms of a small number of modeling choices. Many of these approaches can be viewed as minimizing a generalized Bregman divergence, and we show that (i) a straightforward alternating projection algorithm can be applied to almost any model in our unified view; (ii) the Hessian for each projection has special structure that makes a Newton projection feasible, even when there are equality constraints on the factors, which allows for matrix co-clustering; and (iii) alternating projections can be generalized to simultaneously factor a set of matrices that share dimensions. These observations immediately yield new optimization algorithms for the above factorization methods, and suggest novel generalizations of these methods such as incorporating row and column biases, and adding or relaxing clustering constraints.
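
As a rough illustration of the alternating projection idea in point (i), the sketch below fits a rank-1 factorization under weighted squared loss, which is one instance of a Bregman divergence. It is a minimal example under stated assumptions, not the authors' algorithm: the function names, the toy matrix `X`, the weight matrix `W` (zero marks a missing entry), and the all-ones initialization are all hypothetical choices made for illustration.

```python
def weighted_loss(X, W, u, v):
    """Weighted squared error of the rank-1 reconstruction u v^T."""
    return sum(
        W[i][j] * (X[i][j] - u[i] * v[j]) ** 2
        for i in range(len(X))
        for j in range(len(X[0]))
    )


def update_rows(X, W, u, v):
    """Exactly minimize the loss over the row factor u with v held fixed."""
    new_u = []
    for i in range(len(X)):
        num = sum(W[i][j] * X[i][j] * v[j] for j in range(len(v)))
        den = sum(W[i][j] * v[j] ** 2 for j in range(len(v)))
        # Keep the old value if the row has no usable observations.
        new_u.append(num / den if den > 0 else u[i])
    return new_u


def update_cols(X, W, u, v):
    """Exactly minimize the loss over the column factor v with u held fixed."""
    new_v = []
    for j in range(len(X[0])):
        num = sum(W[i][j] * X[i][j] * u[i] for i in range(len(u)))
        den = sum(W[i][j] * u[i] ** 2 for i in range(len(u)))
        new_v.append(num / den if den > 0 else v[j])
    return new_v


def alternating_projections(X, W, iters=50):
    """Alternate the two exact updates and record the loss after each sweep."""
    u = [1.0] * len(X)      # assumed initialization
    v = [1.0] * len(X[0])
    history = [weighted_loss(X, W, u, v)]
    for _ in range(iters):
        u = update_rows(X, W, u, v)
        v = update_cols(X, W, u, v)
        history.append(weighted_loss(X, W, u, v))
    return u, v, history


# Toy data: the observed entries are consistent with a rank-1 matrix.
# The weight 0.0 in the last position marks X[2][2] as unobserved.
X = [[2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [3.0, 6.0, 0.0]]
W = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
u, v, history = alternating_projections(X, W)
```

Because each update solves its subproblem exactly, the recorded loss in `history` never increases from one sweep to the next. On this toy input the product `u[2] * v[2]` should land close to 9, the value that completes the rank-1 pattern, even though that entry carried zero weight during fitting.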

Published In

ECMLPKDD'08: Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part II
September 2008
694 pages
ISBN: 3540874801
Editors: Walter Daelemans, Bart Goethals, Katharina Morik

Sponsors

  • Google Inc.
  • IBM Research
  • SPSS, Inc.
  • Office of Naval Research Global Science & Technology
  • HP

Publisher

Springer-Verlag, Berlin, Heidelberg

Cited By

  • (2017) Expectile matrix factorization for skewed data analysis. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pp. 259-265. DOI: 10.5555/3298239.3298278. Online publication date: 4-Feb-2017
  • (2017) Learning sparse representations in reinforcement learning with sparse coding. Proceedings of the 26th International Joint Conference on Artificial Intelligence, pp. 2067-2073. DOI: 10.5555/3172077.3172175. Online publication date: 19-Aug-2017
  • (2014) Low-Rank Modeling and Its Applications in Image Analysis. ACM Computing Surveys 47:2, pp. 1-33. DOI: 10.1145/2674559. Online publication date: 19-Dec-2014
  • (2014) A framework for matrix factorization based on general distributions. Proceedings of the 8th ACM Conference on Recommender systems, pp. 249-256. DOI: 10.1145/2645710.2645735. Online publication date: 6-Oct-2014
  • (2014) CoBaFi. Proceedings of the 23rd international conference on World wide web, pp. 97-108. DOI: 10.1145/2566486.2568040. Online publication date: 7-Apr-2014
  • (2013) Retargeted matrix factorization for collaborative filtering. Proceedings of the 7th ACM conference on Recommender systems, pp. 49-56. DOI: 10.1145/2507157.2507185. Online publication date: 12-Oct-2013
  • (2013) Latent outlier detection and the low precision problem. Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description, pp. 46-52. DOI: 10.1145/2500853.2500862. Online publication date: 11-Aug-2013
  • (2013) On the equivalence of PLSI and projected clustering. ACM SIGMOD Record 41:4, pp. 45-50. DOI: 10.1145/2430456.2430469. Online publication date: 17-Jan-2013
  • (2013) Regularized Latent Semantic Indexing. ACM Transactions on Information Systems 31:1, pp. 1-44. DOI: 10.1145/2414782.2414787. Online publication date: 1-Jan-2013
  • (2013) Supervised Dimensionality Reduction via Nonlinear Target Estimation. Proceedings of the 15th International Conference on Data Warehousing and Knowledge Discovery - Volume 8057, pp. 172-183. DOI: 10.1007/978-3-642-40131-2_15. Online publication date: 26-Aug-2013
