Fast Bregman divergence NMF using Taylor expansion and coordinate descent

Authors:

Haesun Park

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 307 - 315

https://doi.org/10.1145/2339530.2339582

Published: 12 August 2012

Abstract

Non-negative matrix factorization (NMF) provides a lower-rank approximation of a matrix. Because nonnegativity is imposed on the factors, it yields a latent structure that is often more physically meaningful than other lower-rank approximations such as the singular value decomposition (SVD). Most of the NMF algorithms proposed in the literature are based on minimizing the Frobenius norm, partly because the Frobenius-norm minimization problem allows much more flexibility in algebraic manipulation than other divergences. In this paper we propose a fast NMF algorithm that is applicable to general Bregman divergences. Through a Taylor series expansion of the Bregman divergences, we reveal a relationship between Bregman divergences and Euclidean distance. Combined with the scalar block coordinate descent method, this key relationship provides a new direction for NMF algorithms with general Bregman divergences. The proposed algorithm generalizes several recently proposed methods for computing NMF with Bregman divergences and is computationally faster than existing alternatives. We demonstrate the effectiveness of our approach with experiments on artificial as well as real-world data.
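The link between Bregman divergences and Euclidean distance mentioned in the abstract can be illustrated with a small numerical sketch. This is not the paper's algorithm; it only checks the standard fact that, for a scalar Bregman divergence D_phi(x, y) = phi(x) - phi(y) - phi'(y)(x - y), the leading Taylor term in x around y is the weighted squared distance 0.5 * phi''(y) * (x - y)^2. The function names and the choice of the generalized KL generator phi(t) = t log t - t are assumptions made for this illustration.

```python
import math


def bregman(phi, dphi, x, y):
    """Scalar Bregman divergence D_phi(x, y) = phi(x) - phi(y) - phi'(y) * (x - y)."""
    return phi(x) - phi(y) - dphi(y) * (x - y)


def quadratic_approx(d2phi, x, y):
    """Leading Taylor term of D_phi(x, y) in x around y: 0.5 * phi''(y) * (x - y)^2."""
    return 0.5 * d2phi(y) * (x - y) ** 2


# Assumed example 1: generalized KL divergence, generated by phi(t) = t log t - t.
def kl_phi(t):
    return t * math.log(t) - t


def kl_dphi(t):
    return math.log(t)


def kl_d2phi(t):
    return 1.0 / t


# Assumed example 2: squared Euclidean distance, generated by phi(t) = t^2 / 2.
# Here phi'' is constant, so the quadratic term reproduces the divergence exactly.
def sq_phi(t):
    return 0.5 * t * t


def sq_dphi(t):
    return t


def sq_d2phi(t):
    return 1.0


if __name__ == "__main__":
    # The gap between the divergence and its quadratic term grows as x moves away from y.
    for x, y in [(2.1, 2.0), (2.5, 2.0), (4.0, 2.0)]:
        exact = bregman(kl_phi, kl_dphi, x, y)
        approx = quadratic_approx(kl_d2phi, x, y)
        print(f"x={x}, y={y}: KL={exact:.6f}, quadratic term={approx:.6f}")
```

Near x = y the quadratic term tracks the divergence closely, which is what makes a Euclidean-style update plausible as a local surrogate; how the paper exploits this within coordinate descent is described in the full text, not in this sketch.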

Published In

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

August 2012

1616 pages

ISBN: 9781450314626

DOI: 10.1145/2339530

General Chair: Qiang Yang (Hong Kong University of Science and Technology)

Program Chairs: Deepak Agarwal (LinkedIn), Jian Pei (Simon Fraser University)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

KDD '12

KDD '12: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 12 - 16, 2012

Beijing, China

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Cited By

Du K, Swamy M, Wang Z, Mow W (2023). Matrix Factorization Techniques in Machine Learning, Signal Processing, and Statistics. Mathematics 11:12 (2674). Online publication date: 12-Jun-2023.
https://doi.org/10.3390/math11122674
Pelizzola M, Laursen R, Hobolth A (2023). Model selection and robust inference of mutational signatures using Negative Binomial non-negative matrix factorization. BMC Bioinformatics 24:1. Online publication date: 8-May-2023.
https://doi.org/10.1186/s12859-023-05304-1
Dalhoumi O, Bouguila N, Amayri M, Fan W (2023). Bayesian Matrix Factorization for Semibounded Data. IEEE Transactions on Neural Networks and Learning Systems 34:6 (3111-3123). Online publication date: Jun-2023.
https://doi.org/10.1109/TNNLS.2021.3111824
Dytso A, Fauß M, Poor H (2022). Bayesian Risk With Bregman Loss: A Cramér–Rao Type Bound and Linear Estimation. IEEE Transactions on Information Theory 68:3 (1985-2000). Online publication date: Mar-2022.
https://doi.org/10.1109/TIT.2021.3130381
Haddock J, Kassab L, Li S, Kryshchenko A, Grotheer R, Sizikova E, Wang C, Merkh T, Madushani R, Ahn M, Needell D, Leonard K (2021). Semi-supervised Nonnegative Matrix Factorization for Document Classification. 2021 55th Asilomar Conference on Signals, Systems, and Computers (1355-1360). Online publication date: 31-Oct-2021.
https://doi.org/10.1109/IEEECONF53345.2021.9723109
Dantas C, Soubies E, Fevotte C (2021). Safe Screening for Sparse Regression with the Kullback-Leibler Divergence. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (5544-5548). Online publication date: 6-Jun-2021.
https://doi.org/10.1109/ICASSP39728.2021.9414183
Truica C, Apostol E (2021). TLATR: Automatic Topic Labeling Using Automatic (Domain-Specific) Term Recognition. IEEE Access 9 (76624-76641). Online publication date: 2021.
https://doi.org/10.1109/ACCESS.2021.3083000
Hien L, Gillis N (2021). Algorithms for Nonnegative Matrix Factorization with the Kullback–Leibler Divergence. Journal of Scientific Computing 87:3. Online publication date: 8-May-2021.
https://doi.org/10.1007/s10915-021-01504-0
Gilad G, Sason I, Sharan R (2020). An automated approach for determining the number of components in non-negative matrix factorization with application to mutational signature learning. Machine Learning: Science and Technology 2:1 (015013). Online publication date: 24-Dec-2020.
https://doi.org/10.1088/2632-2153/abc60a
Du K, Swamy M (2019). Nonnegative Matrix Factorization. Neural Networks and Statistical Learning (427-445). Online publication date: 13-Sep-2019.
https://doi.org/10.1007/978-1-4471-7452-3_14
