DOI: 10.1145/2591796.2591881

Smoothed analysis of tensor decompositions

Published: 31 May 2014

Abstract

Low rank decomposition of tensors is a powerful tool for learning generative models. The uniqueness results that hold for tensors give them a significant advantage over matrices. However, tensors pose serious algorithmic challenges; in particular, much of the matrix algebra toolkit fails to generalize to tensors. Efficient decomposition in the overcomplete case (where rank exceeds dimension) is particularly challenging. We introduce a smoothed analysis model for studying these questions and develop an efficient algorithm for tensor decomposition in the highly overcomplete case (rank polynomial in the dimension). In this setting, we show that our algorithm is robust to inverse polynomial error -- a crucial property for applications in learning since we are only allowed a polynomial number of samples. While algorithms are known for exact tensor decomposition in some overcomplete settings, our main contribution is in analyzing their stability in the framework of smoothed analysis.
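
The contrast with the undercomplete case is worth making concrete. Below is a minimal numerical sketch (ours, not the paper's algorithm) of the classical simultaneous-diagonalization method, which recovers the factors of a third-order tensor only when the rank is at most the dimension and the factor vectors are linearly independent; the function name, shapes, and parameters are illustrative. The overcomplete regime treated in the paper is precisely where this approach breaks down.

    import numpy as np

    def jennrich_decompose(T, rank, seed=0):
        # Recover the a_i from T = sum_i a_i (x) b_i (x) c_i, assuming
        # {a_i} and {b_i} are linearly independent -- so rank <= dimension,
        # the *undercomplete* case only.
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(T.shape[2])
        y = rng.standard_normal(T.shape[2])
        # Random contractions of the third mode: T_x = A diag(C^T x) B^T.
        Tx = np.einsum('ijk,k->ij', T, x)
        Ty = np.einsum('ijk,k->ij', T, y)
        # Eigenvectors of Tx pinv(Ty) = A diag(...) pinv(A) are the columns
        # a_i up to scaling; generically the eigenvalues are distinct.
        eigvals, eigvecs = np.linalg.eig(Tx @ np.linalg.pinv(Ty))
        top = np.argsort(-np.abs(eigvals))[:rank]
        return np.real_if_close(eigvecs[:, top])

When the rank exceeds the dimension, the two contracted slices are rank-deficient n x n matrices and the eigendecomposition no longer identifies the factors, which is why the overcomplete case calls for different techniques.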
Our main technical contribution is to show that tensor products of perturbed vectors are linearly independent in a robust sense (i.e., the associated matrix has singular values that are at least an inverse polynomial). This key result paves the way for applying tensor methods to learning problems in the smoothed setting. In particular, we use it to obtain results for learning multi-view models and mixtures of axis-aligned Gaussians where there are many more "components" than dimensions. The assumption here is that the model is not adversarially chosen, which we formalize by thinking of the model parameters as being perturbed. We believe this is an appealing way to analyze realistic instances of learning problems, since this framework allows us to overcome many of the usual limitations of using tensor methods.
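
This robust linear independence claim is easy to probe numerically. The following sketch (again ours, not the paper's code; n, R, and sigma are illustrative choices) starts from a worst-case instance in which all R vectors are identical, applies a Gaussian perturbation of magnitude sigma as in the smoothed model, and measures the smallest singular value of the n^2 x R matrix whose columns are the vectorized tensor products a_i (x) a_i:

    import numpy as np

    rng = np.random.default_rng(1)
    n, R, sigma = 20, 100, 0.1   # R = 5n "components", far more than n dimensions

    # Adversarial base instance: all columns identical, so before perturbation
    # the Khatri-Rao matrix built below has rank 1.
    base = np.ones((n, R)) / np.sqrt(n)
    A = base + sigma * rng.standard_normal((n, R)) / np.sqrt(n)

    # Column i is vec(a_i a_i^T), the flattened tensor product a_i (x) a_i.
    KR = np.einsum('ir,jr->ijr', A, A).reshape(n * n, R)

    s_min = np.linalg.svd(KR, compute_uv=False)[-1]
    print(f"smallest singular value: {s_min:.4f}")

Empirically, s_min is nonzero and decays only polynomially in n and 1/sigma, even though the unperturbed matrix is singular; this is the phenomenon the main theorem makes rigorous.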

Supplementary Material

MP4 File (p594-sidebyside.mp4)


Published In

STOC '14: Proceedings of the forty-sixth annual ACM symposium on Theory of computing
May 2014
984 pages
ISBN: 9781450327107
DOI: 10.1145/2591796

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 May 2014

Qualifiers

  • Research-article

Conference

STOC '14: Symposium on Theory of Computing
May 31 - June 3, 2014
New York, New York

Acceptance Rates

STOC '14 paper acceptance rate: 91 of 319 submissions, 29%.
Overall acceptance rate: 1,469 of 4,586 submissions, 32%.

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 44
  • Downloads (last 6 weeks): 2
Reflects downloads up to 16 Oct 2024.


Cited By

  • (2024) Speeding up random walk mixing by starting from a uniform vertex. Electronic Journal of Probability, 29. DOI: 10.1214/24-EJP1091
  • (2023) Computational complexity of learning neural networks. Proceedings of the 37th International Conference on Neural Information Processing Systems, 76272-76297. DOI: 10.5555/3666122.3669456
  • (2023) "Intelligent Heuristics Are the Future of Computing". ACM Transactions on Intelligent Systems and Technology, 14(6), 1-39. DOI: 10.1145/3627708
  • (2023) Robustly Learning General Mixtures of Gaussians. Journal of the ACM, 70(3), 1-53. DOI: 10.1145/3583680
  • (2023) Average-Case Complexity of Tensor Decomposition for Low-Degree Polynomials. Proceedings of the 55th Annual ACM Symposium on Theory of Computing, 1685-1698. DOI: 10.1145/3564246.3585232
  • (2023) Learning Polynomial Transformations via Generalized Tensor Decompositions. Proceedings of the 55th Annual ACM Symposium on Theory of Computing, 1671-1684. DOI: 10.1145/3564246.3585209
  • (2023) Computing linear sections of varieties: quantum entanglement, tensor decompositions and beyond. 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), 1316-1336. DOI: 10.1109/FOCS57990.2023.00079
  • (2023) Absolute reconstruction for sums of powers of linear forms: degree 3 and beyond. Computational Complexity, 32(2). DOI: 10.1007/s00037-023-00239-8
  • (2023) Complete Decomposition of Symmetric Tensors in Linear Time and Polylogarithmic Precision. Algorithms and Complexity, 308-322. DOI: 10.1007/978-3-031-30448-4_22
  • (2022) Tractable optimality in episodic latent MABs. Proceedings of the 36th International Conference on Neural Information Processing Systems, 23634-23645. DOI: 10.5555/3600270.3601987
