DOI: 10.1145/2591796.2591881

Smoothed analysis of tensor decompositions

Published: 31 May 2014

Abstract

Low rank decomposition of tensors is a powerful tool for learning generative models. The uniqueness results that hold for tensors give them a significant advantage over matrices. However, tensors pose serious algorithmic challenges; in particular, much of the matrix algebra toolkit fails to generalize to tensors. Efficient decomposition in the overcomplete case (where rank exceeds dimension) is particularly challenging. We introduce a smoothed analysis model for studying these questions and develop an efficient algorithm for tensor decomposition in the highly overcomplete case (rank polynomial in the dimension). In this setting, we show that our algorithm is robust to inverse polynomial error -- a crucial property for applications in learning since we are only allowed a polynomial number of samples. While algorithms are known for exact tensor decomposition in some overcomplete settings, our main contribution is in analyzing their stability in the framework of smoothed analysis.
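
The contrast with the undercomplete case is worth making concrete. Below is a minimal numerical sketch (ours, not the paper's algorithm) of the classical simultaneous-diagonalization method, which recovers the factors of a third-order tensor only when the rank is at most the dimension and the factor vectors are linearly independent; the function name, shapes, and parameters are illustrative. The overcomplete regime treated in the paper is precisely where this approach breaks down.

    import numpy as np

    def jennrich_decompose(T, rank, seed=0):
        # Recover the a_i from T = sum_i a_i (x) b_i (x) c_i, assuming
        # {a_i} and {b_i} are linearly independent -- so rank <= dimension,
        # the *undercomplete* case only.
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(T.shape[2])
        y = rng.standard_normal(T.shape[2])
        # Random contractions of the third mode: T_x = A diag(C^T x) B^T.
        Tx = np.einsum('ijk,k->ij', T, x)
        Ty = np.einsum('ijk,k->ij', T, y)
        # Eigenvectors of Tx pinv(Ty) = A diag(...) pinv(A) are the columns
        # a_i up to scaling; generically the eigenvalues are distinct.
        eigvals, eigvecs = np.linalg.eig(Tx @ np.linalg.pinv(Ty))
        top = np.argsort(-np.abs(eigvals))[:rank]
        return np.real_if_close(eigvecs[:, top])

When the rank exceeds the dimension, the two contracted slices are rank-deficient n x n matrices and the eigendecomposition no longer identifies the factors, which is why the overcomplete case calls for different techniques.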
Our main technical contribution is to show that tensor products of perturbed vectors are linearly independent in a robust sense (i.e., the associated matrix has singular values that are at least an inverse polynomial). This key result paves the way for applying tensor methods to learning problems in the smoothed setting. In particular, we use it to obtain results for learning multi-view models and mixtures of axis-aligned Gaussians where there are many more "components" than dimensions. The assumption here is that the model is not adversarially chosen, which we formalize by thinking of the model parameters as being perturbed. We believe this is an appealing way to analyze realistic instances of learning problems, since this framework allows us to overcome many of the usual limitations of using tensor methods.
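
This robust linear independence claim is easy to probe numerically. The following sketch (again ours, not the paper's code; n, R, and sigma are illustrative choices) starts from a worst-case instance in which all R vectors are identical, applies a Gaussian perturbation of magnitude sigma as in the smoothed model, and measures the smallest singular value of the n^2 x R matrix whose columns are the vectorized tensor products a_i (x) a_i:

    import numpy as np

    rng = np.random.default_rng(1)
    n, R, sigma = 20, 100, 0.1   # R = 5n "components", far more than n dimensions

    # Adversarial base instance: all columns identical, so before perturbation
    # the Khatri-Rao matrix built below has rank 1.
    base = np.ones((n, R)) / np.sqrt(n)
    A = base + sigma * rng.standard_normal((n, R)) / np.sqrt(n)

    # Column i is vec(a_i a_i^T), the flattened tensor product a_i (x) a_i.
    KR = np.einsum('ir,jr->ijr', A, A).reshape(n * n, R)

    s_min = np.linalg.svd(KR, compute_uv=False)[-1]
    print(f"smallest singular value: {s_min:.4f}")

Empirically, s_min is nonzero and decays only polynomially in n and 1/sigma, even though the unperturbed matrix is singular; this is the phenomenon the main theorem makes rigorous.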

Supplementary Material

MP4 File (p594-sidebyside.mp4)


Published In

STOC '14: Proceedings of the forty-sixth annual ACM symposium on Theory of computing
May 2014
984 pages
ISBN: 9781450327107
DOI: 10.1145/2591796

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 May 2014

Qualifiers

  • Research-article

Conference

STOC '14: Symposium on Theory of Computing
May 31 - June 3, 2014
New York, New York

Acceptance Rates

STOC '14 paper acceptance rate: 91 of 319 submissions, 29%.
Overall acceptance rate: 1,469 of 4,586 submissions, 32%.

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 44
  • Downloads (last 6 weeks): 2
Reflects downloads up to 16 Oct 2024.


Cited By

  • (2024) Speeding up random walk mixing by starting from a uniform vertex. Electronic Journal of Probability, 29. DOI: 10.1214/24-EJP1091
  • (2023) Computational complexity of learning neural networks. Proceedings of the 37th International Conference on Neural Information Processing Systems, 76272-76297. DOI: 10.5555/3666122.3669456
  • (2023) "Intelligent Heuristics Are the Future of Computing". ACM Transactions on Intelligent Systems and Technology, 14(6), 1-39. DOI: 10.1145/3627708
  • (2023) Robustly Learning General Mixtures of Gaussians. Journal of the ACM, 70(3), 1-53. DOI: 10.1145/3583680
  • (2023) Average-Case Complexity of Tensor Decomposition for Low-Degree Polynomials. Proceedings of the 55th Annual ACM Symposium on Theory of Computing, 1685-1698. DOI: 10.1145/3564246.3585232
  • (2023) Learning Polynomial Transformations via Generalized Tensor Decompositions. Proceedings of the 55th Annual ACM Symposium on Theory of Computing, 1671-1684. DOI: 10.1145/3564246.3585209
  • (2023) Computing linear sections of varieties: quantum entanglement, tensor decompositions and beyond. 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), 1316-1336. DOI: 10.1109/FOCS57990.2023.00079
  • (2023) Absolute reconstruction for sums of powers of linear forms: degree 3 and beyond. Computational Complexity, 32(2). DOI: 10.1007/s00037-023-00239-8
  • (2023) Complete Decomposition of Symmetric Tensors in Linear Time and Polylogarithmic Precision. Algorithms and Complexity, 308-322. DOI: 10.1007/978-3-031-30448-4_22
  • (2022) Tractable optimality in episodic latent MABs. Proceedings of the 36th International Conference on Neural Information Processing Systems, 23634-23645. DOI: 10.5555/3600270.3601987
