Abstract
The article begins with a review of the main approaches to interpreting the results of principal component analysis (PCA) over the last 50–60 years. The simple structure approach is compared to the modern approach of sparse PCA, in which interpretable solutions are obtained directly. It is shown that their goals are identical, but they differ in how they are realized. Next, the most popular and influential methods for sparse PCA are briefly reviewed. In the remaining part of the paper, a new approach to defining sparse PCA is introduced. Several alternative definitions are considered and illustrated on a well-known data set. Finally, it is demonstrated how one of these possible versions of sparse PCA can be used as a sparse alternative to the classical rotation methods.
Acknowledgments
I thank the Editor, the Associate Editor, and the anonymous reviewers for their careful work and for the many helpful comments and suggestions.
Cite this article
Trendafilov, N.T. From simple structure to sparse components: a review. Comput Stat 29, 431–454 (2014). https://doi.org/10.1007/s00180-013-0434-5
DOI: https://doi.org/10.1007/s00180-013-0434-5