
The Sparse Principal Component Analysis Problem: Optimality Conditions and Algorithms

Published in: Journal of Optimization Theory and Applications

Abstract

Sparse principal component analysis addresses the problem of finding a linear combination of the variables in a given dataset, with a sparse coefficient vector, that maximizes the variability of the data. This model enhances the interpretability of the principal components and is applicable in a wide variety of fields, including genetics and finance. We suggest a necessary coordinate-wise-based optimality condition and show its superiority over the stationarity-based condition that is commonly used in the literature and that underlies many of the algorithms designed to solve the problem. We devise algorithms based on the new optimality condition and provide numerical experiments supporting our assertion that algorithms guaranteed to converge to points satisfying stronger optimality conditions perform better than algorithms that converge to points satisfying weaker ones.
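The model described above can be phrased as maximizing \(x^\top \Sigma x\) over unit vectors \(x\) with at most \(s\) nonzero entries. As a minimal illustrative sketch only (this is a truncated power iteration, a standard heuristic for sparse PCA, not the algorithm proposed in this paper), the following shows the role of the sparsity constraint; the toy covariance matrix `A`, the sparsity level `s`, and the function name are hypothetical choices for the example.

```python
import numpy as np

def truncated_power_iteration(sigma, s, iters=200, seed=0):
    """Heuristic for sparse PCA: approximately maximize x^T sigma x
    subject to ||x||_2 = 1 and ||x||_0 <= s, by alternating a power
    step with hard-thresholding to the s largest-magnitude entries."""
    rng = np.random.default_rng(seed)
    n = sigma.shape[0]
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = sigma @ x                      # power step
        idx = np.argsort(np.abs(y))[:-s]   # indices of all but the s largest
        y[idx] = 0.0                       # enforce the sparsity constraint
        x = y / np.linalg.norm(y)          # renormalize to the unit sphere
    return x

# toy covariance whose leading variability lives on the first two coordinates
A = np.diag([5.0, 4.0, 0.1, 0.1])
x = truncated_power_iteration(A, s=2)
print(np.count_nonzero(x))  # at most s = 2 nonzero coordinates
```

On this diagonal example the iterate concentrates on the two dominant coordinates, so the constrained objective approaches the largest eigenvalue, 5.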



Notes

  1. In [6] the authors suggested a sufficient optimality condition.

  2. Note that the \(l_0\)-norm is not actually a norm since it does not satisfy the absolute homogeneity property.
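The point in Note 2 admits a quick numeric check (an illustration only, with an arbitrary example vector): scaling a vector leaves its number of nonzeros unchanged, whereas absolute homogeneity would require the value to scale by the magnitude of the multiplier.

```python
import numpy as np

# l0 "norm": the number of nonzero entries of a vector
l0 = lambda v: np.count_nonzero(v)

x = np.array([3.0, 0.0, -2.0])

# Scaling by 5 leaves the support (and hence l0) unchanged ...
print(l0(5 * x))        # 2
# ... but absolute homogeneity would require |5| * l0(x)
print(abs(5) * l0(x))   # 10
```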

References

  1. Jolliffe, I.T.: Principal Component Analysis, 2nd edn. Springer, New York (2002)


  2. Misra, J., Schmitt, W., Hwang, D., Hsiao, L.L., Gullans, S., Stephanopoulos, G., Stephanopoulos, G.: Interactive exploration of microarray gene expression patterns in a reduced dimensional space. Genome Res. 12(7), 1112–1120 (2002)


  3. d’Aspremont, A.: Identifying small mean-reverting portfolios. Quant. Finance 11(3), 351–364 (2011)


  4. Moghaddam, B., Weiss, Y., Avidan, S.: Spectral bounds for sparse PCA: exact and greedy algorithms. In: Y. Weiss, B. Schölkopf, J. Platt (Eds.) Adv. Neural. Inf. Process. Syst. 18, pp. 915–922. MIT Press, Cambridge, MA (2006)

  5. Cadima, J., Jolliffe, I.T.: Loadings and correlations in the interpretation of principal components. J. Appl. Stat. 22(2), 203–214 (1995)


  6. d’Aspremont, A., Bach, F., Ghaoui, L.E.: Optimal solutions for sparse principal component analysis. J. Mach. Learn. Res. 9, 1269–1294 (2008)


  7. d’Aspremont, A., El Ghaoui, L., Jordan, M., Lanckriet, G.: A direct formulation of sparse PCA using semidefinite programming. SIAM Rev. 49(3), 434–448 (2007)

  8. Jolliffe, I.T., Trendafilov, N.T., Uddin, M.: A modified principal component technique based on the LASSO. J. Comput. Graph. Stat. 12(3), 531–547 (2003)


  9. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)


  10. Witten, D.M., Hastie, T., Tibshirani, R.: A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10, 515–534 (2009)


  11. Sigg, C.D., Buhmann, J.M.: Expectation-maximization for sparse and non-negative PCA. In: Proceedings of the 25th International Conference on Machine Learning, ICML '08, pp. 960–967. ACM, New York, NY, USA (2008)

  12. Zou, H., Hastie, T., Tibshirani, R.: Sparse principal component analysis. J. Comput. Graph. Stat. 15(2), 265–286 (2006)


  13. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67, 301–320 (2005)


  14. Shen, H., Huang, J.Z.: Sparse principal component analysis via regularized low rank matrix approximation. J. Multivar. Anal. 99(6), 1015–1034 (2008)


  15. Journée, M., Nesterov, Y., Richtárik, P., Sepulchre, R.: Generalized power method for sparse principal component analysis. J. Mach. Learn. Res. 11, 517–553 (2010)


  16. Luss, R., Teboulle, M.: Conditional gradient algorithms for rank-one matrix approximations with a sparsity constraint. SIAM Rev. 55(1), 65–98 (2013)


  17. Sriperumbudur, B.K., Torres, D.A., Lanckriet, G.R.: A majorization–minimization approach to the sparse generalized eigenvalue problem. Mach. Learn. 85(1), 3–39 (2011)


  18. Beck, A., Eldar, Y.C.: Sparsity constrained nonlinear optimization: optimality conditions and algorithms. SIAM J. Optim. 23(3), 1480–1509 (2013)


  19. Beck, A., Hallak, N.: On the minimization over sparse symmetric sets: projections, optimality conditions, and algorithms. Math. Oper. Res. 41(1), 196–223 (2016)


  20. Rockafellar, R.: Convex Analysis. Princeton Mathematical Series, Princeton University Press, Princeton (1970)


  21. Jeffers, J.N.R.: Two case studies in the application of principal component analysis. J. R. Stat. Soc. Ser. C. Appl. Stat. 16(3), 225–236 (1967)


  22. Armstrong, S.A., Staunton, J.E., Silverman, L.B., Pieters, R., den Boer, M.L., Minden, M.D., Sallan, S.E., Lander, E.S., Golub, T.R., Korsmeyer, S.J.: MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nat. Genet. 30(1), 41–47 (2002)


  23. Liu, F., White, J., Antonescu, C., Gusenleitner, D., Quackenbush, J.: Gcod—genechip oncology database. BMC Bioinform. 12(1), 46 (2011)



Acknowledgments

We would like to thank two anonymous referees for their helpful remarks, which improved the presentation of the paper. The research of Amir Beck was partially supported by the Israel Science Foundation Grant 253/12.

Author information

Corresponding author

Correspondence to Amir Beck.

Additional information

Communicated by Jérôme Bolte.


About this article


Cite this article

Beck, A., Vaisbourd, Y. The Sparse Principal Component Analysis Problem: Optimality Conditions and Algorithms. J Optim Theory Appl 170, 119–143 (2016). https://doi.org/10.1007/s10957-016-0934-x

