
Structural properties of affine sparsity constraints

  • Full Length Paper
  • Series B
Mathematical Programming

Abstract

We introduce a new constraint system for sparse variable selection in statistical learning. Such a system arises when there are logical conditions on the sparsity of certain unknown model parameters that need to be incorporated into their selection process. Formally, extending a cardinality constraint, an affine sparsity constraint (ASC) is defined by a linear inequality with two sets of variables: one set of continuous variables and the other set represented by their nonzero patterns. This paper aims to study an ASC system consisting of finitely many affine sparsity constraints. We investigate a number of fundamental structural properties of the solution set of such a non-standard system of inequalities, including its closedness and the description of its closure, continuous approximations and their set convergence, and characterizations of its tangent cones for use in optimization. Based on the obtained structural properties of an ASC system, we investigate the convergence of B(ouligand) stationary solutions when the ASC is approximated by surrogates of the step \(\ell _0\)-function commonly employed in sparsity representation. Our study lays a solid mathematical foundation for solving optimization problems involving these affine sparsity constraints through their continuous approximations.
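To make the abstract's central objects concrete, the following is a minimal sketch in Python. It assumes an ASC written as a linear inequality A s(x) <= b on the support pattern s(x) of x (the componentwise step ell_0-function), and uses the capped-ell_1 function as one common continuous surrogate of the step function; the exact inequality form, constraint data, and the hierarchical example below are illustrative assumptions, not the paper's definitions.

```python
# Sketch (assumptions noted above): an affine sparsity constraint (ASC) is
# modeled here as  A @ supp(x) <= b,  where supp(x)_i = 1 if x_i != 0 else 0.

def support(x, tol=0.0):
    """Componentwise step (ell_0) function: 1 where |x_i| > tol, else 0."""
    return [1 if abs(xi) > tol else 0 for xi in x]

def asc_feasible(A, b, x, tol=0.0):
    """True if A @ supp(x) <= b holds row by row."""
    s = support(x, tol)
    return all(sum(aij * sj for aij, sj in zip(row, s)) <= bi
               for row, bi in zip(A, b))

def capped_l1(t, eps):
    """One common continuous surrogate of the step function: min(|t|/eps, 1).
    As eps -> 0 it converges pointwise to the ell_0 step function."""
    return min(abs(t) / eps, 1.0)

# Hypothetical hierarchical selection rule: an interaction term x3 may be
# nonzero only if both main effects x1 and x2 are nonzero, encoded as
#   2*supp(x)_3 - supp(x)_1 - supp(x)_2 <= 0.
A = [[-1, -1, 2]]
b = [0]
print(asc_feasible(A, b, [1.0, 0.5, 0.2]))   # both main effects on -> True
print(asc_feasible(A, b, [0.0, 0.5, 0.2]))   # x1 absent but x3 on -> False
```

Replacing supp(x)_i by capped_l1(x_i, eps) in the inequality yields a family of continuous approximations of the ASC, which is the kind of surrogate system whose set convergence and stationary-point behavior the paper analyzes.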



Author information

Corresponding author: Hongbo Dong.

Additional information

The research of the second and third authors was partially supported by the U.S. National Science Foundation Grant IIS-1632971.


Cite this article

Dong, H., Ahn, M. & Pang, JS. Structural properties of affine sparsity constraints. Math. Program. 176, 95–135 (2019). https://doi.org/10.1007/s10107-018-1283-3

