Abstract
We introduce a new constraint system for sparse variable selection in statistical learning. Such a system arises when logical conditions on the sparsity of certain unknown model parameters must be incorporated into the selection process. Formally, extending a cardinality constraint, an affine sparsity constraint (ASC) is defined by a linear inequality in two sets of variables: one set of continuous variables and another consisting of the nonzero patterns of those variables. This paper studies an ASC system consisting of finitely many affine sparsity constraints. We investigate a number of fundamental structural properties of the solution set of such a non-standard system of inequalities, including its closedness and the description of its closure, continuous approximations and their set convergence, and characterizations of its tangent cones for use in optimization. Based on these structural properties, we investigate the convergence of B(ouligand) stationary solutions when the ASC is approximated by surrogates of the step \(\ell _0\)-function commonly employed in sparsity representation. Our study lays a solid mathematical foundation for solving optimization problems involving affine sparsity constraints through their continuous approximations.
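As a minimal illustration of the constraint class described above, the sketch below checks feasibility of an ASC system of the hypothetical form \(A\,|x|_0 \ge b\), where \(|x|_0\) denotes the componentwise nonzero pattern of \(x\). The notation, the function names, and the hierarchical example are our own assumptions for exposition, not taken from the paper.

```python
import numpy as np

def l0_pattern(x, tol=0.0):
    # Nonzero pattern of x: 1 where |x_j| > tol, else 0 (the step l0-function).
    return (np.abs(x) > tol).astype(float)

def satisfies_asc(A, b, x, tol=0.0):
    # Check the (hypothetical) ASC system A @ |x|_0 >= b, where |x|_0 is
    # the componentwise nonzero indicator of x.
    return bool(np.all(A @ l0_pattern(x, tol) >= b))

# Hierarchical-selection example: an interaction term x3 may be selected
# (nonzero) only if the main effect x1 is selected, encoded as
# |x1|_0 - |x3|_0 >= 0.
A = np.array([[1.0, 0.0, -1.0]])
b = np.array([0.0])

print(satisfies_asc(A, b, np.array([2.0, 0.0, 1.5])))  # True: x1 active
print(satisfies_asc(A, b, np.array([0.0, 0.0, 1.5])))  # False: x3 without x1
```

The discontinuity of `l0_pattern` at zero is precisely why the paper studies the closure of the solution set and continuous surrogates of the step function.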
The research of the second and third authors was partially supported by the U.S. National Science Foundation Grant IIS-1632971.
Dong, H., Ahn, M. & Pang, JS. Structural properties of affine sparsity constraints. Math. Program. 176, 95–135 (2019). https://doi.org/10.1007/s10107-018-1283-3