
Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis

Computational Optimization and Applications

Abstract

Nonconvex and nonsmooth optimization problems are frequently encountered throughout statistics, business, science, and engineering, but they are not yet widely recognized as a scalable technology. One reason for this relatively low degree of adoption is the lack of a well-developed system of theory and algorithms to support applications, in contrast to the convex counterpart. This paper aims to take one step toward disciplined nonconvex and nonsmooth optimization. In particular, we consider constrained nonconvex optimization models in block decision variables, with or without coupled affine constraints. In the absence of coupled constraints, we show that a generalized conditional gradient method converges sublinearly to an \(\epsilon \)-stationary solution, defined in the form of a variational inequality, at a rate that depends on the Hölderian continuity of the gradient of the smooth part of the objective. For the model with coupled affine constraints, we introduce corresponding \(\epsilon \)-stationarity conditions and apply two proximal-type variants of the ADMM to solve such a model, assuming that the proximal ADMM updates can be implemented for all block variables except the last, for which either a gradient step or a majorization–minimization step is taken. We show an iteration complexity bound of \(O(1/\epsilon ^2)\) to reach an \(\epsilon \)-stationary solution for both algorithms. Moreover, we show that the same iteration complexity bound for a proximal block coordinate descent (BCD) method follows immediately. Numerical results are provided to illustrate the efficacy of the proposed algorithms on tensor robust PCA and tensor sparse PCA problems.
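To make the conditional gradient part of the abstract concrete, the following is a minimal sketch of a Frank–Wolfe-type iteration for minimizing a smooth, possibly nonconvex function over a compact convex set, using the Frank–Wolfe gap as the stationarity surrogate. The \(\ell _1\)-ball feasible set, the indefinite quadratic test instance, and the classical \(2/(k+2)\) step size are illustrative assumptions, not the paper's setup; the paper's generalized method additionally handles a nonsmooth term and ties the rate to the Hölder exponent of the gradient.

```python
import numpy as np

# Sketch of a conditional gradient (Frank-Wolfe type) method for
#   min_{x in X} f(x),  f smooth and possibly nonconvex, X compact convex.
# Here X is taken to be an l1 ball purely for illustration.

def l1_ball_lmo(grad, radius=1.0):
    """Linear minimization oracle: argmin_{||s||_1 <= radius} <grad, s>.
    Achieved at a signed vertex of the l1 ball."""
    s = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    s[i] = -radius * np.sign(grad[i])
    return s

def conditional_gradient(grad_f, x0, radius=1.0, iters=200):
    x = x0.copy()
    gap = np.inf
    for k in range(iters):
        g = grad_f(x)
        s = l1_ball_lmo(g, radius)   # direction-finding subproblem
        d = s - x
        gap = -g @ d                 # Frank-Wolfe gap: <g, x - s> >= 0,
                                     # vanishes at stationary points
        alpha = 2.0 / (k + 2)        # classical diminishing step size
        x = x + alpha * d
    return x, gap

# Illustrative nonconvex instance: f(x) = 0.5 x^T A x with indefinite A.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A = 0.5 * (A + A.T)
x, gap = conditional_gradient(lambda x: A @ x, x0=np.zeros(5))
print("final Frank-Wolfe gap:", gap)
```

The gap \(\langle \nabla f(x_k), x_k - s_k\rangle \) is a standard stationarity measure in the nonconvex Frank–Wolfe literature; driving it below \(\epsilon \) plays the role of the \(\epsilon \)-stationarity criterion whose precise variational-inequality form is given in the paper.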



Acknowledgements

We would like to thank Professor Renato D. C. Monteiro and two anonymous referees for their insightful comments, which helped improve this paper significantly.

Author information

Corresponding author

Correspondence to Bo Jiang.

Additional information

Bo Jiang: Research of this author was supported in part by NSFC Grants 11771269 and 11831002, and by the Program for Innovative Research Team of Shanghai University of Finance and Economics. Shiqian Ma: Research of this author was supported in part by a startup package from the Department of Mathematics at UC Davis. Shuzhong Zhang: Research of this author was supported in part by the National Science Foundation (Grant CMMI-1462408) and in part by the Shenzhen Fundamental Research Fund under Grant No. KQTD2015033114415450.


About this article


Cite this article

Jiang, B., Lin, T., Ma, S. et al. Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis. Comput Optim Appl 72, 115–157 (2019). https://doi.org/10.1007/s10589-018-0034-y
