Abstract
This paper investigates saddle point conditions for augmented Lagrangian functions for semidefinite optimization problems. By means of image space analysis, the existence of a saddle point is shown to be equivalent to a regular weak nonlinear separation of two suitable subsets in the image space (IS) associated with the given problem. In particular, three classes of augmented Lagrangians based on smooth spectral penalty functions can be derived, as special cases, from a nonlinear separation scheme in the IS. Without requiring strict complementarity, it is proved that, under strong second-order sufficiency conditions, all these augmented Lagrangian functions admit a local saddle point, and their Hessians become positive definite in a neighborhood of a local optimal point of the original problem. The existence of global saddle points is then obtained under additional assumptions that do not require the compactness of the feasible set.
References
Giannessi, F.: Constrained Optimization and Image Space Analysis. Springer, Berlin (2005)
Giannessi, F.: Theorems of the alternative and optimality conditions. J. Optim. Theory Appl. 42, 331–365 (1984)
Dien, P.H., Mastroeni, G., Pappalardo, M., Quang, P.H.: Regularity conditions for constrained extremum problems via image space. J. Optim. Theory Appl. 80, 19–37 (1994)
Pappalardo, M.: Image space approach to penalty methods. J. Optim. Theory Appl. 64, 141–152 (1990)
Rubinov, A.M., Uderzo, A.: On global optimality conditions via separation functions. J. Optim. Theory Appl. 109, 345–370 (2001)
Luo, H.Z., Mastroeni, G., Wu, H.X.: Separation approach for augmented Lagrangians in constrained nonconvex optimization. J. Optim. Theory Appl. 144, 275–290 (2010)
Luo, H.Z., Wu, H.X., Liu, J.Z.: Some results on augmented Lagrangians in constrained global optimization via image space analysis. J. Optim. Theory Appl. 159, 360–385 (2013)
Li, S.J., Xu, Y.D., Zhu, S.K.: Nonlinear separation approach to constrained extremum problems. J. Optim. Theory Appl. 154, 842–856 (2012)
Zhu, S.K., Li, S.J.: Unified duality theory for constrained extremum problems. Part I: Image space analysis. J. Optim. Theory Appl. 161(3), 738–762 (2014)
Chinaie, M., Zafarani, J.: Image space analysis and scalarization of multivalued optimization. J. Optim. Theory Appl. 142, 451–467 (2009)
Giannessi, F., Mastroeni, G., Yao, J.C.: On maximum and variational principles via image space analysis. Positivity 16, 405–427 (2012)
Mastroeni, G.: Nonlinear separation in the image space with applications to penalty methods. Appl. Anal. 91, 1901–1914 (2012)
Shapiro, A., Sun, J.: Some properties of the augmented Lagrangian in cone constrained optimization. Math. Oper. Res. 29, 479–491 (2004)
Todd, M.J.: Semidefinite optimization. Acta Numer. 10, 515–560 (2001)
Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)
Ye, Y.: Interior Point Algorithms: Theory and Analysis. Wiley, New York (1997)
Ben-Tal, A., Jarre, F., Kočvara, M., Nemirovski, A., Zowe, J.: Optimal design of trusses under a nonconvex global buckling constraint. Optim. Eng. 1, 189–213 (2000)
Fares, B., Apkarian, P., Noll, D.: An augmented Lagrangian method for a class of LMI-constrained problems in robust control theory. Int. J. Control 74, 348–360 (2001)
Fares, B., Noll, D., Apkarian, P.: Robust control via sequential semidefinite programming. SIAM J. Control Optim. 40, 1791–1820 (2002)
Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)
Chan, Z.X., Sun, D.: Constraint nondegeneracy, strong regularity and nonsingularity in semidefinite programming. SIAM J. Optim. 19, 370–396 (2008)
Forsgren, A.: Optimality conditions for nonconvex semidefinite programming. Math. Program. 88, 105–128 (2000)
Qi, H.D.: Local duality of nonlinear semidefinite programming. Math. Oper. Res. 34, 124–141 (2009)
Shapiro, A.: First and second order analysis of nonlinear semidefinite programs. Math. Program. Ser. B 77, 301–320 (1997)
Sun, D.: The strong second-order sufficient condition and constraint nondegeneracy in nonlinear semidefinite programming and their implications. Math. Oper. Res. 31, 761–776 (2006)
Correa, R., Hector Ramirez, C.: A global algorithm for solving nonlinear semidefinite programming. SIAM J. Optim. 15, 303–318 (2004)
Yamashita, H., Yabe, H., Harada, K.: A primal-dual interior point method for nonlinear semidefinite programming. Math. Program. 135, 89–121 (2012)
Sun, J., Zhang, L.W., Wu, Y.: Properties of the augmented Lagrangian in nonlinear semidefinite optimization. J. Optim. Theory Appl. 129, 437–456 (2006)
Sun, D., Sun, J., Zhang, L.W.: The rate of convergence of the augmented Lagrangian method for nonlinear semidefinite programming. Math. Program. 114, 349–391 (2008)
Luo, H.Z., Wu, H.X., Chen, G.T.: On the convergence of augmented Lagrangian methods for nonlinear semidefinite programming. J. Glob. Optim. 54, 599–618 (2012)
Wu, H.X., Luo, H.Z., Ding, X.D., Chen, G.T.: Global convergence of modified augmented Lagrangian methods for nonlinear semidefinite programming. Comput. Optim. Appl. 56, 531–558 (2013)
Wu, H.X., Luo, H.Z., Yang, J.F.: Nonlinear separation approach for the augmented Lagrangian in nonlinear semidefinite programming. J. Glob. Optim. 59, 695–727 (2014)
Noll, D.: Local convergence of an augmented Lagrangian method for matrix inequality constrained programming. Optim. Methods Softw. 22, 777–802 (2007)
Stingl, M.: On the solution of nonlinear semidefinite programs by augmented Lagrangian methods. PhD thesis, Institute of Applied Mathematics, Universität Erlangen-Nürnberg (2005)
Li, D., Sun, X.L.: Existence of a saddle point in nonconvex constrained optimization. J. Glob. Optim. 21, 39–50 (2001)
Sun, X.L., Li, D., McKinnon, K.I.M.: On saddle points of augmented Lagrangians for constrained nonconvex optimization. SIAM J. Optim. 15, 1128–1146 (2005)
Wu, H.X., Luo, H.Z.: A note on the existence of saddle points of \(p\)-th power Lagrangian for constrained nonconvex optimization. Optimization 61, 1331–1345 (2012)
Wu, H.X., Luo, H.Z.: Saddle points of general augmented Lagrangians for constrained nonconvex optimization. J. Glob. Optim. 53, 683–697 (2012)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Birgin, E.G., Castillo, R.A., Martínez, J.M.: Numerical comparison of augmented Lagrangian algorithms for nonconvex problems. Comput. Optim. Appl. 31, 31–55 (2005)
Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)
Zarantonello, E.H.: Projections on convex sets in Hilbert space and spectral theory I and II. In: Zarantonello, E.H. (ed.) Contributions to Nonlinear Functional Analysis, pp. 237–424. Academic, New York (1971)
Theobald, C.M.: An inequality for the trace of the product of two symmetric matrices. Math. Proc. Camb. Philos. Soc. 77, 265–267 (1975)
Acknowledgments
The authors would like to thank the two anonymous referees for the detailed comments and valuable suggestions, which have improved the final presentation of the paper. This work was supported by the National Natural Science Foundation of China under Grants 11371324 and 11071219, and the Zhejiang Provincial Natural Science Foundation of China under Grants LY13A010012 and LY13A010017.
Appendix: Proof of Proposition 4.2
We first recall a well-known result regarding the differentiability properties of the composite matrix function. We use the following definition from [41].
Definition 6.1
Let \(G: {{\mathrm{I}\!\mathrm{R}}}^n\rightarrow {{\mathrm{I}\!\mathrm{R}}}^{m\times m}\). Let \(\lambda _1(x),\ldots ,\lambda _{\mu (x)}(x)\) be the \(\mu (x)\) distinct eigenvalues of \(G(x)\), in increasing order. Let \(G(x)=U(x)\varLambda (x)U(x)^T\) be the spectral decomposition of \(G(x)\), with \(\varLambda (x)=\mathrm{diag}\left( \lambda _1(x), \ldots , \lambda _1(x),\ldots ,\lambda _{\mu (x)}(x),\ldots ,\lambda _{\mu (x)}(x)\right) \). Then, the Frobenius covariance matrices are defined by
\[P_i(x):=U(x)\,\mathrm{diag}\left( 0,\ldots ,0,1,\ldots ,1,0,\ldots ,0\right) U(x)^T,\quad i=1,\ldots ,\mu (x), \tag{90}\]
where the non-zeros in the diagonal matrix occur exactly in the positions of \(\lambda _i(x)\) in \(\varLambda (x)\).
Note that \(P_i(x)P_j(x)=0\) if \(i\ne j\), \(P_i(x)^2=P_i(x) \), \(i,j=1,\ldots ,\mu (x)\), and \(\sum _{i=1}^{\mu (x)}P_i(x)=I_m\) (see [41], p. 403).
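As an illustrative sketch (not part of the original paper), the three properties above can be checked numerically on a small example. The helper names `matmul`, `frobenius_covariants`, and `max_abs_diff`, the Householder matrix `U`, and the eigenvalues (1, 1, 3) are assumptions chosen only for illustration. Each \(P_i\) is built as \(U\,\mathrm{diag}(\cdot )\,U^T\) with ones in the positions of the \(i\)-th distinct eigenvalue.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def frobenius_covariants(U, eigenvalues):
    """Return (lambda_i, P_i) pairs for the distinct eigenvalues of U diag(eigenvalues) U^T."""
    pairs = []
    for lam in sorted(set(eigenvalues)):
        # Ones exactly in the positions where lam occurs on the diagonal.
        indicator = [1.0 if ev == lam else 0.0 for ev in eigenvalues]
        pairs.append((lam, matmul(matmul(U, diag(indicator)), transpose(U))))
    return pairs

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Hypothetical data: U is the Householder reflection I - (2/3) * ones, which is orthogonal.
m = 3
U = [[(1.0 if i == j else 0.0) - 2.0 / 3.0 for j in range(m)] for i in range(m)]
eigenvalues = [1.0, 1.0, 3.0]  # mu = 2 distinct eigenvalues, in increasing order
(lam1, P1), (lam2, P2) = frobenius_covariants(U, eigenvalues)

identity = diag([1.0] * m)
zero = diag([0.0] * m)
G = matmul(matmul(U, diag(eigenvalues)), transpose(U))

# P_1 P_2 = 0
orthogonality_error = max_abs_diff(matmul(P1, P2), zero)
# P_i^2 = P_i
idempotency_error = max(max_abs_diff(matmul(P, P), P) for P in (P1, P2))
# P_1 + P_2 = I_m
sum_error = max_abs_diff(
    [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(P1, P2)], identity)
# G = lambda_1 P_1 + lambda_2 P_2
spectral_error = max_abs_diff(
    [[lam1 * a + lam2 * b for a, b in zip(r1, r2)] for r1, r2 in zip(P1, P2)], G)
print(orthogonality_error, idempotency_error, sum_error, spectral_error)
```

All four printed errors should be at the level of floating-point rounding. The last one additionally checks the spectral identity \(G(x)=\sum _i\lambda _i(x)P_i(x)\), which follows directly from the definition.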
Definition 6.2
Let \(t_1,\ldots ,t_m\) be real values and let \(f_0 : {{\mathrm{I}\!\mathrm{R}}}\rightarrow {{\mathrm{I}\!\mathrm{R}}}\) be twice continuously differentiable. Then, we define
The following result is from [41] (Theorem 6.6.30), which will be used to prove Proposition 4.2.
Lemma 6.1
Let \(f_0 : {{\mathrm{I}\!\mathrm{R}}}\rightarrow {{\mathrm{I}\!\mathrm{R}}}\) and \(G: {{\mathrm{I}\!\mathrm{R}}}^n\rightarrow {\mathcal {S}}^m\) be twice continuously differentiable. Define \(F : {\mathcal {S}}^m\rightarrow {\mathcal {S}}^m\) by \(F(Z):= U f_1(D)U^T\), where \(Z= UD U^T\) is the spectral decomposition of \(Z\), \(D=\mathrm{diag}\left( \lambda _1, \ldots , \lambda _m\right) \) and \(f_1(D) = \mathrm{diag}\left( f_0(\lambda _1),\ldots , f_0(\lambda _m)\right) \), where \(\lambda _i\), \(i=1,\ldots ,m\), are the eigenvalues of \(Z\) listed in decreasing order. Let \(\lambda _1(x),\ldots ,\lambda _{\mu (x)}(x)\) be the \(\mu (x)\) distinct eigenvalues of the matrix \(G(x)\), in increasing order. Then, \(F(G(x))\) is twice continuously differentiable with
where \(P_k(x)\), \(k=1,\ldots ,\mu (x)\) are Frobenius covariance matrices of \(G(x)\), and the matrix \(M_{klq}\) is given by \(M_{klq}:=P_k(x) G_i^{\prime }(x) P_l(x) G_j^{\prime }(x)P_q(x)\).
Proof of Proposition 4.2. We first show conclusion (i). Let \(\lambda _i^*\ge 0\), \(i=1,\ldots ,m\), be the eigenvalues of \(G(x^*)\), which has rank \(r\), listed in decreasing order, so that \(\lambda _i^*>0\) for \(i=1,\ldots ,r\) and \(\lambda _i^*=0\) for \(i=r+1,\ldots ,m\). Since \(\varLambda ^*\bullet G(x^*)=0\) is equivalent to \( \lambda _i(\varLambda ^*) \lambda _i^*=0\) for \(i=1,\ldots ,m\), we obtain \(\lambda _i(\varLambda ^*)=0\) for \(i=1,\ldots ,r\). Let us define \(\psi _c(t):=c^{-1}\psi (ct)\) for any \(t\in {{\mathrm{I}\!\mathrm{R}}}\). By the property \(\psi (0) = 0\), we have \( \psi _c(\lambda _i^*)=0\) for \(i=r+1,\ldots ,m\). Thus, \(\lambda _i(\varLambda ^*) \psi _c(\lambda _i^*)=0\) for \(i=1,\ldots ,m\), which, by the definition of \(\varPsi _c\), means that \(\varLambda ^*\bullet \varPsi _c(G(x^*))=0\). Similarly, \(\varLambda ^*\bullet \varPhi _c(G(x^*))=0\). Observe from condition (D2) that \(G(x^*)\succeq 0\) implies \(\mathrm{Tr}\left( \varXi _c(G(x^*))\right) =0\). Hence, \(L_i(x^*,\varLambda ^*,\mu ^*,c)= f(x^*)\) \((i=1,2,3)\) for any \(c > 0\).
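The eigenvalue-wise argument for conclusion (i) can be illustrated with a minimal sketch. It assumes that \(\varLambda ^*\) and \(G(x^*)\) are diagonal in a common eigenbasis, so that inner products reduce to sums over paired eigenvalues. The sample function \(\psi (t)=1-e^{-t}\) is an assumption chosen only because it satisfies \(\psi (0)=0\), \(\psi ^\prime (0)=1\) and \(\psi ^{\prime \prime }(0)<0\); it is not claimed to satisfy all of the paper's conditions (B1)–(B3). The names `psi`, `psi_c`, and `inner_with_penalty`, and the eigenvalue data, are hypothetical.

```python
import math

def psi(t):
    """Sample penalty function (an assumption): psi(0) = 0, psi'(0) = 1, psi''(0) = -1."""
    return 1.0 - math.exp(-t)

def psi_c(t, c):
    """Scaled function psi_c(t) = psi(c t) / c from the proof."""
    return psi(c * t) / c

def inner_with_penalty(multiplier_eigs, constraint_eigs, c):
    """Lambda* . Psi_c(G(x*)) when both matrices are diagonal in a common eigenbasis."""
    return sum(l * psi_c(g, c) for l, g in zip(multiplier_eigs, constraint_eigs))

# Hypothetical data with m = 4 and rank r = 2.
constraint_eigs = [2.0, 0.5, 0.0, 0.0]  # eigenvalues of G(x*), decreasing
multiplier_eigs = [0.0, 0.0, 0.7, 0.0]  # eigenvalues of Lambda*, zero where G's are positive

# Lambda* . G(x*) = 0 (complementarity)
complementarity = sum(l * g for l, g in zip(multiplier_eigs, constraint_eigs))
# Lambda* . Psi_c(G(x*)) for several penalty parameters c
values = [inner_with_penalty(multiplier_eigs, constraint_eigs, c)
          for c in (0.1, 1.0, 10.0, 1000.0)]
print(complementarity, values)
```

In this made-up data the last eigenvalue pair is (0, 0), so strict complementarity fails, yet every term of the sum still vanishes for each value of \(c\).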
Next, we prove conclusion (ii) for \(L_1\). The proof for \(L_2\) can be constructed by using similar arguments and the properties (C1)–(C3).
Let \(\lambda _1^*,\ldots ,\lambda ^*_{\mu (x^*)} \) be the \(\mu (x^*)\) distinct eigenvalues of \(G(x^*)\) and let \(P_i(x^*)\) be defined by (90) for \(i=1,\ldots ,\mu (x^*)\). Note that \(P_i(x^*)P_j(x^*)=0\) for all \(i\ne j\) and \(P_i(x^*)^2=P_i(x^*) \) for all \(i,j=1,\ldots ,\mu (x^*)\). Let \(P_0:=EE^T\) with \(E=[e_{r+1},\ldots ,e_m]\). We see that \(P_0=P_{\mu (x^*)}(x^*)\). Since \(\varLambda ^*\bullet G(x^*)=0\) and \(\varLambda ^*,G(x^*) \in {\mathcal {S}}^m_+ \) imply \(\varLambda ^* G(x^*)=G(x^*)\varLambda ^*\), by Theorem 1.3.12 of [41], \(\varLambda ^*\) and \(G(x^*)\) are simultaneously diagonalizable. So, \(\varLambda ^*={\bar{E}}D_{\varLambda ^*}{\bar{E}}^T\), where \(D_{\varLambda ^*}:=\mathrm{diag}\left( 0,\ldots ,0,\lambda _{r+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*)\right) \), and \({\bar{E}}:=[e_1,\ldots ,e_m]\) with \({\bar{E}}^T{\bar{E}}=I_m\). Let \(D_0:=\mathrm{diag}\left( \underbrace{0,\ldots ,0}_{r},\underbrace{1,\ldots ,1}_{m-r}\right) \). Then, \(P_0={\bar{E}}D_0{\bar{E}}^T\) and
By Lemma 6.1 and (91), we can obtain
from which, together with \(\psi ^\prime (0)=1\) and \(h(x^*)=0\), we immediately conclude that
Moreover, we have
where
By Definition 6.2, via a straightforward computation, we obtain
Note that \(\psi ^\prime (0)=1\). It then follows from (92) and the definition of \(L_1\) that
which is just (38), where
By condition (B1), \(\psi (0)=0\) and \(\psi ^\prime (0)=1\), we know that \(\psi (t)< t\) for any \(t\ne 0\), and hence \(\frac{c^{-1}\psi (c\lambda _k^* )}{(\lambda ^*_k)^2}-\frac{1}{\lambda _k^* }<0\) for \(k=1,\ldots ,\mu (x^*)-1\). Together with \(P_k(x^*)\succeq 0\) for all \(k\) and \(\varLambda ^*\succeq 0\), we then infer from (94) that \(H_c(x^*,\varLambda ^*)\) is positive semidefinite. By the condition (B3) for \(\psi \), we see that \(\lim _{c\rightarrow \infty }\frac{\psi (c\tau )}{c}=0\) for any \(\tau >0\). Note also that \(\lambda _k^*>0\) for \(k=1,\ldots ,r\) and that, by the definition of \(P_k(x)\),
We then deduce from (94) that
proving (39). Note that \(s=m-\mathrm{rank}(\varLambda ^*)\ge r\). Then, \(\lambda _i(\varLambda ^*)=0\) for \(i=1,\ldots ,s\) and
Note that \(\varLambda ^*={\bar{E}}D_{\varLambda ^*}{\bar{E}}^T\) with \({\bar{E}}=[e_1,\ldots ,e_m]\) and \(P_{\mu (x^*)}=P_0=EE^T\) with \(E=[e_{r+1},\ldots ,e_m]\). It then follows from (95) that
where \(Q^*_s: =\mathrm{diag}\left( \lambda _{s+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*),\ldots , \lambda _{s+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*)\right) \in {{\mathrm{I}\!\mathrm{R}}}^{{\bar{m}}\times {\bar{m}}}\) with \({\bar{m}}:=(m-s)(m-r)\), and the \(i\)-th row of matrix \(B\in {{\mathrm{I}\!\mathrm{R}}}^{n\times {\bar{m}}}\) is defined by
Note that, since \(\mathrm{rank}(\varLambda ^*)=m-s\) implies \(\lambda _i(\varLambda ^*)>0\) for \(i=s+1,\ldots ,m\), \(Q^*_s\) is positive definite and hence, by \(\psi ^{\prime \prime }(0)<0\), we see from (97) that \(M(x^*,\varLambda ^*)\) is positive semidefinite. Now assume that for some \({\bar{d}}\ne 0\in {{\mathrm{I}\!\mathrm{R}}}^n\), it holds
Since \(Q^*_s\) is positive definite, it then follows from (97) that \(B^T{\bar{d}}=\sum _{i=1}^n{\bar{d}}_ib_i=0\). So,
This proves that (40) holds.
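The last step rests on a general fact about weighted Gram-type quadratic forms: if \(Q=\mathrm{diag}(q_1,\ldots ,q_k)\) is positive definite, then \(d^TBQB^Td=\sum _j q_j\,(B^Td)_j^2\) is nonnegative and vanishes exactly when \(B^Td=0\). The sketch below checks this on made-up data. The matrix `B`, the weights `q`, and the function name `weighted_gram_form` are illustrative assumptions and do not reproduce the paper's (97).

```python
def weighted_gram_form(B, q, d):
    """Return (d^T B diag(q) B^T d, B^T d) for an n-by-k matrix B given as a list of rows."""
    Bt_d = [sum(B[i][k] * d[i] for i in range(len(d))) for k in range(len(q))]
    value = sum(qk * v * v for qk, v in zip(q, Bt_d))
    return value, Bt_d

# Hypothetical data: n = 3 rows, k = 2 columns, positive weights on the diagonal of Q.
B = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
q = [0.5, 2.0]

# A direction with B^T d = 0: the quadratic form vanishes.
value_null, Bt_d_null = weighted_gram_form(B, q, [1.0, 1.0, -1.0])
# A direction with B^T d != 0: the quadratic form is strictly positive.
value_other, Bt_d_other = weighted_gram_form(B, q, [1.0, 0.0, 0.0])
print(value_null, Bt_d_null)
print(value_other, Bt_d_other)
```

Because every weight is strictly positive, a zero value of the form forces each component of \(B^Td\) to be zero, which is the implication used in the argument above.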
Finally, we prove conclusion (ii) for \(L_3\). Note that \(\mathrm{Tr}\left( \varXi _c(G(x))\right) =I_m\bullet \varXi _c(G(x))\). By Lemma 6.1, similar to the computations of (92) and (92) above, we have
Moreover, we have
where
We note from conditions (D1) and (D2) that \(\xi (t) = 0\) and \(\xi ^\prime (t) = 0\) for \(t \ge 0\) and \(\xi ^{\prime \prime }(0)=0\). Note also that \(\lambda ^*_k>0\) for all \(k=1,\ldots ,r\). Therefore, from (98) and (99), we can obtain
By the definition of \(L_3\) and \(\psi ^\prime (0)=1\), it follows from (92), (92), and (100) that
where \( H_c(x^*,\varLambda ^*)\) and \(M(x^*,\varLambda ^*)\) are given in (94) and (95), and satisfy (39) and (40). The proof of the proposition is completed. \(\square \)
Cite this article
Luo, H., Wu, H. & Liu, J. On Saddle Points in Semidefinite Optimization via Separation Scheme. J Optim Theory Appl 165, 113–150 (2015). https://doi.org/10.1007/s10957-014-0634-3
DOI: https://doi.org/10.1007/s10957-014-0634-3
Keywords
- Semidefinite optimization
- Regular weak nonlinear separation
- Augmented Lagrangian function
- Saddle point
- Strong second-order sufficiency condition