On Saddle Points in Semidefinite Optimization via Separation Scheme

Luo, Hezhi; Wu, Huixian; Liu, Jianzhen

doi:10.1007/s10957-014-0634-3

Published: 12 August 2014

Volume 165, pages 113–150 (2015)

Journal of Optimization Theory and Applications

Hezhi Luo¹,
Huixian Wu² &
Jianzhen Liu²

Abstract

This paper investigates saddle point conditions for augmented Lagrangian functions in semidefinite optimization. By means of image space analysis, the existence of a saddle point is shown to be equivalent to a regular weak nonlinear separation of two suitable subsets of the image space (IS) associated with the given problem. In particular, three classes of augmented Lagrangians based on smooth spectral penalty functions can be derived, as special cases, from a nonlinear separation scheme in the IS. Without requiring strict complementarity, it is proved that, under strong second-order sufficiency conditions, all these augmented Lagrangian functions admit a local saddle point, and their Hessians become positive definite in a neighborhood of a local optimal point of the original problem. The existence of global saddle points is then obtained under additional assumptions that do not require compactness of the feasible set.

References

[1] Giannessi, F.: Constrained Optimization and Image Space Analysis. Springer, Berlin (2005)
[2] Giannessi, F.: Theorems of the alternative and optimality conditions. J. Optim. Theory Appl. 42, 331–365 (1984)
[3] Dien, P.H., Mastroeni, G., Pappalardo, M., Quang, P.H.: Regularity conditions for constrained extremum problems via image space. J. Optim. Theory Appl. 80, 19–37 (1994)
[4] Pappalardo, M.: Image space approach to penalty methods. J. Optim. Theory Appl. 64, 141–152 (1990)
[5] Rubinov, A.M., Uderzo, A.: On global optimality conditions via separation functions. J. Optim. Theory Appl. 109, 345–370 (2001)
[6] Luo, H.Z., Mastroeni, G., Wu, H.X.: Separation approach for augmented Lagrangians in constrained nonconvex optimization. J. Optim. Theory Appl. 144, 275–290 (2010)
[7] Luo, H.Z., Wu, H.X., Liu, J.Z.: Some results on augmented Lagrangians in constrained global optimization via image space analysis. J. Optim. Theory Appl. 159, 360–385 (2013)
[8] Li, S.J., Xu, Y.D., Zhu, S.K.: Nonlinear separation approach to constrained extremum problems. J. Optim. Theory Appl. 154, 842–856 (2012)
[9] Zhu, S.K., Li, S.J.: Unified duality theory for constrained extremum problems. Part I: Image space analysis. J. Optim. Theory Appl. 161(3), 738–762 (2014)
[10] Chinaie, M., Zafarani, J.: Image space analysis and scalarization of multivalued optimization. J. Optim. Theory Appl. 142, 451–467 (2009)
[11] Giannessi, F., Mastroeni, G., Yao, J.C.: On maximum and variational principles via image space analysis. Positivity 16, 405–427 (2012)
[12] Mastroeni, G.: Nonlinear separation in the image space with applications to penalty methods. Appl. Anal. 91, 1901–1914 (2012)
[13] Shapiro, A., Sun, J.: Some properties of the augmented Lagrangian in cone constrained optimization. Math. Oper. Res. 29, 479–491 (2004)
[14] Todd, M.J.: Semidefinite optimization. Acta Numer. 10, 515–560 (2001)
[15] Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)
[16] Ye, Y.: Interior Point Algorithms: Theory and Analysis. Wiley, New York (1997)
[17] Ben-Tal, A., Jarre, F., Kočvara, M., Nemirovski, A., Zowe, J.: Optimal design of trusses under a nonconvex global buckling constraint. Optim. Eng. 1, 189–213 (2000)
[18] Fares, B., Apkarian, P., Noll, D.: An augmented Lagrangian method for a class of LMI-constrained problems in robust control theory. Int. J. Control 74, 348–360 (2001)
[19] Fares, B., Noll, D., Apkarian, P.: Robust control via sequential semidefinite programming. SIAM J. Control Optim. 40, 1791–1820 (2002)
[20] Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)
[21] Chan, Z.X., Sun, D.: Constraint nondegeneracy, strong regularity and nonsingularity in semidefinite programming. SIAM J. Optim. 19, 370–396 (2008)
[22] Forsgren, A.: Optimality conditions for nonconvex semidefinite programming. Math. Program. 88, 105–128 (2000)
[23] Qi, H.D.: Local duality of nonlinear semidefinite programming. Math. Oper. Res. 34, 124–141 (2009)
[24] Shapiro, A.: First and second order analysis of nonlinear semidefinite programs. Math. Program. Ser. B 77, 301–320 (1997)
[25] Sun, D.: The strong second-order sufficient condition and constraint nondegeneracy in nonlinear semidefinite programming and their implications. Math. Oper. Res. 31, 761–776 (2006)
[26] Correa, R., Hector Ramirez, C.: A global algorithm for solving nonlinear semidefinite programming. SIAM J. Optim. 15, 303–318 (2004)
[27] Yamashita, H., Yabe, H., Harada, K.: A primal-dual interior point method for nonlinear semidefinite programming. Math. Program. 135, 89–121 (2012)
[28] Sun, J., Zhang, L.W., Wu, Y.: Properties of the augmented Lagrangian in nonlinear semidefinite optimization. J. Optim. Theory Appl. 129, 437–456 (2006)
[29] Sun, D., Sun, J., Zhang, L.W.: The rate of convergence of the augmented Lagrangian method for nonlinear semidefinite programming. Math. Program. 114, 349–391 (2008)
[30] Luo, H.Z., Wu, H.X., Chen, G.T.: On the convergence of augmented Lagrangian methods for nonlinear semidefinite programming. J. Glob. Optim. 54, 599–618 (2012)
[31] Wu, H.X., Luo, H.Z., Ding, X.D., Chen, G.T.: Global convergence of modified augmented Lagrangian methods for nonlinear semidefinite programming. Comput. Optim. Appl. 56, 531–558 (2013)
[32] Wu, H.X., Luo, H.Z., Yang, J.F.: Nonlinear separation approach for the augmented Lagrangian in nonlinear semidefinite programming. J. Glob. Optim. 59, 695–727 (2014)
[33] Noll, D.: Local convergence of an augmented Lagrangian method for matrix inequality constrained programming. Optim. Methods Softw. 22, 777–802 (2007)
[34] Stingl, M.: On the solution of nonlinear semidefinite programs by augmented Lagrangian methods. PhD thesis, Institute of Applied Mathematics, Universität Erlangen-Nürnberg (2005)
[35] Li, D., Sun, X.L.: Existence of a saddle point in nonconvex constrained optimization. J. Glob. Optim. 21, 39–50 (2001)
[36] Sun, X.L., Li, D., McKinnon, K.I.M.: On saddle points of augmented Lagrangians for constrained nonconvex optimization. SIAM J. Optim. 15, 1128–1146 (2005)
[37] Wu, H.X., Luo, H.Z.: A note on the existence of saddle points of $p$-th power Lagrangian for constrained nonconvex optimization. Optimization 61, 1331–1345 (2012)
[38] Wu, H.X., Luo, H.Z.: Saddle points of general augmented Lagrangians for constrained nonconvex optimization. J. Glob. Optim. 53, 683–697 (2012)
[39] Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
[40] Birgin, E.G., Castillo, R.A., Martínez, J.M.: Numerical comparison of augmented Lagrangian algorithms for nonconvex problems. Comput. Optim. Appl. 31, 31–55 (2005)
[41] Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)
[42] Zarantonello, E.H.: Projections on convex sets in Hilbert space and spectral theory I and II. In: Zarantonello, E.H. (ed.) Contributions to Nonlinear Functional Analysis, pp. 237–424. Academic, New York (1971)
[43] Theobald, C.M.: An inequality for the trace of the product of two symmetric matrices. Math. Proc. Camb. Philos. Soc. 77, 265–267 (1975)

Acknowledgments

The authors would like to thank the two anonymous referees for the detailed comments and valuable suggestions, which have improved the final presentation of the paper. This work was supported by the National Natural Science Foundation of China under Grants 11371324 and 11071219, and the Zhejiang Provincial Natural Science Foundation of China under Grants LY13A010012 and LY13A010017.

Author information

Authors and Affiliations

Department of Applied Mathematics, College of Science, Zhejiang University of Technology, Hangzhou, 310032, Zhejiang, People’s Republic of China
Hezhi Luo
Department of Mathematics, College of Science, Hangzhou Dianzi University, Hangzhou, 310018, Zhejiang, People’s Republic of China
Huixian Wu & Jianzhen Liu

Corresponding author

Correspondence to Hezhi Luo.

Appendix: Proof of Proposition 4.2

We first recall a well-known result regarding the differentiability properties of the composite matrix function. We use the following definition from [41].

Definition 6.1

Let $G: {{\mathrm{I}\!\mathrm{R}}}^n\rightarrow {{\mathrm{I}\!\mathrm{R}}}^{m\times m}$. Let $\lambda _1(x),\ldots ,\lambda _{\mu (x)}(x)$ be the $\mu (x)$ increasingly ordered distinct eigenvalues of $G(x)$. Let $G(x)=U(x)\varLambda (x)U(x)^T$ be the spectral decomposition of $G(x)$, with $\varLambda (x)=\mathrm{diag}\left( \lambda _1(x), \ldots , \lambda _1(x),\ldots ,\lambda _{\mu (x)}(x),\ldots ,\lambda _{\mu (x)}(x)\right) $. Then, the Frobenius covariance matrices are defined by

$$\begin{aligned} P_i(x):= U(x)\mathrm{diag} (0,\ldots ,0,1,\ldots ,1,0,\ldots ,0)U(x)^T,~i=1,\ldots ,\mu (x), \end{aligned}$$

(90)

where the non-zeros in the diagonal matrix occur exactly in the positions of $\lambda _i(x)$ in $\varLambda (x)$.

Note that $P_i(x)P_j(x)=0$ if $i\ne j$, $P_i(x)^2=P_i(x) $, $i,j=1,\ldots ,\mu (x)$, and $\sum _{i=1}^{\mu (x)}P_i(x)=I_m$ (see [41], p. 403).
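As an illustration only, the projectors in (90) and the three identities just stated can be checked numerically. The sketch below is not part of the paper: it assumes NumPy, the helper name `frobenius_covariance` and the grouping tolerance `tol` are hypothetical choices, and the $3\times 3$ test matrix is made up.

```python
import numpy as np

def frobenius_covariance(G, tol=1e-9):
    """Return (distinct eigenvalues, projectors P_i) of a symmetric matrix G.

    Eigenvalues that differ by at most `tol` are treated as equal; this
    tolerance is an assumption of the sketch, not something from the paper.
    """
    eigvals, U = np.linalg.eigh(G)           # ascending eigenvalues, orthonormal columns
    distinct, projectors = [], []
    start = 0
    for end in range(1, len(eigvals) + 1):
        # Close the current group at the end, or when the next eigenvalue
        # differs from the first eigenvalue of the group.
        if end == len(eigvals) or eigvals[end] - eigvals[start] > tol:
            block = U[:, start:end]          # eigenvectors of one distinct eigenvalue
            distinct.append(eigvals[start:end].mean())
            projectors.append(block @ block.T)
            start = end
    return np.array(distinct), projectors

# Made-up example: eigenvalue 0 with multiplicity 2 and eigenvalue 3, in a rotated basis.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
G = Q @ np.diag([0.0, 0.0, 3.0]) @ Q.T
lam, P = frobenius_covariance(G)

print(np.allclose(P[0] @ P[1], 0.0))         # P_i P_j = 0 for i != j
print(np.allclose(P[0] @ P[0], P[0]))        # P_i^2 = P_i
print(np.allclose(sum(P), np.eye(3)))        # sum_i P_i = I_m
```

Grouping by tolerance is one possible way to decide which computed eigenvalues count as equal; in floating point a repeated eigenvalue is rarely reproduced exactly.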

Definition 6.2

Let $t_1,\ldots ,t_m$ be real values and let $f_0 : {{\mathrm{I}\!\mathrm{R}}}\rightarrow {{\mathrm{I}\!\mathrm{R}}}$ be twice continuously differentiable. Then, we define

$$\begin{aligned}&\Delta f_0(t_k,t_l):=\left\{ \begin{array}{ll} \frac{f_0(t_k)-f_0(t_l)}{t_k-t_l},&{}k\ne l,\\ f_0^\prime (t_k),&{}k=l, \end{array}\right. \\&\Delta ^2 f_0(t_k,t_l,t_q):=\left\{ \begin{array}{ll} \frac{\Delta f_0(t_k,t_q)-\Delta f_0(t_l,t_q)}{t_k-t_l},&{}k\ne l,\\ \frac{\Delta f_0(t_k,t_l)-\Delta f_0(t_q,t_l)}{t_k-t_q},&{}k=l\ne q,\\ f_0^{\prime \prime }(t_k),&{}k=l=q. \end{array}\right. \end{aligned}$$

The following result, from [41, Theorem 6.6.30], will be used to prove Proposition 4.2.

Lemma 6.1

Let $f_0 : {{\mathrm{I}\!\mathrm{R}}}\rightarrow {{\mathrm{I}\!\mathrm{R}}}$ and $G: {{\mathrm{I}\!\mathrm{R}}}^n\rightarrow {\mathcal {S}}^m$ be twice continuously differentiable. Define $F : {\mathcal {S}}^m\rightarrow {\mathcal {S}}^m$ by $F(Z):= U f_1(D)U^T$, where $Z= UD U^T$ is the spectral decomposition of $Z$, $D=\mathrm{diag}\left( \lambda _1, \ldots , \lambda _m\right) $ and $f_1(D) = \mathrm{diag}\left( f_0(\lambda _1),\ldots , f_0(\lambda _m)\right) $, where $\lambda _i$, $i=1,\ldots ,m$ are eigenvalues of $Z$ listed in the decreasing order. Let $\lambda _1(x),\ldots ,\lambda _{\mu (x)}(x)$ be the $\mu (x)$ increasingly ordered eigenvalues of the matrix $G(x)$. Then, $F(G(x))$ is twice continuously differentiable with

$$\begin{aligned} \frac{\partial F(G(x))}{\partial x_i}&= \sum _{k,l=1}^{\mu (x)}\Delta f_0(\lambda _k(x),\lambda _l(x)) P_k(x) G_i^\prime (x) P_l(x),\\ \frac{\partial ^2 F(G(x))}{\partial x_i\partial x_j}&= \sum _{k,l=1}^{\mu (x)}\Delta f_0(\lambda _k(x),\lambda _l(x)) P_k(x) G_{ij}^{\prime \prime }(x) P_l(x) \\&\quad +\sum _{k,l,q=1}^{\mu (x)}\Delta ^2 f_0(\lambda _k(x),\lambda _l(x), \lambda _q(x))(M_{klq} +M^T_{klq} ), \end{aligned}$$

where $P_k(x)$, $k=1,\ldots ,\mu (x)$ are Frobenius covariance matrices of $G(x)$, and the matrix $M_{klq}$ is given by $M_{klq}:=P_k(x) G_i^{\prime }(x) P_l(x) G_j^{\prime }(x)P_q(x)$.
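The first-derivative formula of Lemma 6.1 can be sanity-checked against a central finite difference. The following is a minimal sketch under stated assumptions, not the authors' code: it assumes NumPy, takes the sample function $f_0(t)=1-e^{-t}$, and uses an affine map $G(x)=A_0+xA_1$ with made-up matrices, so that $G^\prime (x)=A_1$. The sum over projectors $P_k$, $P_l$ is written as an equivalent sum over individual eigenvectors, and all names (`matrix_function`, `derivative_formula`, `A0`, `A1`) are hypothetical.

```python
import numpy as np

def f0(t):
    return 1.0 - np.exp(-t)        # sample smooth scalar function (assumption)

def df0(t):
    return np.exp(-t)              # its derivative

def matrix_function(Z):
    """F(Z) = U diag(f0(lambda_1), ..., f0(lambda_m)) U^T for symmetric Z."""
    lam, U = np.linalg.eigh(Z)
    return (U * f0(lam)) @ U.T

def divided_difference(a, b, tol=1e-8):
    """First divided difference Delta f0(a, b) from Definition 6.2."""
    return df0(a) if abs(a - b) < tol else (f0(a) - f0(b)) / (a - b)

def derivative_formula(Z, dZ):
    """sum_{k,l} Delta f0(lam_k, lam_l) P_k dZ P_l, summed eigenvector by eigenvector."""
    lam, U = np.linalg.eigh(Z)
    m = len(lam)
    D = np.array([[divided_difference(lam[k], lam[l]) for l in range(m)]
                  for k in range(m)])
    return U @ (D * (U.T @ dZ @ U)) @ U.T

# Made-up data: A0 has a repeated eigenvalue, A1 plays the role of G'(x).
A0 = np.diag([0.0, 0.0, 2.0])
A1 = np.array([[1.0, 0.5, -0.3],
               [0.5, -1.0, 0.2],
               [-0.3, 0.2, 0.7]])

h = 1e-6
fd = (matrix_function(A0 + h * A1) - matrix_function(A0 - h * A1)) / (2.0 * h)
exact = derivative_formula(A0, A1)
print("max abs difference:", np.max(np.abs(fd - exact)))
```

The test point deliberately has a repeated eigenvalue, which is the situation where the case $k=l$ of $\Delta f_0$ in Definition 6.2 matters.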

Proof of Proposition 4.2. We first show conclusion (i). Let $\lambda _i^*\ge 0$, $i=1,\ldots ,m$ be eigenvalues of $G(x^*)$ of rank $r$ listed in the decreasing order. Since $\varLambda ^*\bullet G(x^*)=0$ is equivalent to $ \lambda _i(\varLambda ^*) \lambda _i^*=0$ for $i=1,\ldots ,m$, we obtain $\lambda _i(\varLambda ^*)=0$ for $i=1,\ldots ,r$. Let us define $\psi _c(t):=c^{-1}\psi (ct)$ for any $t\in {{\mathrm{I}\!\mathrm{R}}}$. By the property $\psi (0) = 0$, we have $ \psi _c(\lambda _i^*)=0$ for $i=r+1,\ldots ,m$. Thus, $\lambda _i(\varLambda ^*) \psi _c(\lambda _i^*)=0$ for $i=1,\ldots ,m$, which, by the definition of $\varPsi _c$, means that $\varLambda ^*\bullet \varPsi _c(G(x^*))=0$. Similarly, $\varLambda ^*\bullet \varPhi _c(G(x^*))=0$. Observe from condition (D2) that $G(x^*)\succeq 0$ implies $\mathrm{Tr}\left( \varXi _c(G(x^*))\right) =0$. Hence, $L_i(x^*,\varLambda ^*,\mu ^*,c)= f(x^*)$ $(i=1,2,3)$ for any $c > 0$.

Next, we prove conclusion (ii) for $L_1$. The proof for $L_2$ can be constructed by using similar arguments and the properties (C1)–(C3).

Let $\lambda _1^*,\ldots ,\lambda ^*_{\mu (x^*)} $ be the $\mu (x^*)$ distinct eigenvalues of $G(x^*)$ and let $P_i(x^*)$ be defined by (90) for $i=1,\ldots ,\mu (x^*)$. Note that $P_i(x^*)P_j(x^*)=0$ for all $i\ne j$ and $P_i(x^*)^2=P_i(x^*) $ for all $i,j=1,\ldots ,\mu (x^*)$. Let $P_0:=EE^T$ with $E=[e_{r+1},\ldots ,e_m]$. We see that $P_0=P_{\mu (x^*)}(x^*)$. Since $\varLambda ^*\bullet G(x^*)=0$ and $\varLambda ^*,G(x^*) \in {\mathcal {S}}^m_+ $ imply $\varLambda ^* G(x^*)=G(x^*)\varLambda ^*$, by Theorem 1.3.12 of [41], $\varLambda ^*$ and $G(x^*)$ are simultaneously diagonalizable. So, $\varLambda ^*={\bar{ E}}D_{\varLambda ^*}{\bar{E}}^T$, where $D_{\varLambda ^*}:=\mathrm{diag}\left( 0,\ldots ,0,\lambda _{r+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*)\right) $, and ${\bar{E}}:=[e_1,\ldots ,e_m]$ with ${\bar{E}}^T\bar{E}=I_m$. Let $D_0:=\mathrm{diag}(\underbrace{0,\ldots ,0}_{r},\underbrace{1,\ldots ,1}_{m-r})$. Then, $P_0={\bar{E}}D_0\bar{E}^T$ and

$$\begin{aligned} P_0\varLambda ^*P_0={\bar{E}} D_0{\bar{E}}^T\bar{E}D_{\varLambda ^*}{\bar{E}}^T\bar{E}D_0{\bar{E}}^T=\bar{E}D_0 D_{\varLambda ^*} D_0{\bar{E}}^T ={\bar{E}} D_{\varLambda ^*} {\bar{E}}^T=\varLambda ^*.\nonumber \\ \end{aligned}$$

(91)

By Lemma 6.1 and (91), we can obtain

$$\begin{aligned}&\left[ \varLambda ^*\bullet \frac{\partial }{\partial x_i}\varPsi _c(G(x^*))\right] _{i =1}^n = \left[ (P_0\varLambda ^*P_0)\bullet \sum _{k,l=1}^{\mu (x^*)} \Delta \psi _c(\lambda ^*_k,\lambda ^*_l)P_k(x^*)G_i^\prime (x^*)P_l(x^*)\right] _{i =1}^n\\&= \left[ \varLambda ^* \bullet \sum _{k,l=1}^{\mu (x^*)} \Delta \psi _c(\lambda ^*_k,\lambda ^*_l)P_0P_k(x^*)G_i^\prime (x^*)P_l(x^*)P_0\right] _{i =1}^n\\&= \left[ \varLambda ^* \bullet \sum _{k,l=1}^{\mu (x^*)} \Delta \psi _c(\lambda ^*_k,\lambda ^*_l)P_{\mu (x^*)}(x^*)P_k(x^*)G_i^\prime (x^*)P_l(x^*)P_{\mu (x^*)}(x^*)\right] _{i =1}^n\\&= \left[ \varLambda ^* \bullet \left( \Delta \psi _c(\lambda ^*_{\mu (x^*)},\lambda ^*_{\mu (x^*)})P_{\mu (x^*)}(x^*) G_i^\prime (x^*) P_{\mu (x^*)}(x^*)\right) \right] _{i =1}^n\\&= \left[ \varLambda ^* \bullet \left( \psi _c^\prime (0)P_0 G_i^\prime (x^*) P_0\right) \right] _{i =1}^n = \psi ^\prime (0)\left[ \varLambda ^*\bullet G_i^\prime (x^*)\right] _{i =1}^n\!, \end{aligned}$$

From this, together with $\psi ^\prime (0)=1$ and $h(x^*)=0$, it follows immediately that

$$\begin{aligned} \nabla _x L_1(x^*,\varLambda ^*,\mu ^*,c)&= \nabla f(x^*) -\psi ^\prime (0)\left[ \varLambda ^*\bullet G_i^\prime (x^*)\right] _{i =1}^n+\nabla h(x^*)(\mu ^*+ch(x^*))\\&= \nabla _x L_0(x^*,\varLambda ^*,\mu ^*)=0. \end{aligned}$$

Moreover, we have

$$\begin{aligned}&\left[ \varLambda ^*\bullet \frac{\partial ^2}{\partial x_i\partial x_j}\varPsi _c(G(x^*))\right] _{i,j =1}^n=\left[ (P_0\varLambda ^*P_0)\bullet \frac{\partial ^2}{\partial x_i\partial x_j}\varPsi _c(G(x^*))\right] _{i,j =1}^n\nonumber \\&= \left[ \varLambda ^*\bullet \left( \sum _{k,l=1}^{\mu (x^*)} \Delta \psi _c(\lambda ^*_k,\lambda ^*_l)P_0P_k(x^*)G_{ij}^{\prime \prime }(x^*)P_l(x^*)P_0\right) \right] _{i,j =1}^n\nonumber \\&+\left[ \varLambda ^*\bullet \left( \sum _{k,l,q=1}^{\mu (x^*)} \Delta ^2 \psi _c(\lambda ^*_k,\lambda ^*_l,\lambda ^*_q)P_0P_k(x^*)G_i^{\prime }(x^*)P_l(x^*)G_j^{\prime }(x^*)P_q(x^*)P_0\right) \right] _{i,j =1}^n\nonumber \\&+\left[ \varLambda ^*\bullet \left( \sum _{k,l,q=1}^{\mu (x^*)} \Delta ^2 \psi _c(\lambda ^*_k,\lambda ^*_l,\lambda ^*_q)P_0P_q(x^*)G_j^{\prime }(x^*)P_l(x^*)G_i^{\prime }(x^*)P_k(x^*)P_0\right) \right] _{i,j =1}^n\nonumber \\&= \left[ \varLambda ^* \bullet \left( \Delta \psi _c(\lambda ^*_{\mu (x^*)},\lambda ^*_{\mu (x^*)})P_{\mu (x^*)}(x^*) G_{ij}^{\prime \prime }(x^*) P_{\mu (x^*)}(x^*)\right) \right] _{i,j =1}^n\nonumber \\&+ \left[ \varLambda ^* \bullet \left( P_{\mu (x^*)}(x^*) N_{ij}(x^*) P_{\mu (x^*)}(x^*) \right) \right] _{i,j =1}^n\nonumber \\&+\left[ \varLambda ^* \bullet \left( P_{\mu (x^*)}(x^*) N_{ji}(x^*) P_{\mu (x^*)}(x^*)\right) \right] _{i,j =1}^n\nonumber \\&= \left[ \varLambda ^* \bullet \left( \psi _c^\prime (0)P_0 G_{ij}^{\prime \prime }(x^*) P_0\right) \right] _{i,j =1}^n+2\left[ \varLambda ^* \bullet (P_0 N_{ij}(x^*)P_0)\right] _{i,j =1}^n\nonumber \\&= \psi ^\prime (0)\left[ \varLambda ^*\bullet G_{ij}^{\prime \prime }(x^*)\right] _{i,j =1}^n+2\left[ \varLambda ^* \bullet N_{ij}(x^*) \right] _{i,j =1}^n\!, \end{aligned}$$

(92)

where

$$\begin{aligned} N_{ij}(x^*) =G_i^\prime (x^*)\left( \sum _{k=1}^{\mu (x^*)} \Delta ^2 \psi _c(\lambda ^*_k,0,0)P_k(x^*)\right) G_j^{\prime }(x^*),\quad i,j=1,\ldots ,n. \end{aligned}$$

By Definition 6.2, via a straightforward computation, we obtain

$$\begin{aligned} \Delta ^2 \psi _c(\lambda ^*_k,0,0)=\left\{ \begin{array}{ll} c \psi ^{\prime \prime }(0), &{}\quad k=\mu (x^*)\\ \frac{c^{-1}\psi (c\lambda ^*_k)}{(\lambda ^*_k)^2}-\frac{\psi ^\prime (0)}{\lambda ^*_k},&{}\quad k<\mu (x^*) \end{array}\right. . \end{aligned}$$
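The computation behind this formula can be spelled out as follows. It uses $\lambda ^*_{\mu (x^*)}=0$ (consistent with $P_0=P_{\mu (x^*)}(x^*)$ above) and $\psi _c(t)=c^{-1}\psi (ct)$ with $\psi (0)=0$, so that $\psi _c(0)=0$, $\psi _c^\prime (0)=\psi ^\prime (0)$ and $\psi _c^{\prime \prime }(0)=c\psi ^{\prime \prime }(0)$. For $k<\mu (x^*)$, the first case of Definition 6.2 gives

$$\begin{aligned} \Delta ^2 \psi _c(\lambda ^*_k,0,0)&=\frac{\Delta \psi _c(\lambda ^*_k,0)-\Delta \psi _c(0,0)}{\lambda ^*_k-0} =\frac{1}{\lambda ^*_k}\left( \frac{\psi _c(\lambda ^*_k)-\psi _c(0)}{\lambda ^*_k}-\psi _c^\prime (0)\right) \\&=\frac{c^{-1}\psi (c\lambda ^*_k)}{(\lambda ^*_k)^2}-\frac{\psi ^\prime (0)}{\lambda ^*_k}, \end{aligned}$$

while for $k=\mu (x^*)$ all three arguments coincide and the third case gives $\Delta ^2 \psi _c(0,0,0)=\psi _c^{\prime \prime }(0)=c\psi ^{\prime \prime }(0)$.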

Note that $\psi ^\prime (0)=1$. It then follows from (92) and the definition of $L_1$ that

$$\begin{aligned} \nabla _{xx}^2 L_1(x^*,\varLambda ^*,\mu ^*,c)&= \underbrace{{\nabla ^2f(x^*) -\psi ^{\prime }(0)\left[ \varLambda ^*\bullet G_{ij}^{\prime \prime }(x^*)\right] _{i,j=1}^n+\sum _{i=1}^p\mu ^*_i\nabla ^2 h_i(x^*)}}_{{\nabla _{xx}^2 L_0(x^*,\varLambda ^*,\mu ^*)}}\nonumber \\&+\,H_c(x^*,\varLambda ^*)+c M(x^*,\varLambda ^*)+c\nabla h(x^*)\nabla h(x^*)^T , \end{aligned}$$

(93)

which is just (38), where

$$\begin{aligned} H_c(x^*,\varLambda ^*)=-2\left[ \varLambda ^*\bullet \left( G_i^\prime (x^*) \sum _{k=1}^{\mu (x^*)-1}\left( \frac{c^{-1}\psi (c\lambda _k^* )}{(\lambda ^*_k)^2}-\frac{1}{\lambda _k^* }\right) P_k(x^*) G_j^\prime (x^*)\right) \right] _{i,j=1}^n, \end{aligned}$$

(94)

$$\begin{aligned}&M(x^*,\varLambda ^*)=-2\psi ^{\prime \prime }(0)\left[ \varLambda ^*\bullet \left( G_i^\prime (x^*)P_{\mu (x^*)}G_j^\prime (x^*)\right) \right] _{i,j=1}^n. \end{aligned}$$

(95)

By condition (B1), $\psi (0)=0$ and $\psi ^\prime (0)=1$, we know that $\psi (t)< t$ for any $t\ne 0$, and hence $\frac{c^{-1}\psi (c\lambda _k^* )}{(\lambda ^*_k)^2}-\frac{1}{\lambda _k^* }<0$ for $k=1,\ldots ,\mu (x^*)-1$. Together with $P_k(x^*)\succeq 0$ for all $k$ and $\varLambda ^*\succeq 0$, we then infer from (94) that $H_c(x^*,\varLambda ^*)$ is positive semidefinite. By the condition (B3) for $\psi $, we see that $\lim _{c\rightarrow \infty }\frac{\psi (c\tau )}{c}=0$ for any $\tau >0$. Note also that $\lambda _k^*>0$ for $k=1,\ldots ,r$ and that, by the definition of $P_k(x)$,

$$\begin{aligned} \sum _{k=1}^{\mu (x^*)-1}\frac{1}{\lambda _k^* } P_k(x^*)=\sum _{k=1}^r \frac{1}{\lambda _k^* } e_ke_k^T. \end{aligned}$$

We then deduce from (94) that

$$\begin{aligned} \lim _{c\rightarrow \infty }H_{c}(x^{*},\varLambda ^*)&= 2\left[ \varLambda ^{*}\bullet \left( G_i^{\prime }(x^{*})\left( \sum _{k=1}^{r} \frac{1}{\lambda _{k}^{*}} e_ke_k^T\right) G_j^\prime (x^*)\right) \right] _{i,j=1}^n\nonumber \\&= 2\left[ \varLambda ^{*}\bullet \left( G_{i}^{\prime }(x^{*})[G(x^{*})]^{\dag } G_{j}^{\prime }(x^{*})\right) \right] _{i,j=1}^{n}\nonumber \\&= H(x^{*},\varLambda ^{*}), \end{aligned}$$

(96)

proving (39). Note that $s=m-\mathrm{rank}(\varLambda ^*)\ge r$. Then, $\lambda _i(\varLambda ^*)=0$ for $i=1,\ldots ,s$ and

$$\begin{aligned} D_{\varLambda ^*}=\mathrm{diag}\left( 0,\ldots ,0,\lambda _{s+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*)\right) \!. \end{aligned}$$

Note that $\varLambda ^*={\bar{E}}D_{\varLambda ^*}{\bar{E}}^T$ with ${\bar{E}}=[e_1,\ldots ,e_m]$ and $P_{\mu (x^*)}=P_0=EE^T$ with $E=[e_{r+1},\ldots ,e_m]$. It then follows from (95) that

$$\begin{aligned} M(x^*,\varLambda ^*)&= -2\psi ^{\prime \prime }(0)\left[ ({\bar{E}}D_{\varLambda ^*}{\bar{E}}^T)\bullet \left( G_i^\prime (x^*)EE^TG_j^\prime (x^*)\right) \right] _{i,j=1}^n\nonumber \\&= -2\psi ^{\prime \prime }(0)\left[ \sum _{k=s+1}^m \lambda _k(\varLambda ^*)e_k^T G_i^\prime (x^*)\left( \sum _{l=r+1}^me_le^T_l\right) G_j^\prime (x^*) e_k\right] _{i,j=1}^n\nonumber \\&= -2\psi ^{\prime \prime }(0)\left[ \sum _{k=s+1}^m \sum _{l=r+1}^m\lambda _k(\varLambda ^*)\left( e_k^T G_i^\prime (x^*)e_le^T_lG_j^\prime (x^*) e_k\right) \right] _{i,j=1}^n\nonumber \\&= -2\psi ^{\prime \prime }(0)BQ^*_sB^T, \end{aligned}$$

(97)

where $Q^*_s: =\mathrm{diag}\left( \lambda _{s+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*),\ldots , \lambda _{s+1}(\varLambda ^*),\ldots ,\lambda _m(\varLambda ^*)\right) \in {{\mathrm{I}\!\mathrm{R}}}^{{\bar{m}}\times {\bar{m}}}$ with ${\bar{m}}:=(m-s)(m-r)$, and the $i$-th row of the matrix $B\in {{\mathrm{I}\!\mathrm{R}}}^{n\times {\bar{m}}}$ is $b_i^T$, where

$$\begin{aligned} b_i\!:=\!\left( e_{s+1}^TG_i^\prime (x^*)e_{r+1},\ldots ,e_{m}^TG_i^\prime (x^*)e_{r+1},\ldots ,e_{s+1}^TG_i^\prime (x^*)e_m,\ldots ,e_{m}^TG_i^\prime (x^*)e_m\right) ^T\!\!\!. \end{aligned}$$

Since $\mathrm{rank}(\varLambda ^*)=m-s$ implies $\lambda _i(\varLambda ^*)>0$ for $i=s+1,\ldots ,m$, the matrix $Q^*_s$ is positive definite. Hence, by $\psi ^{\prime \prime }(0)<0$, we see from (97) that $M(x^*,\varLambda ^*)$ is positive semidefinite. Now assume that, for some nonzero ${\bar{d}}\in {{\mathrm{I}\!\mathrm{R}}}^n$,

$$\begin{aligned} {\bar{d}}^T M(x^*,\varLambda ^*){\bar{d}} =0. \end{aligned}$$

Since $Q^*_s$ is positive definite, it then follows from (97) that $B^T{\bar{d}}=\sum _{i=1}^n{\bar{d}}_ib_i=0$. So,

$$\begin{aligned} e_{k}^T\left( \sum _{i=1}^n{\bar{d}}_iG_i^\prime (x^*)\right) e_j=0, ~k=s+1,\ldots ,m,~j=r+1,\ldots ,m. \end{aligned}$$

This proves that (40) holds.
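The behaviour of $H_c(x^*,\varLambda ^*)$ and $M(x^*,\varLambda ^*)$ established above can be illustrated on a small numerical instance. The sketch below is an illustration under stated assumptions rather than the authors' construction: it assumes NumPy, works directly in the eigenbasis of $G(x^*)$, uses made-up matrices in place of $G(x^*)$, $\varLambda ^*$ and $G_i^\prime (x^*)$, and takes the sample penalty $\psi (t)=1-e^{-t}$. This $\psi$ has the properties used in the proof ($\psi (0)=0$, $\psi ^\prime (0)=1$, $\psi ^{\prime \prime }(0)<0$, $\psi (t)<t$ for $t\ne 0$, and $\psi (c\tau )/c\rightarrow 0$), but whether it meets conditions (B1)–(B3) exactly as stated in the paper is an assumption, since those conditions are not reproduced in this appendix.

```python
import numpy as np

def psi(t):
    # Assumed sample penalty: psi(0)=0, psi'(0)=1, psi''(0)=-1, psi(t)<t for t!=0.
    return 1.0 - np.exp(-t)

PSI_DD0 = -1.0  # psi''(0) for the sample psi above

# Made-up data in the eigenbasis of G(x*): rank r = 2, so P_0 = e_3 e_3^T.
G_star = np.diag([2.0, 1.0, 0.0])
Lam_star = np.diag([0.0, 0.0, 1.5])    # complementary: Lam_star . G_star = 0
A = [np.array([[1.0, 0.5, -0.3], [0.5, -1.0, 0.2], [-0.3, 0.2, 0.7]]),
     np.array([[0.0, 1.0, 0.4], [1.0, 0.3, -0.6], [0.4, -0.6, -0.2]])]  # stand-ins for G_i'(x*)
P0 = np.diag([0.0, 0.0, 1.0])

def bilinear(W):
    """Matrix with entries Lam_star . (A_i W A_j), i, j = 1, ..., n."""
    return np.array([[np.trace(Lam_star @ Ai @ W @ Aj) for Aj in A] for Ai in A])

def H_c(c):
    """H_c from (94), specialised to a diagonal G_star."""
    lam = np.diag(G_star)
    pos = lam > 0
    coef = np.zeros_like(lam)
    coef[pos] = psi(c * lam[pos]) / (c * lam[pos] ** 2) - 1.0 / lam[pos]
    return -2.0 * bilinear(np.diag(coef))

H_limit = 2.0 * bilinear(np.linalg.pinv(G_star))   # H in (96)
M = -2.0 * PSI_DD0 * bilinear(P0)                  # M in (95)

for c in (1.0, 10.0, 1e3, 1e6):
    gap = np.max(np.abs(H_c(c) - H_limit))
    min_eig = np.linalg.eigvalsh(H_c(c)).min()
    print(f"c={c:g}  min eig H_c={min_eig:.3e}  max|H_c - H|={gap:.3e}")
print("min eig M:", np.linalg.eigvalsh(M).min())
```

On this instance one would expect the smallest eigenvalue of $H_c$ to stay nonnegative up to rounding and the gap to $H$ to shrink as $c$ grows, in line with (96); the numbers themselves depend entirely on the made-up data.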

Finally, we prove conclusion (ii) for $L_3$. Note that $\mathrm{Tr}\left( \varXi _c(G(x))\right) =I_m\bullet \varXi _c(G(x))$. By Lemma 6.1, similarly to the first-order computation above and to (92), we have

$$\begin{aligned} \left[ I_m\bullet \frac{\partial }{\partial x_i}\varXi _c(G(x^*))\right] _{i =1}^n&= \left[ I_m\bullet \sum _{k,l=1}^{\mu (x^*)} \Delta \xi _c(\lambda ^*_k,\lambda ^*_l)P_k(x^*)G_i^\prime (x^*)P_l(x^*)\right] _{i =1}^n \nonumber \\&= \left[ \sum _{k,l=1}^{\mu (x^*)} \Delta \xi _c(\lambda ^*_k,\lambda ^*_l)\mathrm{Tr}\left( P_k(x^*)G_i^\prime (x^*)P_l(x^*)\right) \right] _{i =1}^n\nonumber \\&= \left[ \sum _{k,l=1}^{\mu (x^*)} \Delta \xi _c(\lambda ^*_k,\lambda ^*_l)\mathrm{Tr}\left( P_k(x^*)P_l(x^*)G_i^\prime (x^*)\right) \right] _{i =1}^n\nonumber \\&= \left[ \Delta \xi _c(\lambda ^*_{\mu (x^*)},\lambda ^*_{\mu (x^*)})\mathrm{Tr}\left( P_{\mu (x^*)}(x^*)G_i^\prime (x^*)\right) \right] _{i =1}^n\nonumber \\&= \xi ^\prime (0)\left[ P_{\mu (x^*)}(x^*)\bullet G_i^\prime (x^*)\right] _{i =1}^n\!. \end{aligned}$$

(98)

Moreover, we have

$$\begin{aligned}&\left[ I_m\bullet \frac{\partial ^2}{\partial x_i\partial x_j}\varXi _c(G(x^*))\right] _{i,j =1}^n \nonumber \\&= \left[ \sum _{k,l=1}^{\mu (x^*)} \Delta \xi _c(\lambda ^*_k,\lambda ^*_l)\mathrm{Tr}\left( P_k(x^*)G_{ij}^{\prime \prime }(x^*)P_l(x^*) \right) \right] _{i,j =1}^n\nonumber \\&+\left[ \sum _{k,l,q=1}^{\mu (x^*)} \Delta ^2 \xi _c(\lambda ^*_k,\lambda ^*_l,\lambda ^*_q)\mathrm{Tr}\left( P_k(x^*)G_i^{\prime }(x^*)P_l(x^*)G_j^{\prime }(x^*)P_q(x^*) \right) \right] _{i,j =1}^n \nonumber \\&+\left[ \sum _{k,l,q=1}^{\mu (x^*)} \Delta ^2 \xi _c(\lambda ^*_k,\lambda ^*_l,\lambda ^*_q)\mathrm{Tr}\left( P_q(x^*)G_j^{\prime }(x^*)P_l(x^*)G_i^{\prime }(x^*)P_k(x^*) \right) \right] _{i,j =1}^n\nonumber \\&= \left[ \Delta \xi _c(\lambda ^*_{\mu (x^*)},\lambda ^*_{\mu (x^*)})\mathrm{Tr}\left( P_{\mu (x^*)}(x^*) G_{ij}^{\prime \prime }(x^*) \right) \right] _{i,j =1}^n\nonumber \\&+ \left[ \mathrm{Tr}\left( P_{\mu (x^*)}(x^*) N_{ij}(x^*) \right) \right] _{i,j =1}^n+\left[ \mathrm{Tr}\left( P_{\mu (x^*)}(x^*) N_{ji}(x^*) \right) \right] _{i,j =1}^n\nonumber \\&= \xi ^\prime (0)\left[ P_{\mu (x^*)}(x^*)\bullet G_{ij}^{\prime \prime }(x^*)\right] _{i,j =1}^n+2\left[ P_{\mu (x^*)}(x^*) \bullet N_{ij}(x^*) \right] _{i,j =1}^n\!, \end{aligned}$$

(99)

where

$$\begin{aligned}&N_{ij}(x^*) =G_i^\prime (x^*)\left( \sum _{k=1}^{\mu (x^*)} \Delta ^2 \xi _c(\lambda ^*_k,0,0)P_k(x^*)\right) G_j^{\prime }(x^*),\quad i,j=1,\ldots ,n.\\&\Delta ^2 \xi _c(\lambda ^*_k,0,0)=\left\{ \begin{array}{ll} c \xi ^{\prime \prime }(0), &{}\quad k=\mu (x^*)\\ \frac{c^{-1}\xi (c\lambda ^*_k)}{(\lambda ^*_k)^2}-\frac{\xi ^\prime (0)}{\lambda ^*_k},&{}\quad k<\mu (x^*) \end{array}\right. . \end{aligned}$$

We note from conditions (D1) and (D2) that $\xi (t) = 0$ and $\xi ^\prime (t) = 0$ for $t \ge 0$ and $\xi ^{\prime \prime }(0)=0$. Note also that $\lambda ^*_k>0$ for all $k=1,\ldots ,r$. Therefore, from (98) and (99), we can obtain

$$\begin{aligned} \left[ I_m\bullet \frac{\partial }{\partial x_i}\varXi _c(G(x^*))\right] _{i =1}^n=0,\quad \left[ I_m\bullet \frac{\partial ^2}{\partial x_i\partial x_j}\varXi _c(G(x^*))\right] _{i,j =1}^n=0. \end{aligned}$$

(100)

By the definition of $L_3$ and $\psi ^\prime (0)=1$, it follows from the first-order computation for $\varPsi _c$ above, (92), and (100) that

$$\begin{aligned} \nabla _{x} L_{3}(x^{*},\varLambda ^{*},\mu ^*,c)&= \nabla f(x^*) - \psi ^\prime (0) [\varLambda ^*\bullet G_i^\prime (x^*)]_{i =1}^n+ \nabla h (x^*)\mu ^* \\&= \nabla _x L_0(x^*,\varLambda ^*,\mu ^*),\\ \nabla _{xx}^2 L_3(x^*,\varLambda ^*,\mu ^*,c)&= {\underbrace{\nabla ^2f(x^*) -\psi ^\prime (0) [\varLambda ^*\bullet G_{ij}^{\prime \prime }(x^*)]_{i,j=1}^n+\sum _{i=1}^p\mu ^*_i\nabla ^2 h_i(x^*)}_{\nabla _{xx}^2 L_0(x^*,\varLambda ^*,\mu ^*)}}\\&+\,H_c(x^*,\varLambda ^*)+cM(x^*,\varLambda ^*)+c\nabla h(x^*)\nabla h(x^*)^T , \end{aligned}$$

where $ H_c(x^*,\varLambda ^*)$ and $M(x^*,\varLambda ^*)$ are given in (94) and (95), and satisfy (39) and (40). The proof of the proposition is completed. $\square $

About this article

Cite this article

Luo, H., Wu, H. & Liu, J. On Saddle Points in Semidefinite Optimization via Separation Scheme. J Optim Theory Appl 165, 113–150 (2015). https://doi.org/10.1007/s10957-014-0634-3

Received: 17 November 2013
Accepted: 23 July 2014
Published: 12 August 2014
Issue Date: April 2015
DOI: https://doi.org/10.1007/s10957-014-0634-3
