Accelerated inexact composite gradient methods for nonconvex spectral optimization problems


Abstract

This paper presents two inexact composite gradient methods, one inner accelerated and another doubly accelerated, for solving a class of nonconvex spectral composite optimization problems. More specifically, the objective function for these problems is of the form \(f_{1}+f_{2}+h\), where \(f_{1}\) and \(f_{2}\) are differentiable nonconvex matrix functions with Lipschitz continuous gradients, \(h\) is a proper closed convex matrix function, and both \(f_{2}\) and \(h\) can be expressed as functions that operate on the singular values of their inputs. The methods essentially use an accelerated composite gradient method to solve a sequence of proximal subproblems involving the linear approximation of \(f_{1}\) and the singular value functions underlying \(f_{2}\) and \(h\). Unlike other composite gradient-based methods, the proposed methods take advantage of both the composite and spectral structure underlying the objective function in order to efficiently generate their solutions. Numerical experiments are presented to demonstrate the practicality of these methods on a set of real-world and randomly generated spectral optimization problems.
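To make the role of the spectral structure concrete, the following is a minimal Python sketch of a single prox-linear step under simplifying assumptions: \(f_{2}=0\) and \(h=\tau \Vert \cdot \Vert _{*}\) (the nuclear norm), whose proximal operator acts only on singular values via soft-thresholding. It is an illustration of the idea, not the authors' implementation (see the repository cited in the Notes); `grad_f1`, `lam`, and `tau` are hypothetical inputs.

```python
# A minimal sketch, assuming f2 = 0 and h = tau*||.||_* (nuclear norm).
# The composite gradient step linearizes f1 at Z and applies the prox of h,
# which, by the spectral structure, reduces to soft-thresholding of sigma(Z).
import numpy as np

def spectral_prox_grad_step(Z, grad_f1, lam, tau):
    """One proximal step for f1 + tau*||.||_* from Z with stepsize lam."""
    W = Z - lam * grad_f1(Z)                          # gradient step on f1
    P, s, Qt = np.linalg.svd(W, full_matrices=False)
    s = np.maximum(s - lam * tau, 0.0)                # vector prox on sigma(W)
    return P @ np.diag(s) @ Qt                        # reassemble the matrix

# Toy usage with f1(Z) = 0.5*||Z - A||_F^2, so grad_f1(Z) = Z - A.
A = np.random.default_rng(0).standard_normal((20, 10))
Z_new = spectral_prox_grad_step(np.zeros_like(A), lambda Z: Z - A, lam=1.0, tau=0.5)
```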

Notes

  1. See https://github.com/wwkong/nc_opt/tree/master/tests/papers/icg.

References

  1. Ahn, M., Pang, J.-S., Xin, J.: Difference-of-convex learning: directional stationarity, optimality, and sparsity. SIAM J. Optim. 27(3), 1637–1665 (2017)

  2. Beck, A.: First-Order Methods in Optimization, vol. 25. SIAM, Philadelphia (2017)

  3. Candes, E.J., Eldar, Y.C., Strohmer, T., Voroninski, V.: Phase retrieval via matrix completion. SIAM Rev. 57(2), 225–251 (2015)

  4. Carmon, Y., Duchi, J.C., Hinder, O., Sidford, A.: Accelerated methods for nonconvex optimization. SIAM J. Optim. 28(2), 1751–1772 (2018)

  5. Drusvyatskiy, D., Paquette, C.: Efficiency of minimizing compositions of convex functions and smooth maps. Math. Program. 178(1–2), 503–558 (2019)

  6. Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156, 59–99 (2016)

  7. Ghadimi, S., Lan, G., Zhang, H.: Generalized uniformly optimal methods for nonlinear programming. arXiv e-prints, arXiv:1508.07384, August 2015

  8. He, Y., Monteiro, R.D.C.: An accelerated HPE-type algorithm for a class of composite convex-concave saddle-point problems. SIAM J. Optim. 26(1), 29–56 (2016)

  9. Kong, W., Melo, J.G., Monteiro, R.D.C.: Complexity of a quadratic penalty accelerated inexact proximal point method for solving linearly constrained nonconvex composite programs. SIAM J. Optim. 29(4), 2566–2593 (2019)

  10. Kong, W., Melo, J.G., Monteiro, R.D.C.: An efficient adaptive accelerated inexact proximal point method for solving linearly constrained nonconvex composite problems. Comput. Opt. Appl. 76(2), 305–346 (2020)

  11. Lewis, A.S.: The convex analysis of unitarily invariant matrix functions. J. Convex Anal. 2(1), 173–183 (1995)

  12. Liang, J., Monteiro, R.D.C.: A doubly accelerated inexact proximal point method for nonconvex composite optimization problems. arXiv e-prints, arXiv:1811.11378, November 2018

  13. Liang, J., Monteiro, R.D.C., Sim, C.-K.: A FISTA-type accelerated gradient algorithm for solving smooth nonconvex composite optimization problems. arXiv e-prints, arXiv:1905.07010, May 2019

  14. Monteiro, R.D.C., Ortiz, C., Svaiter, B.F.: An adaptive accelerated first-order method for convex optimization. Comput. Optim. Appl. 64, 31–73 (2016)

  15. Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013)

  16. Paquette, C., Lin, H., Drusvyatskiy, D., Mairal, J., Harchaoui, Z.: Catalyst acceleration for gradient-based non-convex optimization. arXiv e-prints, arXiv:1703.10993, March 2017

  17. Sun, T., Zhang, C.-H.: Calibrated elastic regularization in matrix completion. In: Advances in Neural Information Processing Systems, pp. 863–871 (2012)

  18. Wen, B., Chen, X., Pong, T.K.: A proximal difference-of-convex algorithm with extrapolation. Comput. Optim. Appl. 69(2), 297–324 (2018)

  19. Wen, F., Ying, R., Liu, P., Qiu, R.C.: Robust PCA using generalized nonconvex regularization. IEEE Trans. Circuits Syst. Video Technol. 30, 1497–1510 (2019)

  20. Yao, Q., Kwok, J.T.: Efficient learning with a family of nonconvex regularizers by redistributing nonconvexity. J. Mach. Learn. Res. 18(1), 6574–6625 (2017)


Acknowledgements

The authors would like to thank the two anonymous referees and the associate editor for their insightful comments on earlier drafts of this paper.

Author information

Corresponding author

Correspondence to Weiwei Kong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work of the authors was partially supported by ONR Grant N00014-18-1-2077, AFOSR Grant FA9550-22-1-0088, NSERC Grant PGSD3-516700-2018, and the IDEaS-TRIAD Fellowship (NSF Grant CCF-1740776). The first author was also supported by the US Department of Energy (DOE) and UT-Battelle, LLC, under contract DE-AC05-00OR22725, and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.

Appendices

A Technical bounds

The result below presents a basic property of the composite gradient step.

Proposition 19

Let \(h\in \overline{\mathrm{Conv}}\ (\mathcal{Z})\), \(z\in \mathrm{dom}\,h\), and g be a differentiable function on \(\mathrm{dom}\,h\) which satisfies \(g(u)-\ell _{g}(u;z)\le L\Vert u-z\Vert ^{2}/2\) for some \(L\ge 0\) and every \(u\in \mathrm{dom}\,g\). Moreover, define

$$\begin{aligned} {\hat{z}}:=\mathop {\mathrm{argmin}}\limits _{u}\left\{ \ell _{g}(u;z)+h(u)+\frac{L}{2}\Vert u-z\Vert ^{2}\right\} . \end{aligned}$$

Then, it holds that

$$\begin{aligned} \frac{L}{2}\Vert z-{{\hat{z}}}\Vert ^{2}\le (g+h)(z)-(g+h)({\hat{z}}). \end{aligned}$$

Proof

Using the definition of \({\hat{z}}\), the fact that \(\ell _{g}(\cdot ;z)+h(\cdot )+L\Vert \cdot -z\Vert ^{2}/2\) is L-strongly convex, and the assumed bound \(g(u)-\ell _{g}(u;z)\le L\Vert u-z\Vert ^{2}/2\) at \(u={\hat{z}}\), we have

$$\begin{aligned} (g+h)(z)&=\ell _{g}(z;z)+h(z)\ge \ell _{g}({\hat{z}};z)+h({\hat{z}})+ L\Vert {\hat{z}}-z\Vert ^{2} \ge (g+h)({\hat{z}}) + \frac{L}{2}\Vert {\hat{z}}-z\Vert ^{2}. \end{aligned}$$

\(\square\)
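A quick numerical sanity check of Proposition 19, sketched under assumed instances (\(\mathcal{Z}={\mathbb {R}}^{n}\), g a convex quadratic, and h the \(\ell _{1}\) norm, for which the composite gradient step has the closed form \({\hat{z}}=\mathrm{prox}_{h/L}(z-\nabla g(z)/L)\)):

```python
# A hedged numerical check of Proposition 19 with g(u) = 0.5*u'Qu (Q PSD,
# L = lambda_max(Q)) and h = ||.||_1; z_hat is the composite gradient step,
# computed explicitly as soft-thresholding of the gradient step.
import numpy as np

rng = np.random.default_rng(1)
n = 50
Q = rng.standard_normal((n, n))
Q = Q.T @ Q                                   # PSD Hessian of g
L = np.linalg.eigvalsh(Q).max()               # Lipschitz constant of grad g

def g(u): return 0.5 * u @ Q @ u
def h(u): return np.linalg.norm(u, 1)

z = rng.standard_normal(n)
w = z - (Q @ z) / L                           # gradient step
z_hat = np.sign(w) * np.maximum(np.abs(w) - 1.0 / L, 0.0)   # prox of h/L

lhs = 0.5 * L * np.linalg.norm(z - z_hat) ** 2
rhs = (g(z) + h(z)) - (g(z_hat) + h(z_hat))
assert lhs <= rhs + 1e-8                      # descent bound of Proposition 19
```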

B R-ACG algorithm

This section presents technical results related to the R-ACG algorithm.

The first set of results describes some basic properties of the generated iterates.

Proposition 20

If \(\psi _{s}\) is \(\mu\)–strongly convex, then the following statements hold:

  1. (a)

    \(z_{j}^{c}=\mathrm{argmin}_{u\in \mathcal{Z}}\left\{ B_{j}\Gamma _{j}(u)+\Vert u-z_{0}^{c}\Vert ^{2}/2\right\}\);

  2. (b)

    \(\Gamma _{j}\le \psi\) and \(B_{j}\psi (z_{j})\le \inf _{u\in \mathcal{Z}}\left\{ B_{j}\Gamma _{j}(u)+\Vert u-z_{0}^{c}\Vert ^{2}/2\right\}\);

  3. (c)

    \(\eta _{j}\ge 0\) and \(r_{j}\in {\partial }_{\eta _{j}}\left( \psi -\mu \Vert \cdot -z_{j}\Vert ^{2}/2\right) (z_{j})\);

  4. (d)

    it holds that

    $$\begin{aligned} \left( \frac{1}{1+\mu B_{j}}\right) \Vert B_{j}r_{j}+z_{j}-z_{0}\Vert ^{2}+2B_{j}\eta _{j}\le \Vert z_{j}-z_{0}\Vert ^{2}. \end{aligned}$$

Proof

(a) See [14, Proposition 1].

(b) See [14, Proposition 1(b)].

(c) The optimality of \(z_{j}^{c}\) in part (a), the \(\mu\)-strong convexity of \(\Gamma _{j}\), and the definition of \(r_{j}\) imply that

$$\begin{aligned} r_{j}&=\frac{z_{0}^{c}-z_{j}^{c}}{B_{j}}+\mu (z_{j}-z_{j}^{c})\in {\partial }\left( \Gamma _{j}-\frac{\mu }{2}\Vert \cdot -z_{j}^{c}\Vert ^{2}+\mu \left\langle \cdot ,z_{j}^{c}-z_{j}\right\rangle \right) (z_{j}^{c})\\&={\partial }\left( \Gamma _{j}-\frac{\mu }{2}\Vert \cdot -z_{j}\Vert ^{2}\right) (z_{j}^{c}). \end{aligned}$$

Using the above inclusion, the definition of \(\eta _{j}\), the fact that \(\Gamma _{j}-\mu \Vert \cdot \Vert ^{2}/2\) is affine, and part (b), we now conclude that

$$\begin{aligned} \psi (z)-\frac{\mu }{2}\Vert z-z_{j}\Vert ^{2}&\ge \Gamma _{j}(z)-\frac{\mu }{2}\Vert z-z_{j}\Vert ^{2}=\Gamma _{j}(z_{j}^{c})-\frac{\mu }{2}\Vert z_{j}^{c}-z_{j}\Vert ^{2}+\left\langle r_{j},z-z_{j}^{c}\right\rangle \\&=\psi (z_{j})+\left\langle r_{j},z-z_{j}\right\rangle -\eta _{j}, \end{aligned}$$

for every \(z\in \mathrm{dom}\,\psi _{n}\), which is exactly the desired inclusion. The fact that \(\eta _{j}\ge 0\) follows from the above inequality with \(z=z_{j}\).

(d) It follows from parts (a)–(b) and the definition of \(\eta _{j}\) that

$$\begin{aligned} \eta _{j}&\le \Gamma _{j}(u)+\frac{1}{2B_{j}}\Vert u-z_{0}\Vert ^{2}-\psi (z_{j})\\&=\frac{\mu }{2}\Vert z_{j}-z_{j}^{c}\Vert ^{2}-\frac{1}{B_{j}}\left\langle z_{0}-z_{j}^{c},z_{j}-z_{j}^{c}\right\rangle +\frac{1}{2B_{j}}\Vert z_{j}^{c}-z_{0}\Vert ^{2}\\&=\frac{1}{2B_{j}}\Vert z_{j}-z_{0}\Vert ^{2}-\frac{1}{2B_{j}}(1+\mu B_{j})\Vert z_{j}-z_{j}^{c}\Vert ^{2}\\&=\frac{1}{2B_{j}}\Vert z_{j}-z_{0}\Vert ^{2}-\frac{1}{2B_{j}(1+\mu B_{j})}\Vert B_{j}r_{j}+z_{j}-z_{0}\Vert ^{2}. \end{aligned}$$

Multiplying both sides of the above inequality by \(2B_{j}\) yields the desired conclusion. \(\square\)

The next result presents the general iteration complexity of the algorithm, i.e. Proposition 2(a).

Proof of Proposition 2(a)

Let \(\ell\) be the first iteration where

$$\begin{aligned} \min \left\{ \frac{B_{\ell }^{2}}{4(1+\mu B_{\ell })},\frac{B_{\ell }}{2}\right\} \ge K_{\theta }^{2} \end{aligned}$$
(64)

and suppose that the R-ACG has not stopped with failure before iteration \(\ell\). We show that it must stop with success at the end of the \(\ell ^{\mathrm{th}}\) iteration. Combining the triangle inequality, the successful check in step 3 of the method, (64), and the relation \((a+b)^{2}\le 2a^{2}+2b^{2}\) for all \(a,b\in {\mathbb {R}},\) we first have that

$$\begin{aligned}&\Vert r_{\ell }\Vert ^{2}+2\eta _{\ell }\\&\quad \le \max \left\{ \frac{1+\mu B_{\ell }}{B_{\ell }^{2}},\frac{1}{2B_{\ell }}\right\} \left( \frac{1}{1+\mu B_{\ell }}\Vert B_{\ell }r_{\ell }\Vert ^{2}+4B_{\ell }\eta _{\ell }\right) \\&\quad \le \max \left\{ \frac{1+\mu B_{\ell }}{B_{\ell }^{2}},\frac{1}{2B_{\ell }}\right\} \left( \frac{2}{1+\mu B_{\ell }}\Vert B_{\ell }r_{\ell }+z_{\ell }-z_{0}\Vert ^{2}+2\Vert z_{\ell }-z_{0}\Vert ^{2}+4B_{\ell }\eta _{\ell }\right) \\&\quad \le \max \left\{ \frac{4(1+\mu B_{\ell })}{B_{\ell }^{2}},\frac{2}{B_{\ell }}\right\} \Vert z_{\ell }-z_{0}\Vert ^{2}\le \frac{1}{K_{\theta }^{2}}\Vert z_{\ell }-z_{0}\Vert ^{2}\le \theta ^{2}\Vert z_{\ell }-z_{0}\Vert ^{2}, \end{aligned}$$

and hence the method must terminate at the \(\ell ^{\mathrm{th}}\) iteration. We now bound \(\ell\) based on the requirement in (64). Solving the quadratic inequality in \(B_{\ell }\) given by the first bound of (64), it is easy to see that \(B_{\ell }\ge 4\mu K_{\theta }^{2}+2K_{\theta }\) implies (64). On the other hand, for the second condition in (64), it is immediate that \(B_{\ell }\ge 2K_{\theta }^{2}\) implies (64). In view of (18) and the previous two bounds, it follows that

$$\begin{aligned} B_{\ell }\ge \frac{1}{L}\left( 1+\sqrt{\frac{\mu }{4L}}\right) ^{2(\ell -1)}\ge 2K_{\theta }(1+2\mu K_{\theta }^{2}) \end{aligned}$$

implies (64). Using the bound \(\log (1+t)\ge t/(1+t)\) for \(t\ge 0\) and the above bound on \(\ell\), it is straightforward to see that \(\ell\) is on the same order of magnitude as in (19). \(\square\)

C Refined ICG points

This appendix presents technical results related to the refined points of the ICG methods.

The result below proves Lemma 3 from the main body of the paper.

Proof of Lemma 3

(a) Using Proposition 1(a), the definition of \({\hat{v}}\), and the definitions of \(\psi _{s}\) and \(\psi _{n}\) in (28), we have that

$$\begin{aligned} {\hat{v}}&\in \frac{1}{\lambda }\left[ \nabla \psi _{s}({\hat{y}})+{\partial }\psi _{n}({\hat{y}})+w-y\right] +\nabla f_{1}({\hat{y}})-\nabla f_{1}(w)\\&=\frac{1}{\lambda }\left[ \lambda \nabla f_{1}(w)+\lambda \nabla f_{2}({\hat{y}})+(w-y)+\lambda {\partial }h({\hat{y}})\right] +\nabla f_{1}({\hat{y}})-\nabla f_{1}(w)\\&=\nabla f_{1}({\hat{y}})+\nabla f_{2}({\hat{y}})+{\partial }h({\hat{y}}). \end{aligned}$$

(b) Using assumption (A3), Proposition 1(b), the choice of M in (28), and the fact that \(\Delta _{\mu }(y_{r};y,v)\le \varepsilon\), we first observe that

$$\begin{aligned}&\Vert \nabla f_{1}({\hat{y}})-\nabla f_{1}(z_{0})\Vert -L_{1}(y,z_{0})\Vert y-z_{0}\Vert \le L_{1}(y,{\hat{y}})\Vert {\hat{y}}-y\Vert \nonumber \\&\le \frac{L_{1}(y,{\hat{y}})\sqrt{2\Delta _{\mu }(y_{r};y,v)}}{\sqrt{\lambda M_{2}^{+}+1}}\le \frac{\theta L_{1}(y,{\hat{y}})}{\sqrt{\lambda M_{2}^{+}+1}}\Vert y-z_{0}\Vert . \end{aligned}$$
(65)

Using now (65), the choice of M in (28), Proposition 1(c) with \(L(\cdot ,\cdot )=\lambda L_{2}(\cdot ,\cdot )\), the fact that \(\sigma \le 1\), and the definition of \(C_{\lambda }(\cdot ,\cdot )\), we conclude that

$$\begin{aligned} \Vert {\hat{v}}\Vert&\le \frac{1}{\lambda }\Vert v_{r}\Vert +\frac{1}{\lambda }\Vert y-z_{0}\Vert +\Vert \nabla f_{1}({\hat{y}})-\nabla f_{1}(z_{0})\Vert \\&\le \left[ L_{1}(y,z_{0})+\frac{1+\theta }{\lambda }+\frac{\theta \left[ \lambda M_{2}^{+}+1+\lambda L_{1}(y,{\hat{y}})+\lambda L_{2}(y,{\hat{y}})\right] }{\lambda \sqrt{\lambda M_{2}^{+}+1}}\right] \Vert y-z_{0}\Vert \\&\le \left[ L_{1}(y,z_{0})+\frac{2+\theta C_{\lambda }(y,{\hat{y}})}{\lambda }\right] \Vert y-z_{0}\Vert . \end{aligned}$$

\(\square\)

D Spectral functions

This section presents some results about spectral functions as well as the proof of Proposition 6. It is assumed that the reader is familiar with the key quantities given in Sect. 4.1 (e.g., see (40) and (41)).

We first state two well-known results [2, 11] about spectral functions.

Lemma 21

Let \(\Psi =\Psi ^{\mathcal{V}}\circ \sigma\) for some absolutely symmetric function \(\Psi ^{\mathcal{V}}:{\mathbb {R}}^{r}\mapsto {\mathbb {R}}\). Then, the following properties hold:

  1. (a)

    \(\Psi ^{*}=(\Psi ^{\mathcal{V}}\circ \sigma )^{*}=(\Psi ^{\mathcal{V}})^{*}\circ \sigma\);

  2. (b)

    \(\nabla \Psi =(\nabla \Psi ^{\mathcal{V}})\circ \sigma\).
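As a standard instance of part (a): taking \(\Psi ^{\mathcal{V}}=\Vert \cdot \Vert _{1}\), so that \(\Psi =\Vert \cdot \Vert _{*}\) is the nuclear norm, and recalling that \((\Vert \cdot \Vert _{1})^{*}\) is the indicator of the \(\ell _{\infty }\) unit ball, we obtain

$$\begin{aligned} (\Vert \cdot \Vert _{*})^{*}=(\Vert \cdot \Vert _{1})^{*}\circ \sigma =\delta _{\{\Vert \cdot \Vert _{\infty }\le 1\}}\circ \sigma , \end{aligned}$$

that is, the conjugate of the nuclear norm is the indicator of the operator-norm unit ball.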

Lemma 22

Let \((\Psi ,\Psi ^{\mathcal{V}})\) be as in Lemma 21, the pair \((S,Z)\in \mathcal{Z}\times \mathrm{dom}\,\Psi\) be fixed, and the decomposition \(S=P[\mathrm{dg}\,\sigma (S)]Q^{*}\) be an SVD of S, for some \((P,Q)\in \mathcal{U}^{m}\times \mathcal{U}^{n}\). If \(\Psi \in \overline{\mathrm{Conv}}\ {\mathbb {R}}^{m\times n}\) and \(\Psi ^{\mathcal{V}}\in \overline{\mathrm{Conv}}\ {\mathbb {R}}^{r}\), then for every \(M>0\), we have

$$\begin{aligned} S\in {\partial }\left( \Psi +\frac{M}{2}\Vert \cdot \Vert _{F}^{2}\right) (Z)\iff {\left\{ \begin{array}{ll} \sigma (S)\in {\partial }\left( \Psi ^{\mathcal{V}}+\frac{M}{2}\Vert \cdot \Vert ^{2}\right) (\sigma (Z)),\\ Z=P[\mathrm{dg}\,\sigma (Z)]Q^{*}. \end{array}\right. } \end{aligned}$$

We now present a new result about spectral functions.

Theorem 23

Let \((\Psi ,\Psi ^{\mathcal{V}})\) be as in Lemma 21 and the point \(Z\in {\mathbb {R}}^{m\times n}\) be such that \(\sigma (Z)\in \mathrm{dom}\,\Psi ^{\mathcal{V}}\). Then for every \(\varepsilon \ge 0\), we have \(S\in {\partial }_{\varepsilon }\Psi (Z)\) if and only if \(\sigma (S)\in {\partial }_{\varepsilon (S)}\Psi ^{\mathcal{V}}(\sigma (Z))\), where

$$\begin{aligned} \varepsilon (S):=\varepsilon -\left[ \left\langle \sigma (Z),\sigma (S)\right\rangle -\left\langle Z,S\right\rangle \right] \ge 0. \end{aligned}$$
(66)

Moreover, if S and Z have a simultaneous SVD, then \(\varepsilon (S)=\varepsilon\).

Proof

Using Lemma 21(a), (66), and the well-known fact that \(S\in {\partial }_{\varepsilon }\Psi (Z)\) if and only if \(\varepsilon \ge \Psi (Z)+\Psi ^{*}(S)-\left\langle Z,S\right\rangle\), we have that \(S\in {\partial }_{\varepsilon }\Psi (Z)\) if and only if

$$\begin{aligned} \varepsilon (S)&=\varepsilon -\left[ \left\langle \sigma (Z),\sigma (S)\right\rangle -\left\langle Z,S\right\rangle \right] \\&\ge \Psi (Z)+\Psi ^{*}(S)-\left\langle Z,S\right\rangle -\left[ \left\langle \sigma (Z),\sigma (S)\right\rangle -\left\langle Z,S\right\rangle \right] \\&=\Psi ^{\mathcal{V}}(\sigma (Z))+(\Psi ^{\mathcal{V}})^{*}(\sigma (S))-\left\langle \sigma (Z),\sigma (S)\right\rangle , \end{aligned}$$

or, equivalently, \(\sigma (S)\in {\partial }_{\varepsilon (S)}\Psi ^{\mathcal{V}}(\sigma (Z))\) and \(\varepsilon (S)\ge 0\). To show that the existence of a simultaneous SVD of S and Z implies \(\varepsilon (S)=\varepsilon\), it suffices to show that \(\langle \sigma (S),\sigma (Z)\rangle =\langle S,Z\rangle\). Indeed, if \(S=P[\mathrm{dg}\,\sigma (S)]Q^{*}\) and \(Z=P[\mathrm{dg}\,\sigma (Z)]Q^{*}\), for some \((P,Q)\in \mathcal{U}^{m}\times \mathcal{U}^{n}\), then we have

$$\begin{aligned} \langle S,Z\rangle =\langle \mathrm{dg}\,\sigma (S),P^{*}P[\mathrm{dg}\,\sigma (Z)]Q^{*}Q\rangle =\langle \mathrm{dg}\,\sigma (S),\mathrm{dg}\,\sigma (Z)\rangle =\langle \sigma (S),\sigma (Z)\rangle . \end{aligned}$$

\(\square\)
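The simultaneous-SVD identity \(\langle S,Z\rangle =\langle \sigma (S),\sigma (Z)\rangle\) used above, together with von Neumann's trace inequality \(\langle S,Z\rangle \le \langle \sigma (S),\sigma (Z)\rangle\) (which shows \(\varepsilon (S)\le \varepsilon\) in general), can be checked numerically; the NumPy sketch below uses randomly generated matrices and is purely illustrative.

```python
# A hedged numerical check: matrices with a simultaneous SVD satisfy
# <S, Z> = <sigma(S), sigma(Z)> (so eps(S) = eps), while a generic matrix G
# satisfies von Neumann's inequality <G, Z> <= <sigma(G), sigma(Z)>.
import numpy as np

rng = np.random.default_rng(2)
m, n = 8, 5
P, _ = np.linalg.qr(rng.standard_normal((m, m)))   # left singular factor
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # right singular factor
sS = np.sort(rng.random(n))[::-1]                  # descending singular values
sZ = np.sort(rng.random(n))[::-1]
S = P[:, :n] @ np.diag(sS) @ Q.T                   # simultaneous SVD with Z
Z = P[:, :n] @ np.diag(sZ) @ Q.T

def inner(A, B): return float(np.sum(A * B))       # trace inner product

assert abs(inner(S, Z) - sS @ sZ) < 1e-10          # eps(S) = eps in this case

G = rng.standard_normal((m, n))                    # no simultaneous SVD with Z
assert inner(G, Z) <= np.linalg.svd(G, compute_uv=False) @ sZ + 1e-10
```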

Cite this article

Kong, W., Monteiro, R.D.C. Accelerated inexact composite gradient methods for nonconvex spectral optimization problems. Comput Optim Appl 82, 673–715 (2022). https://doi.org/10.1007/s10589-022-00377-9
