Abstract
We introduce and analyze BPALM and A-BPALM, two multi-block proximal alternating linearized minimization algorithms using Bregman distances for solving structured nonconvex problems. The objective function is the sum of a multi-block relatively smooth function (i.e., relatively smooth in each block with all other blocks fixed) and block separable (nonsmooth) nonconvex functions. The sequences generated by our algorithms are subsequentially convergent to critical points of the objective function, and they are globally convergent under the KL inequality assumption. Moreover, the rate of convergence is further analyzed for functions satisfying the Łojasiewicz gradient inequality. We apply this framework to orthogonal nonnegative matrix factorization (ONMF), which satisfies all of our assumptions and whose subproblems are solved in closed form, and we report preliminary numerical results.
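For orientation, the problem class and the basic block update described above can be sketched as follows; the symbols \(f\), \(g_i\), \(h_i\), and \(\gamma_i\) are generic notation chosen here and are not necessarily the paper's:

```latex
% Multi-block composite model: f is relatively smooth in each block x_i
% (with the other blocks fixed, w.r.t. a block kernel h_i); each g_i is
% proper, lower semicontinuous, and possibly nonconvex.
\min_{x = (x_1,\ldots,x_N)} \; \varphi(x) := f(x_1,\ldots,x_N) + \sum_{i=1}^{N} g_i(x_i)

% Generic Bregman proximal linearized update of block i, with step size
% \gamma_i and Bregman distance
% D_{h_i}(u,v) = h_i(u) - h_i(v) - \langle \nabla h_i(v),\, u - v \rangle:
x_i^{k+1} \in \operatorname*{argmin}_{u}\;
  g_i(u)
  + \bigl\langle \nabla_{x_i} f(x_1^{k+1},\ldots,x_{i-1}^{k+1}, x_i^k, \ldots, x_N^k),\, u - x_i^k \bigr\rangle
  + \tfrac{1}{\gamma_i} D_{h_i}(u, x_i^k)
```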
Notes
The code is publicly available at https://github.com/MasoudAhoo/BPALM
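The repository above contains the authors' implementation and is not reproduced here. As a rough, self-contained illustration of the alternating linearized idea in its simplest setting, the sketch below applies it to plain nonnegative matrix factorization (without the orthogonality constraint of ONMF) using the Euclidean kernel \(h=\tfrac{1}{2}\Vert \cdot \Vert ^2\), for which each Bregman proximal step reduces to a projected gradient step. The function name, step-size rule, and toy data are our own assumptions, not the paper's algorithm:

```python
import numpy as np

def bpalm_nmf_sketch(M, r, iters=500, seed=0):
    """Illustrative two-block alternating linearized minimization for
    f(X, Y) = 0.5 * ||M - X @ Y||_F**2 subject to X, Y >= 0.
    With the Euclidean kernel h = 0.5 * ||.||**2, each Bregman proximal
    step is a projected gradient step with step size 1 / L_i, where L_i
    is the Lipschitz constant of the partial gradient for block i."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X, Y = rng.random((m, r)), rng.random((r, n))
    for _ in range(iters):
        # Block X: grad_X f = (X Y - M) Y^T, with L_X = ||Y Y^T||_2.
        L_X = np.linalg.norm(Y @ Y.T, 2) + 1e-12
        X = np.maximum(X - ((X @ Y - M) @ Y.T) / L_X, 0.0)
        # Block Y: grad_Y f = X^T (X Y - M), with L_Y = ||X^T X||_2.
        L_Y = np.linalg.norm(X.T @ X, 2) + 1e-12
        Y = np.maximum(Y - (X.T @ (X @ Y - M)) / L_Y, 0.0)
    return X, Y

# Toy data: an exactly rank-2 nonnegative matrix.
rng = np.random.default_rng(1)
M = rng.random((6, 2)) @ rng.random((2, 5))
X, Y = bpalm_nmf_sketch(M, 2)
rel_err = np.linalg.norm(M - X @ Y) / np.linalg.norm(M)
```

Each block update decreases the objective (the standard PALM descent property with \(1/L\) steps), so on this toy rank-2 instance the relative error falls well below its starting value.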
References
Ahookhosh, M.: Accelerated first-order methods for large-scale convex optimization: nearly optimal complexity under strong convexity. Math. Methods Oper. Res. 89(3), 319–353 (2019)
Ahookhosh, M., Hien, L.T.K., Gillis, N., Patrinos, P.: A block inertial Bregman proximal algorithm for nonsmooth nonconvex problems with application to symmetric nonnegative matrix tri-factorization. J. Optim. Theory Appl. (2021)
Ahookhosh, M., Themelis, A., Patrinos, P.: A Bregman forward-backward linesearch algorithm for nonconvex composite optimization: superlinear convergence to nonisolated local minima. SIAM J. Optim. 31(1), 653–685 (2021)
Araújo, U., Saldanha, B., Galvão, R., Yoneyama, T., Chame, H., Visani, V.: The successive projections algorithm for variable selection in spectroscopic multicomponent analysis. Chemometr. Intell. Lab. Syst. 57(2), 65–73 (2001)
Armijo, L.: Minimization of functions having Lipschitz continuous first partial derivatives. Pac. J. Math. 16(1), 1–3 (1966)
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Alternating proximal algorithms for weakly coupled convex minimization problems. applications to dynamical games and PDE’s. J. Convex Anal. 15(3), 485 (2008)
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35(2), 438–457 (2010)
Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Program. 137(1), 91–129 (2013)
Attouch, H., Redont, P., Soubeyran, A.: A new class of alternating proximal minimization algorithms with costs-to-move. SIAM J. Optim. 18(3), 1061–1081 (2007)
Attouch, H., Soubeyran, A.: Inertia and reactivity in decision making as cognitive variational inequalities. J. Conv. Anal. 13(2), 207 (2006)
Auslender, A.: Optimisation méthodes numériques. Mason, Paris (1976)
Bauschke, H.H., Bolte, J., Chen, J., Teboulle, M., Wang, X.: On linear convergence of non-Euclidean gradient methods without strong convexity and Lipschitz gradient continuity. J. Optim. Theory Appl. 182, 1068–1087 (2019)
Bauschke, H.H., Bolte, J., Teboulle, M.: A descent lemma beyond Lipschitz gradient continuity: first-order methods revisited and applications. Math. Oper. Res. 42(2), 330–348 (2016)
Bauschke, H.H., Dao, M.N., Lindstrom, S.B.: Regularizing with Bregman–Moreau envelopes. SIAM J. Optim. 28(4), 3208–3228 (2018)
Beck, A., Pauwels, E., Sabach, S.: The cyclic block conditional gradient method for convex optimization problems. SIAM J. Optim. 25(4), 2024–2049 (2015)
Beck, A., Sabach, S., Teboulle, M.: An alternating semiproximal method for nonconvex regularized structured total least squares problems. SIAM J. Matrix Anal. Appl. 37(3), 1129–1150 (2016)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Beck, A., Tetruashvili, L.: On the convergence of block coordinate descent type methods. SIAM J. Optim. 23(4), 2037–2060 (2013)
Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, Inc., Hoboken (1989)
Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4), 1205–1223 (2007)
Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2), 556–572 (2007)
Bolte, J., Daniilidis, A., Ley, O., Mazet, L.: Characterizations of Łojasiewicz inequalities: subgradient flows, talweg, convexity. Trans. Am. Math. Soc. 362(6), 3319–3363 (2010)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)
Bolte, J., Sabach, S., Teboulle, M., Vaisbourd, Y.: First order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems. SIAM J. Optim. 28(3), 2131–2151 (2018)
Boţ, R.I., Csetnek, E.R.: An inertial Tseng’s type proximal algorithm for nonsmooth and nonconvex optimization problems. J. Optim. Theory Appl. 171(2), 600–616 (2016)
Boţ, R.I., Nguyen, D.K.: The proximal alternating direction method of multipliers in the nonconvex setting: convergence analysis and rates. Math. Oper. Res. 45(2), 682–712 (2020)
Bregman, L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3), 200–217 (1967)
Chen, G., Teboulle, M.: Convergence analysis of a proximal-like minimization algorithm using Bregman functions. SIAM J. Optim. 3(3), 538–543 (1993)
Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. John Wiley & Sons, Hoboken (2009)
Combettes, P.L., Pesquet, J.C.: Stochastic quasi-Fejér block-coordinate fixed point iterations with random sweeping. SIAM J. Optim. 25(2), 1221–1248 (2015)
Van den Dries, L.: Tame Topology and o-Minimal Structures, vol. 248. Cambridge University Press, Cambridge (1998)
Fercoq, O., Bianchi, P.: A coordinate-descent primal-dual algorithm with large step size and possibly nonseparable functions. SIAM J. Optim. 29(1), 100–134 (2019)
Fu, X., Huang, K., Sidiropoulos, N.D., Ma, W.K.: Nonnegative matrix factorization for signal and data analytics: Identifiability, algorithms, and applications. IEEE Signal Process. Mag. 36(2), 59–80 (2019)
Gillis, N.: The why and how of nonnegative matrix factorization. Regular. Optim. Kernels Support Vector Mach. 12(257), 257–291 (2014)
Gillis, N., Vavasis, S.A.: Fast and robust recursive algorithms for separable nonnegative matrix factorization. IEEE Trans. Pattern Anal. Mach. Intell. 36(4), 698–714 (2013)
Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss-Seidel method under convex constraints. Oper. Res. Lett. 26(3), 127–136 (2000)
Hanzely, F., Richtárik, P.: Fastest rates for stochastic mirror descent methods. arXiv:1803.07374 (2018)
Hanzely, F., Richtárik, P., Xiao, L.: Accelerated Bregman proximal gradient methods for relatively smooth convex optimization. Comput. Optim. Appl. 1–36 (2021)
Kimura, K., Tanaka, Y., Kudo, M.: A fast hierarchical alternating least squares algorithm for orthogonal nonnegative matrix factorization. In: D. Phung, H. Li (eds.) Proceedings of the Sixth Asian Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 39, pp. 129–141. PMLR, Nha Trang City, Vietnam (2015). http://proceedings.mlr.press/v39/kimura14.html
Kurdyka, K.: On gradients of functions definable in o-minimal structures. Annales de l’institut Fourier 48(3), 769–783 (1998)
Latafat, P., Freris, N.M., Patrinos, P.: A new randomized block-coordinate primal-dual proximal algorithm for distributed optimization. IEEE Trans. Autom. Control 64(10), 4050–4065 (2019)
Latafat, P., Themelis, A., Patrinos, P.: Block-coordinate and incremental aggregated proximal gradient methods for nonsmooth nonconvex problems. Math. Program. (2021). arXiv:1906.10053
Li, Q., Zhu, Z., Tang, G., Wakin, M.B.: Provable Bregman-divergence based methods for nonconvex and non-Lipschitz problems. arXiv:1904.09712 (2019)
Łojasiewicz, S.: Une propriété topologique des sous-ensembles analytiques réels. In: Les Équations aux Dérivées Partielles, pp. 87–89 (1963)
Łojasiewicz, S.: Sur la géométrie semi- et sous- analytique. Annales de l’institut Fourier 43(5), 1575–1595 (1993)
Lu, H., Freund, R.M., Nesterov, Y.: Relatively smooth convex optimization by first-order methods, and applications. SIAM J. Optim. 28(1), 333–354 (2018)
Mukkamala, M.C., Ochs, P., Pock, T., Sabach, S.: Convex-concave backtracking for inertial Bregman proximal gradient algorithms in nonconvex optimization. SIAM J. Math. Data Sci. 2(3), 658–682 (2020)
Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2), 341–362 (2012)
Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013)
Pauca, V.P., Piper, J., Plemmons, R.J.: Nonnegative matrix factorization for spectral data analysis. Linear Algebra Appl. 416(1), 29–47 (2006)
Pock, T., Sabach, S.: Inertial proximal alternating linearized minimization (iPALM) for nonconvex and nonsmooth problems. SIAM J. Imaging Sci. 9(4), 1756–1787 (2016)
Pompili, F., Gillis, N., Absil, P.A., Glineur, F.: Two algorithms for orthogonal nonnegative matrix factorization with application to clustering. Neurocomputing 141, 15–25 (2014)
Razaviyayn, M., Hong, M., Luo, Z.Q.: A unified convergence analysis of block successive minimization methods for nonsmooth optimization. SIAM J. Optim. 23(2), 1126–1153 (2013)
Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program. 144(1–2), 1–38 (2014)
Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, vol. 317. Springer Science & Business Media, Berlin (2011)
Choi, S.: Algorithms for orthogonal nonnegative matrix factorization. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1828–1832 (2008)
Shefi, R., Teboulle, M.: On the rate of convergence of the proximal alternating linearized minimization algorithm for convex problems. EURO J. Comput. Optim. 4(1), 27–46 (2016)
Tam, M.K.: Regularity properties of non-negative sparsity sets. J. Math. Anal. Appl. 447(2), 758–777 (2017)
Teboulle, M.: A simplified view of first order methods for optimization. Math. Program. 170(1), 67–96 (2018)
Themelis, A., Ahookhosh, M., Patrinos, P.: On the acceleration of forward-backward splitting via an inexact Newton method. In: Luke, R., Bauschke, H., Burachik, R. (eds.) Splitting Algorithms, Modern Operator Theory, and Applications, pp. 363–412. Springer, Berlin (2019)
Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3), 475–494 (2001)
Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117(1–2), 387–423 (2009)
Wang, X., Yuan, X., Zeng, S., Zhang, J., Zhou, J.: Block coordinate proximal gradient method for nonconvex optimization problems: convergence analysis. http://www.optimization-online.org/DB_HTML/2018/04/6573.html (2018)
Acknowledgements
We would like to thank the anonymous reviewers for their insightful comments that helped improve the paper; in particular, one of the reviewers gave a suggestion that leads to the kernel function in Proposition 5.1. The first author is grateful to Andreas Themelis for his useful comments and discussions on the paper. MA and PP acknowledge the support by the Research Foundation Flanders (FWO) research projects G086518N and G086318N; Research Council KU Leuven C1 project No. C14/18/068; Fonds de la Recherche Scientifique - FNRS and the Fonds Wetenschappelijk Onderzoek - Vlaanderen (FWO) under EOS project no 30468160 (SeLMA). LTKH and NG also acknowledge the support by the European Research Council (ERC starting grant no 679515).
Appendix
Lemma 8.1
Let all assumptions of Theorem 4.4 be valid. Then, the following assertions hold:
(i) \(\lim _{k\rightarrow \infty }\mathrm{dist}\left( \varvec{x}^k,\omega (\varvec{x}^0)\right) =0\);
(ii) \(\omega (\varvec{x}^0)\) is a nonempty, compact, and connected set;
(iii) the objective function \(\varphi\) is finite and constant on \(\omega (\varvec{x}^0)\).
Proof
Lemma 8.1(i) is a direct consequence of Theorem 4.4, while Lemma 8.1(ii) and Lemma 8.1(iii) can be proved in the same way as [23, Lemma 5(iii)-(iv)]. \(\square\)
Lemma 8.2
Let all assumptions of Theorem 4.5 be satisfied. If \(\varphi (\varvec{x}^k)>\varphi ^\star\), then there exist \(\varepsilon ,\eta >0\) and a desingularizing function \(\psi\) such that
Proof
From Lemma 8.1(ii), the set of limit points \(\omega (\varvec{x}^0)\) of \({(\varvec{x}^k)_{{k\in \mathbb {N}}}}\) is nonempty and compact, and \(\varphi\) is finite and constant on \(\omega (\varvec{x}^0)\) due to Lemma 8.1(iii). Moreover, \(\varphi (\varvec{x}^k)>\varphi ^\star\) and the sequence \({(\varphi (\varvec{x}^k))_{{k\in \mathbb {N}}}}\) is decreasing (Proposition 4.1(i)), i.e., there exist \(\eta >0\) and \(k_1\in \mathbb {N}\) such that \(\varphi ^\star<\varphi (\varvec{x}^k)<\varphi ^\star +\eta\) for all \(k\ge k_1\). For \(\varepsilon >0\), Lemma 8.1(i) implies that there exists \(k_2\in \mathbb {N}\) such that \(\mathrm{dist}(\varvec{x}^k,\omega (\varvec{x}^0))<\varepsilon\) for all \(k\ge k_2\). Setting \(k_0:=\max \{k_1,k_2\}\) and invoking Fact 2.2, there exist \(\varepsilon , \eta >0\) and a desingularizing function \(\psi\) such that for any element in
the inequality (8.1) is valid. \(\square\)
We next present the proof of Theorem 4.7.
Proof of Theorem 4.7. The proof has two key parts.
In the first part, we show that there exist \(c>0\) and \(\overline{k}\in \mathbb {N}\) such that for all \(k\ge \overline{k}\) the following inequalities hold for \(i=1,\ldots ,N\):
Let \(\varepsilon >0\) be as described in (4.16), and let \(\tilde{k}\in \mathbb {N}\) be such that \(\varvec{x}^k\in \mathbf{B}(\varvec{x}^\star ; \varepsilon )\) for all \(k\ge \tilde{k}\). By the definitions of \(a_k\) and \(b_k\) in (4.15) and using (4.14), we get \(a_{k+1}\le \tfrac{1}{2} a_k+b_k\) for all \(k\ge \tilde{k}\). Since \({(\varphi (\varvec{x}^k))_{{k\in \mathbb {N}}}}\) is nonincreasing,
Together with the arithmetic and quadratic mean inequalities, \(\psi ({\mathcal {S}}_{k})\le \psi ({\mathcal {S}}_{k-1})\), and Proposition 4.1(i), this leads to
On the other hand, for \(i=1,\ldots ,N\), we have
This inequality, together with (8.3), yields
leading to
where \(c:=\sqrt{\tfrac{2N}{\rho }} \max \left\{ \tfrac{1}{\sqrt{\sigma _1}},\ldots ,\tfrac{1}{\sqrt{\sigma _N}}\right\} +\widehat{c} N\) and \(\psi (s):=\frac{\kappa }{1-\theta } s^{1-\theta }\). Let us consider the nonlinear equation
which has a solution at \({\mathcal {S}}_{k-1}=\left( \tfrac{1-\theta }{\kappa }\right) ^{\tfrac{2}{1-2\theta }}\). From the monotonicity of \({\mathcal {S}}_k\), there exists \(\hat{k}\in \mathbb {N}\) such that (8.4) holds for all \(k\ge \hat{k}\) and
We now consider two cases: (a) \(\theta \in (0,1/2]\); (b) \(\theta \in (1/2,1)\). In Case (a), if \(\theta \in (0,1/2)\), then \(\psi ({\mathcal {S}}_{k-1})\le \sqrt{{\mathcal {S}}_{k-1}}\). If \(\theta =1/2\), then \(\psi ({\mathcal {S}}_{k-1})=\tfrac{\kappa }{1-\theta }\sqrt{{\mathcal {S}}_{k-1}}\), i.e.,
Therefore, it holds that \(\max \{\sqrt{{\mathcal {S}}_{k-1}},\psi ({\mathcal {S}}_{k-1})\}\le \max \{1,\tfrac{\kappa }{1-\theta }\} \sqrt{{\mathcal {S}}_{k-1}}\). In Case (b), we have that
i.e., \(\max \{\sqrt{{\mathcal {S}}_{k-1}},\psi ({\mathcal {S}}_{k-1})\}= \tfrac{\kappa }{1-\theta } {\mathcal {S}}_{k-1}^{1-\theta }\). Then, it follows from (8.4) that (8.2) holds for all \(k\ge \overline{k}:=\max \{\tilde{k}, \hat{k}\}\).
In the second part of the proof, we show the assertions in the statement of the theorem. For \(({\mathcal {G}}_1^{k},\ldots ,{\mathcal {G}}_N^{k})\in \partial \varphi (\varvec{x}^k)\) as defined in Proposition 4.3, by Proposition 4.1(i), we infer
with \(\widetilde{c}:=\frac{\rho }{2N\overline{c}^2\kappa ^2}\min \{\sigma _1,\ldots ,\sigma _N\}\) and for all \(k\ge \overline{k}\). Hence, all assumptions of Fact 4.6 hold with \(\alpha =2\theta\). Therefore, the result follows from this fact and (8.2). \(\square\)
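For context, the case split above feeds into the classical rate trichotomy under the Łojasiewicz gradient inequality with exponent \(\theta\). The summary below is the standard statement from the Attouch–Bolte line of work and paraphrases, rather than quotes, Theorem 4.7:

```latex
% Convergence rates under the Łojasiewicz inequality with exponent \theta:
\theta = 0:
  \quad (x^k) \text{ converges in finitely many steps;} \\
\theta \in (0,\tfrac{1}{2}]:
  \quad \Vert x^k - x^\star \Vert \le c\, q^{k} \text{ for some } q \in (0,1)
  \text{ (linear rate);} \\
\theta \in (\tfrac{1}{2},1):
  \quad \Vert x^k - x^\star \Vert \le c\, k^{-\frac{1-\theta}{2\theta-1}}
  \text{ (sublinear rate).}
```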
Cite this article
Ahookhosh, M., Hien, L.T.K., Gillis, N. et al. Multi-block Bregman proximal alternating linearized minimization and its application to orthogonal nonnegative matrix factorization. Comput Optim Appl 79, 681–715 (2021). https://doi.org/10.1007/s10589-021-00286-3
Keywords
- Nonsmooth nonconvex optimization
- Proximal alternating linearized minimization
- Bregman distance
- Multi-block relative smoothness
- KL inequality
- Orthogonal nonnegative matrix factorization