Abstract
We propose a unifying algorithm for non-smooth non-convex optimization. The algorithm approximates the objective function by a convex model function and finds an approximate (Bregman) proximal point of the convex model. This approximate minimizer of the model function yields a descent direction, along which the next iterate is found. Complemented with an Armijo-like line search strategy, we obtain a flexible algorithm for which we prove (subsequential) convergence to a stationary point under weak assumptions on the growth of the model function error. Special instances of the algorithm with a Euclidean distance function are, for example, gradient descent, forward–backward splitting, and ProxDescent, without the common requirement of a "Lipschitz continuous gradient". In addition, we consider a broad class of Bregman distance functions (generated by Legendre functions), replacing the Euclidean distance. The algorithm has a wide range of applications, including many linear and nonlinear inverse problems in signal/image processing and machine learning.
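The scheme described in the abstract can be sketched in its simplest instance: with a linear model function and the Euclidean distance, the model's proximal point reduces to a gradient step, and the abstract algorithm recovers gradient descent with an Armijo-like backtracking line search. The sketch below is a minimal illustration under these assumptions, not the paper's full Bregman framework; the function names and parameters are illustrative.

```python
import numpy as np

def model_descent(f, grad_f, x0, step=1.0, gamma=0.5, delta=1e-4,
                  max_iter=200, tol=1e-8):
    """Sketch of the abstract scheme with the simplest model choice:
    model f_x(y) = f(x) + <grad f(x), y - x>, Euclidean distance.
    This instance coincides with gradient descent plus Armijo search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        # Proximal point of the linear model w.r.t. (1/(2*step))||y - x||^2
        y = x - step * g          # approximate model minimizer
        d = y - x                 # descent direction
        if np.linalg.norm(d) < tol:
            break
        # Armijo-like backtracking along the descent direction d
        t = 1.0
        while f(x + t * d) > f(x) + delta * t * np.dot(g, d):
            t *= gamma
        x = x + t * d
    return x
```

Replacing the linear model by a partial linearization (keeping a non-smooth term exact) would yield forward–backward splitting, and replacing the Euclidean distance by a Bregman distance gives the general algorithm of the paper.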
Fig. 1 (image)
Fig. 2 (image)
Notes
It is often easy to find a feasible point. Of course, there are cases where finding a feasible initialization is itself a problem. We assume that the user provides a feasible initial point.
Note that \(\inf _{k}\eta _{{k}}>0\) is equivalent to \(\liminf _{k}\eta _{{k}}>0\), as we assume \(\eta _{{k}}>0\) for all \({k}\in \mathbb {N}\).
The example is not meant to be practically meaningful, nor is the model function claimed to be the algorithmically best choice; it is intended to demonstrate the flexibility and problem adaptivity of our framework.
For very specific instances, a recent line of research proposes to lift the problem to the space of low-rank matrices and then use convex relaxation and computationally intensive conic programming, which is only applicable to small-dimensional problems; see, e.g., [35] for blind deconvolution.
Strictly speaking, \({\mathcal {Z}}_3\) should be the nonnegative orthant for sparse NMF. But this does not change anything in our discussion, since computing the Euclidean proximal mapping of the \(\ell _1\) norm restricted to the nonnegative orthant is easy.
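The easiness claimed in the note above can be made concrete: the Euclidean proximal mapping of \(\lambda \Vert \cdot \Vert _1\) restricted to the nonnegative orthant has a closed form, a one-sided soft threshold. The helper name below is illustrative.

```python
import numpy as np

def prox_l1_nonneg(x, lam):
    """Euclidean proximal mapping of lam*||.||_1 restricted to the
    nonnegative orthant:
        argmin_{y >= 0}  lam*||y||_1 + 0.5*||y - x||^2.
    Componentwise the optimality condition gives y_i = max(x_i - lam, 0)."""
    return np.maximum(x - lam, 0.0)
```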
References
Drusvyatskiy, D., Ioffe, A.D., Lewis, A.S.: Nonsmooth optimization using Taylor-like models: error bounds, convergence, and termination criteria. arXiv preprint arXiv:1610.03446 (2016)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)
Lewis, A., Wright, S.: A proximal method for composite minimization. Math. Program. 158(1–2), 501–546 (2016)
Drusvyatskiy, D., Lewis, A.S.: Error bounds, quadratic growth, and linear convergence of proximal methods. arXiv preprint arXiv:1602.06661 (2016)
Noll, D., Prot, O., Apkarian, P.: A proximity control algorithm to minimize nonsmooth and nonconvex functions. Pac. J. Optim. 4(3), 571–604 (2008)
Noll, D.: Convergence of non-smooth descent methods using the Kurdyka–Łojasiewicz inequality. J. Optim. Theory Appl. 160(2), 553–572 (2013)
Bonettini, S., Loris, I., Porta, F., Prato, M.: Variable metric inexact line-search based methods for nonsmooth optimization. SIAM J. Optim. 26(2), 891–921 (2016)
Burg, J.: The relationship between maximum entropy spectra and maximum likelihood spectra. Geophysics 37(2), 375–376 (1972)
Bauschke, H., Borwein, J.: Legendre functions and the method of random Bregman projections. J. Convex Anal. 4(1), 27–67 (1997)
Bregman, L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3), 200–217 (1967)
Bauschke, H., Borwein, J., Combettes, P.: Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces. Commun. Contemp. Math. 3(4), 615–647 (2001)
Chen, G., Teboulle, M.: Convergence analysis of a proximal-like minimization algorithm using Bregman functions. SIAM J. Optim. 3(3), 538–543 (1993)
Bauschke, H., Borwein, J., Combettes, P.: Bregman monotone optimization algorithms. SIAM J. Control Optim. 42(2), 596–636 (2003)
Bauschke, H.H., Bolte, J., Teboulle, M.: A descent lemma beyond Lipschitz gradient continuity: first-order methods revisited and applications. Math. Oper. Res. 42(2), 330–348 (2017)
Nguyen, Q.: Forward-backward splitting with Bregman distances. Vietnam J. Math. 45(3), 519–539 (2017)
Attouch, H., Bolte, J., Svaiter, B.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137(1–2), 91–129 (2013). https://doi.org/10.1007/s10107-011-0484-9
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014). https://doi.org/10.1007/s10107-013-0701-9
Marquardt, D.: An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 11(2), 431–441 (1963)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York (2011)
Rockafellar, R.T., Wets, R.B.: Variational Analysis, vol. 317. Springer, Heidelberg (1998). https://doi.org/10.1007/978-3-642-02431-3
Ochs, P., Dosovitskiy, A., Brox, T., Pock, T.: On iteratively reweighted algorithms for nonsmooth nonconvex optimization in computer vision. SIAM J. Imaging Sci. 8(1), 331–372 (2015)
Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A.: Robust Statistics: The Approach Based on Influence Functions. MIT Press, Cambridge (1986)
Chambolle, A.: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20, 89–97 (2004)
Combettes, P., Dũng, D., Vũ, B.: Dualization of signal recovery problems. Set-Valued Var. Anal. 18(3–4), 373–404 (2010)
Bertero, M., Boccacci, P., Desiderà, G., Vicidomini, G.: Image deblurring with Poisson data: from cells to galaxies. Inverse Probl. 25(12), 123006 (2009)
Zanella, R., Boccacci, P., Zanni, L., Bertero, M.: Efficient gradient projection methods for edge-preserving removal of Poisson noise. Inverse Probl. 25(4) (2009)
Vardi, Y., Shepp, L., Kaufman, L.: A statistical model for positron emission tomography. J. Am. Stat. Assoc. 80(389), 8–20 (1985)
Geman, S., Geman, D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741 (1984)
Blake, A., Zisserman, A.: Visual Reconstruction. MIT Press, Cambridge (1987)
Mumford, D., Shah, J.: Optimal approximations by piecewise smooth functions and associated variational problems. Commun. Pure Appl. Math. 42, 577–685 (1989)
Cichocki, A., Zdunek, R., Phan, A., Amari, S.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation. Wiley, New York (2009)
Chaudhuri, S., Velmurugan, R., Rameshan, R.: Blind Image Deconvolution. Springer, New York (2014)
Starck, J.L., Murtagh, F., Fadili, J.: Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity, 2nd edn. Cambridge University Press, Cambridge (2015)
Xu, Y., Li, Z., Yang, J., Zhang, D.: A survey of dictionary learning algorithms for face recognition. IEEE Access 5, 8502–8514 (2017). https://doi.org/10.1109/ACCESS.2017.2695239
Ahmed, A., Recht, B., Romberg, J.: Blind deconvolution using convex programming. IEEE Trans. Inf. Theory 60(3), 1711–1732 (2014)
Lee, D., Seung, H.: Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999)
Michelot, C.: A finite algorithm for finding the projection of a point onto the canonical simplex of \(\mathbb{R}^n\). J. Optim. Theory Appl. 50, 195–200 (1986)
Olshausen, B., Field, D.: Sparse coding with an overcomplete basis set: a strategy employed by V1? Vis. Res. 37, 3311–3325 (1997)
Hoyer, P.: Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5, 1457–1469 (2004)
Recht, B., Fazel, M., Parrilo, P.A.: Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Applied Optimization, vol. 87. Kluwer Academic Publishers, Boston (2004)
Ochs, P., Chen, Y., Brox, T., Pock, T.: iPiano: inertial proximal algorithm for non-convex optimization. SIAM J. Imaging Sci. 7(2), 1388–1419 (2014)
Liang, J., Fadili, J., Peyré, G.: A multi-step inertial forward–backward splitting method for non-convex optimization. arXiv preprint arXiv:1606.02118 (2016)
Wen, B., Chen, X., Pong, T.: Linear convergence of proximal gradient algorithm with extrapolation for a class of nonconvex nonsmooth minimization problems. SIAM J. Optim. 27(1), 124–145 (2017)
Drusvyatskiy, D., Kempton, C.: An accelerated algorithm for minimizing convex compositions. arXiv preprint arXiv:1605.00125 (2016)
Kurdyka, K.: On gradients of functions definable in o-minimal structures. Annales de l’institut Fourier 48(3), 769–783 (1998)
Łojasiewicz, S.: Une propriété topologique des sous-ensembles analytiques réels. In: Les Équations aux Dérivées Partielles, pp. 87–89. Éditions du centre National de la Recherche Scientifique, Paris (1963)
Łojasiewicz, S.: Sur la géométrie semi- et sous- analytique. Annales de l’institut Fourier 43(5), 1575–1595 (1993)
Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4), 1205–1223 (2006). https://doi.org/10.1137/050644641
Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2), 556–572 (2007)
Acknowledgements
P. Ochs acknowledges funding by the German Research Foundation (DFG Grant OC 150/1-1).
Additional information
Communicated by Jérôme Bolte.
Cite this article
Ochs, P., Fadili, J. & Brox, T. Non-smooth Non-convex Bregman Minimization: Unification and New Algorithms. J Optim Theory Appl 181, 244–278 (2019). https://doi.org/10.1007/s10957-018-01452-0
Keywords
- Bregman minimization
- Legendre function
- Model function
- Growth function
- Non-convex non-smooth
- Abstract algorithm