Abstract
In this paper, we propose a hybrid Bregman alternating direction method of multipliers for solving linearly constrained difference-of-convex problems whose objective can be written as the sum of a smooth convex function with Lipschitz gradient, a proper closed convex function and a continuous concave function. At each iteration, we choose either a subgradient step or a proximal step to evaluate the concave part. Moreover, an extrapolation technique is utilized in computing the nonsmooth convex part. We prove that the sequence generated by the proposed method converges to a critical point of the considered problem under the assumption that the potential function is a Kurdyka–Łojasiewicz function. One notable advantage of the proposed method is that convergence can be guaranteed without Lipschitz continuity of the gradient of the concave part. Preliminary numerical experiments show the efficiency of the proposed method.
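To make the algorithmic structure described above concrete, the following is a minimal Python sketch of one iteration of such a scheme, instantiated on the \(\ell_1\)–\(\ell_2\) model treated in the appendix (\(f_1(x)=\rho \Vert x\Vert _1\), \(f_2(x)=\rho \Vert x\Vert _2\), \(g(y)=\frac{1}{2}\Vert Hy-y^0\Vert ^2\), constraint \(x=Ky\)). It is a sketch under stated assumptions, not the paper's exact method: only the subgradient variant for the concave part is shown (a proximal variant could instead use the closed-form \(\ell_1\)–\(\ell_2\) proximal operator of Lou and Yan), the proximal term centered at the extrapolated point and the parameters beta, mu, theta are illustrative placeholders, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1 (componentwise soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def hybrid_dc_admm(K, H, y0, rho=0.1, beta=1.0, mu=1.0, theta=0.3, n_iter=300):
    # Illustrative iteration for
    #   min_{x,y} rho*||x||_1 - rho*||x||_2 + 0.5*||H y - y0||^2  s.t.  x = K y,
    # with a subgradient step for the concave part -rho*||x||_2 and
    # extrapolation in the x-update. Parameter choices are placeholders.
    n, m = K.shape
    x = np.zeros(n); x_prev = np.zeros(n)
    y = np.zeros(m); lam = np.zeros(n)
    HtH, Hty0, KtK = H.T @ H, H.T @ y0, K.T @ K
    for _ in range(n_iter):
        # Subgradient of the concave part: xi in rho * d||.||_2(x); 0 is valid at x = 0.
        nx = np.linalg.norm(x)
        xi = rho * x / nx if nx > 0 else np.zeros(n)
        # Extrapolated point for the nonsmooth convex part.
        u = x + theta * (x - x_prev)
        x_prev = x.copy()
        # x-update: prox of rho*||.||_1 after linearizing -rho*||.||_2 at x
        # and adding a proximal term centered at the extrapolated point u.
        c = (beta * (K @ y) + mu * u - lam + xi) / (beta + mu)
        x = soft_threshold(c, rho / (beta + mu))
        # y-update: the smooth quadratic subproblem is solved exactly.
        y = np.linalg.solve(HtH + beta * KtK, Hty0 + K.T @ (lam + beta * x))
        # Dual (multiplier) update for the constraint x - K y = 0.
        lam = lam + beta * (x - K @ y)
    return x, y, lam
```

The paper's conditions on the penalty parameter, the Bregman kernels \(\phi ,\psi \) and the extrapolation parameters, which this sketch does not enforce, are what actually guarantee convergence to a critical point.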
References
An, L.T.H., Belghiti, M.T., Tao, P.D.: A new efficient algorithm based on DC programming and DCA for clustering. J. Global Optim. 37(4), 593–608 (2007)
Attouch, H., Redont, P., Soubeyran, A.: A new class of alternating proximal minimization algorithms with costs-to-move. SIAM J. Optim. 18(3), 1061–1081 (2007)
Attouch, H., Bolte, J.: On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program. 116(1–2), 5–16 (2009)
Bai, M.R., Zhang, X.J., Shao, Q.Q.: Adaptive correction procedure for TVL1 image deblurring under impulse noise. Inverse Probl. 32(8), 085004 (2016)
Banert, S., Bot, R.I.: A general double-proximal gradient algorithm for d.c. programming. Math. Program. (2018). https://doi.org/10.1007/s10107-018-1292-2
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York (2011)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Beck, A., Teboulle, M.: Fast gradient-based algorithms for constrained total variation image denoising and deblurring problem. IEEE Trans. Image Process. 18(11), 2419–2434 (2009)
Becker, S., Bobin, J., Candès, E.: NESTA: a fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci. 4(1), 1–39 (2011)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)
Bot, R.I., Csetnek, E.R.: An inertial Tseng’s type proximal algorithm for nonsmooth and nonconvex optimization problems. J. Optim. Theory Appl. 171(2), 600–616 (2016)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Bredies, K., Lorenz, D.A., Reiterer, S.: Minimization of nonsmooth, nonconvex functionals by iterative thresholding. J. Optim. Theory Appl. 165(1), 78–112 (2015)
Cai, J., Chan, R.H., Shen, L., Shen, Z.: Convergence analysis of tight framelet approach for missing data recovery. Adv. Comput. Math. 31(1), 87–113 (2009)
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992)
Eckstein, J., Yao, W.: Relative-error approximate versions of Douglas–Rachford splitting and special cases of the ADMM. Math. Program. 170(2), 417–444 (2018)
Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 2, 17–40 (1976)
Gabay, D.: Applications of the method of multipliers to variational inequalities. In: Fortin, M., Glowinski, R. (eds.) Augmented Lagrangian Methods Applications to the Numerical Solution of Boundary-Value Problems, pp. 299–331. North-Holland, Amsterdam (1983)
Gasso, G., Rakotomamonjy, A., Canu, S.: Recovering sparse signals with a certain family of nonconvex penalties and DC programming. IEEE Trans. Signal Process. 57(12), 4686–4698 (2009)
Geremew, W., Nam, N.M., Semenova, A., Boginski, V., Pasiliao, E.: A DC programming approach for solving multicast network design problems via the Nesterov smoothing technique. J. Global Optim. 72(4), 705–729 (2018)
Gotoh, J., Takeda, A., Tono, K.: DC formulations and algorithms for sparse optimization problems. Math. Program. 169(1), 141–176 (2018)
Gonçalves, M.L.N., Melo, J.G., Monteiro, R.D.C.: Convergence rate bounds for a proximal ADMM with over-relaxation stepsize parameter for solving nonconvex linearly constrained problems (2017). arXiv preprint arXiv:1702.01850v2
Guo, K., Han, D.R., Wu, T.T.: Convergence of ADMM for optimization problems with nonseparable nonconvex objective and linear constraints. Int. J. Comput. Math. 94(8), 1653–1669 (2017)
Han, D.R., Yuan, X.M.: Local linear convergence of the alternating direction method of multipliers for quadratic programs. SIAM J. Numer. Anal. 51(6), 3446–3457 (2013)
Hansen, P.C., Nagy, J.G., O'Leary, D.P.: Deblurring Images: Matrices, Spectra, and Filtering. SIAM, Philadelphia (2006)
He, B.S., Yuan, X.M.: On the \(O(1/n)\) convergence rate of the Douglas–Rachford alternating direction method. SIAM J. Numer. Anal. 50(2), 700–709 (2012)
Li, G.Y., Pong, T.K.: Global convergence of splitting methods for nonconvex composite optimization. SIAM J. Optim. 25(4), 2434–2460 (2015)
Liavas, A.P., Sidiropoulos, N.D.: Parallel algorithms for constrained tensor factorization via alternating direction method of multipliers. IEEE Trans. Signal Process. 63(20), 5450–5463 (2015)
Liu, T.X., Pong, T.K., Takeda, A.: A refined convergence analysis of \(\text{pDCA}_{e}\) with applications to simultaneous sparse recovery and outlier detection. Comput. Optim. Appl. 73(1), 69–100 (2019)
Liu, Q.H., Shen, X.Y., Gu, Y.T.: Linearized ADMM for non-convex non-smooth optimization with convergence analysis (2017). arXiv preprint arXiv:1705.02502
Lou, Y.F., Yin, P.H., Xin, J.: Point source super-resolution via non-convex \(l_1\) based methods. J. Sci. Comput. 68(3), 1082–1100 (2016)
Lou, Y.F., Yan, M.: Fast \(l_{1}\)-\(l_{2}\) minimization via a proximal operator. J. Sci. Comput. 74(2), 767–785 (2018)
Lou, Y.F., Zeng, T.Y., Osher, S., Xin, J.: A weighted difference of anisotropic and isotropic total variation model for image processing. SIAM J. Imaging Sci. 8(3), 1798–1823 (2015)
Lu, Z.S., Li, X.R.: Sparse recovery via partial regularization: models, theory, and algorithms. Math. Oper. Res. 43(4), 1290–1316 (2018)
Lu, Z.S., Zhou, Z.R., Sun, Z.: Enhanced proximal DC algorithms with extrapolation for a class of structured nonsmooth DC minimization. Math. Program. 176(1–2), 369–401 (2019)
Maingé, P.E., Moudafi, A.: Convergence of new inertial proximal methods for DC programming. SIAM J. Optim. 19(1), 397–413 (2008)
Mordukhovich, B.S., Nam, N.M., Yen, N.D.: Fréchet subdifferential calculus and optimality conditions in nondifferentiable programming. Optimization 55(5–6), 685–708 (2006)
Nesterov, Y.: Introductory Lectures on Convex Optimization. A Basic Course. Kluwer, Boston (2004)
Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation. I: Basic Theory, II: Applications. Springer, Berlin (2006)
Pratt, W.K.: Digital Image Processing: PIKS Scientific Inside. Wiley, Hoboken (2001)
Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1(3), 127–239 (2013)
Rockafellar, R.T., Wets, R.: Variational Analysis. Grundlehren Math. Wiss., vol. 317. Springer, Berlin (1998)
Rudin, L., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D. 60(1–4), 259–268 (1992)
Souza, J.C.O., Oliveira, P.R.: A proximal point algorithm for DC functions on Hadamard manifolds. J. Global Optim. 63(4), 797–810 (2015)
Sun, T., Yin, P.H., Cheng, L.Z., Jiang, H.: Alternating direction method of multipliers with difference of convex functions. Adv. Comput. Math. 44, 723–744 (2018)
Tao, P.D., An, L.T.H.: Convex analysis approach to DC programming: theory, algorithms and applications. Acta Math. Vietnam. 22(1), 289–355 (1997)
Tao, P.D., An, L.T.H.: A DC optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)
Wang, F.H., Xu, Z.B., Xu, H.K.: Convergence of Bregman alternating direction method with multipliers for nonconvex composite problems (2014). arXiv preprint arXiv:1410.8625
Wang, H.F., Kong, L.C., Tao, J.Y.: The linearized alternating direction method of multipliers for sparse group LAD model. Optim. Lett. 13, 505–525 (2019)
Wu, Z.M., Li, M., Wang, D.Z.W., Han, D.R.: A symmetric alternating direction method of multipliers for separable nonconvex minimization problems. Asia Pac. J. Oper. Res. 34, 1750030 (2017)
Wang, Y., Yao, W.T., Zeng, J.S.: Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78, 29–63 (2019)
Wen, B., Chen, X.J., Pong, T.K.: A proximal difference-of-convex algorithm with extrapolation. Comput. Optim. Appl. 69(2), 297–324 (2018)
Yang, L., Pong, T.K., Chen, X.J.: Alternating direction method of multipliers for a class of nonconvex and nonsmooth problems with applications to background/foreground extraction. SIAM J. Imaging Sci. 10(1), 74–110 (2017)
Yin, P.H., Lou, Y.F., He, Q., Xin, J.: Minimization of \(l_{1-2}\) for compressed sensing. SIAM J. Sci. Comput. 37(1), 536–563 (2015)
Zhang, T.: Some sharp performance bounds for the least squares regression with \(l_1\) regularization. Ann. Stat. 37(5A), 2109–2144 (2009)
Acknowledgements
This research was supported by the National Natural Science Foundation of China Grants 11801161, 61179033 and 11771003, and the Natural Science Foundation of Hunan Province of China Grant 2018JJ3093. The authors are very grateful to the Beijing Innovation Center for Engineering Science and Advanced Technology, Peking University and Beijing University of Technology for their joint project support. The first author, Kai Tu, would like to thank Prof. Penghang Yin from the University of California, Los Angeles for providing the codes of [45], and Dr. Wenxing Zhang from the University of Electronic Science and Technology of China and Dr. Benxing Zhang from Guilin University of Electronic Technology for their advice and discussions on the codes for the total variation image restoration problem. The authors also take this opportunity to thank the anonymous referees for their patient and valuable comments, which greatly improved the quality of this paper.
Appendix
Proof of Proposition 2
Proof
Clearly, problem (51) is a special case of problem (1)–(2) with \(f_1(x)= \rho \Vert x\Vert _1\), \(f_2(x)=\rho \Vert x\Vert _2\), \(g(y)=\frac{1}{2}\Vert Hy-y^0\Vert ^2\), \(A={\mathcal {I}}\), \(B=- K\) and \(b=\mathbf {0}\). Moreover, \(f_1\), \(f_2\), g, A and B satisfy Assumption 1 (a)–(d). It follows from the choice of \(\phi \) and \(\psi \) that \(L_{\phi }=r_1\), \(L_{\psi }=r_2\), \(\upsilon _{\phi }=r_1\) and \(\upsilon _{\psi }=r_2\). By simple computations, we have \(b_1>0\) and \(b_2>0\). It follows from \(\sigma < \frac{1}{2\Vert H\Vert ^2}\) that
It follows that for any \(k\ge 1\),
where \(a_{1} =\frac{1}{2} -\sigma \Vert H\Vert ^2\), \(z=\frac{1}{t_0}H^* y^{0}\) and \(t_1= a_1 (\Vert y^{0}\Vert ^2 - t_0\Vert z\Vert ^2 ) \) with \(t_0=\lambda _{\min }(H^*H)\); the first inequality follows from (48), and the second from \(\inf _{x} \{\Vert x\Vert _1 - \Vert x\Vert _2\} \ge 0\) and (53). Since H has full column rank, it follows from (54) that the sequences \(\{ y_k \}_{k\in {\mathbb {N}}}\) and \(\{ x_k -K y_k - \frac{\lambda _k}{\beta }\}_{k\in {\mathbb {N}}}\) are bounded, which together with (47) implies that the sequences \(\{ \lambda _k \}_{k\in {\mathbb {N}}}\) and \(\{ x_k \}_{k\in {\mathbb {N}}}\) are bounded. Thus, the sequence \(\{\omega _k\}_{k\in {\mathbb {N}}}\) is bounded. We now point out that, for this problem, the potential function \(\varTheta (\xi ,x,{\tilde{x}},y,{\tilde{y}},\lambda )\) defined in (10) is a KL function. Indeed, it follows from the definitions of \(f_1\), \(f_2\) and \(\varTheta \) that
where \(I_{\varOmega }(\xi )\) is the indicator function of the closed convex set \(\varOmega =\{\xi \in {\mathbb {R}}^{n_1}\mid \Vert \xi \Vert ^2 \le \rho ^2\}\). Clearly, \(\varOmega \) is a semi-algebraic set. By [10], the indicator function of a semi-algebraic set is semi-algebraic, and \(\Vert \cdot \Vert _{p}\) is semi-algebraic whenever p is rational, i.e., \(p=\frac{p_1}{p_2}\) with positive integers \(p_1\) and \(p_2\). Since a finite sum of semi-algebraic functions is semi-algebraic, \(\varTheta \) is a semi-algebraic function and hence a KL function. Note that all assumptions in Theorem 2 hold, and the conclusion follows. \(\square \)
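For completeness, the semi-algebraicity of the two nonpolynomial ingredients used above can be witnessed directly (a standard fact, recorded here as a one-line check):
\[
\operatorname{gph}\Vert \cdot \Vert _2 = \Bigl\{ (x,t)\in {\mathbb {R}}^{n_1}\times {\mathbb {R}} \;\Bigm|\; t\ge 0,\; t^2=\textstyle\sum _{i=1}^{n_1} x_i^2 \Bigr\}, \qquad \varOmega = \Bigl\{ \xi \in {\mathbb {R}}^{n_1} \;\Bigm|\; \rho ^2-\textstyle\sum _{i=1}^{n_1} \xi _i^2 \ge 0 \Bigr\},
\]
so both sets are cut out by finitely many polynomial equalities and inequalities.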
Proposition 7
Consider the total variation image restoration problem [45]:
where \(\rho >0\) is a regularization parameter, H is a blurring operator, and \(K:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}^{2n}\) is the discrete gradient operator. Suppose that \(({\bar{x}}, {\bar{y}})\) is a local minimum of problem (55). Then there exists \({\bar{\lambda }}\) such that \(({\bar{x}}, {\bar{y}}, {\bar{\lambda }})\) is a critical point of problem (55), i.e., \(({\bar{x}}, {\bar{y}}, {\bar{\lambda }})\) satisfies the inclusion (3).
Proof
We note that problem (55) is equivalent to the following problem
Since \(({\bar{x}}, {\bar{y}})\) is a local minimum of problem (55), we have \({\bar{x}}=K{\bar{y}}\), and \({\bar{y}}\) is a local minimum of problem (56). It then follows from Lemma 1 (b) that
where \(H^*\) is the adjoint operator of H. Since \(\Vert \cdot \Vert _1\) is a proper convex function and \(\Vert \cdot \Vert _2\) is a continuous convex function, it follows from Corollary 3.4 in [37] and (57) that
Note that \(\partial (\rho \Vert \cdot \Vert _1\circ K) ({\bar{y}}) = \rho K^{*}\partial \Vert K{\bar{y}}\Vert _1 \) and \(\partial (\rho \Vert \cdot \Vert _2\circ K) ({\bar{y}}) = \rho K^{*}\partial \Vert K{\bar{y}}\Vert _2\), where \(K^*\) is the adjoint operator of K. Take \({\bar{\xi }}_1 \in \rho \partial \Vert {\bar{x}}\Vert _1 \) and \({\bar{\xi }}_2 \in \rho \partial \Vert {\bar{x}}\Vert _2 \) such that \(K^{*}({\bar{\xi }}_1 -{\bar{\xi }}_2) + H^{*}(H{\bar{y}}-y^0)= \mathbf {0}\). Setting \({\bar{\lambda }}= {\bar{\xi }}_1 -{\bar{\xi }}_2 \), it follows that
This completes the proof. \(\square \)
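Remark. For readers reproducing this setting, the discrete gradient operator \(K:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}^{2n}\) in Proposition 7 is commonly realized by forward differences. The following short Python sketch is one such realization, assuming replicate (Neumann) boundary conditions and row-stacked images of size \(h\times w\) with \(n=hw\); the paper's exact discretization may differ, and the function name is hypothetical.

```python
import numpy as np

def discrete_gradient(y, h, w):
    # One common realization of K : R^n -> R^{2n} with n = h*w:
    # forward differences with replicated (Neumann) boundary,
    # stacking horizontal and vertical differences.
    img = y.reshape(h, w)
    dx = np.diff(img, axis=1, append=img[:, -1:])  # horizontal differences, shape (h, w)
    dy = np.diff(img, axis=0, append=img[-1:, :])  # vertical differences, shape (h, w)
    return np.concatenate([dx.ravel(), dy.ravel()])  # vector of length 2*h*w
```

Its adjoint \(K^*\), which appears in the optimality condition above, is then a negative discrete divergence.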
Cite this article
Tu, K., Zhang, H., Gao, H. et al. A hybrid Bregman alternating direction method of multipliers for the linearly constrained difference-of-convex problems. J Glob Optim 76, 665–693 (2020). https://doi.org/10.1007/s10898-019-00828-4