Abstract
In this work we propose a new splitting technique, namely Asymmetric Forward–Backward–Adjoint splitting, for solving monotone inclusions involving three terms: a maximally monotone, a cocoercive, and a bounded linear operator. Our scheme cannot be recovered from existing operator splitting methods, while classical methods such as Douglas–Rachford and forward–backward splitting are special cases of the new algorithm. Asymmetric preconditioning is the main feature of Asymmetric Forward–Backward–Adjoint splitting; it allows us to unify, extend, and shed light on the connections between many seemingly unrelated primal-dual algorithms for solving structured convex optimization problems proposed in recent years. One important special case leads to a Douglas–Rachford-type scheme that includes a third, cocoercive operator.
Notes
C is \(\beta \)-cocoercive with respect to the P norm if for some \(\beta \in ]0,+\infty [\) the following holds:
$$\begin{aligned} (\forall z\in \mathscr {H})(\forall z^{\prime }\in \mathscr {H})\quad \langle Cz-Cz^{\prime },z-z^{\prime }\rangle \ge \beta \Vert Cz-Cz^{\prime }\Vert _{P^{-1}}^{2}. \end{aligned}$$
The sequence \((x_{n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) converges to \(x^{\star }\) R-linearly if there is a sequence of nonnegative scalars \(({v_n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) such that \(\Vert x_{n}-x^{\star }\Vert \le v_n\) and \((v_n)_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) converges Q-linearly (see footnote 3) to zero.
The sequence \((x_{n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) converges to \(x^{\star }\) Q-linearly with Q-factor given by \(\sigma \in ]0,1[\), if for n sufficiently large \(\Vert x_{n+1}-x^{\star }\Vert \le \sigma \Vert x_{n}-x^{\star }\Vert \) holds.
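Q-linear convergence implies R-linear convergence, but not conversely. The following is a standard textbook example (not taken from this paper) illustrating the gap:

```latex
% A sequence that converges R-linearly but not Q-linearly: fix
% sigma in ]0,1[ and define
\[
  x_n =
  \begin{cases}
    \sigma^n, & n \text{ even},\\
    0,        & n \text{ odd},
  \end{cases}
  \qquad \sigma \in \left]0,1\right[.
\]
% Then |x_n| \le v_n := \sigma^n, and (v_n) converges Q-linearly to
% zero, so (x_n) converges R-linearly. The convergence is not
% Q-linear: for odd n we have x_n = 0 while x_{n+1} = \sigma^{n+1} > 0,
% so no factor \sigma' < 1 can satisfy |x_{n+1}| \le \sigma' |x_n|
% for all sufficiently large n.
```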
References
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)
Bianchi, P., Hachem, W., Iutzeler, F.: A coordinate descent primal-dual algorithm and application to distributed asynchronous optimization. IEEE Trans. Autom. Control 61(10), 2947–2957 (2016). doi:10.1109/TAC.2015.2512043
Boţ, R.I., Csetnek, E.R., Hendrich, C.: Recent developments on primal-dual splitting methods with applications to convex minimization. In: Pardalos, P.M., Rassias T.M. (eds.) Mathematics Without Boundaries, pp. 57–99. Springer (2014)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Briceño-Arias, L.M.: Forward-Douglas–Rachford splitting and forward-partial inverse method for solving monotone inclusions. Optimization 64(5), 1239–1261 (2015)
Briceño-Arias, L.M., Combettes, P.L.: A monotone + skew splitting model for composite monotone inclusions in duality. SIAM J. Optim. 21(4), 1230–1250 (2011)
Cai, X., Han, D., Yuan, X.: The direct extension of ADMM for three-block separable convex minimization models is convergent when one function is strongly convex. Optim. Online 229, 230 (2014)
Cegielski, A.: Iterative Methods for Fixed Point Problems in Hilbert Spaces, vol. 2057. Springer, Berlin (2012)
Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)
Chen, C., He, B., Ye, Y., Yuan, X.: The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent. Math. Program. 155(1–2), 57–79 (2016)
Combettes, P.L., Pesquet, J.C.: Proximal splitting methods in signal processing. In: Fixed-point Algorithms for Inverse Problems in Science and Engineering, pp. 185–212. Springer, New York (2011)
Combettes, P.L., Pesquet, J.C.: Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators. Set Valued Var. Anal. 20(2), 307–330 (2012)
Combettes, P.L., Pesquet, J.C.: Stochastic quasi-Fejér block-coordinate fixed point iterations with random sweeping. SIAM J. Optim. 25(2), 1221–1248 (2015)
Combettes, P.L., Yamada, I.: Compositions and convex combinations of averaged nonexpansive operators. J. Math. Anal. Appl. 425(1), 55–70 (2015)
Condat, L.: A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Theory Appl. 158(2), 460–479 (2013)
Davis, D.: Convergence rate analysis of primal-dual splitting schemes. SIAM J. Optim. 25(3), 1912–1943 (2015)
Davis, D., Yin, W.: Convergence rate analysis of several splitting schemes. arXiv preprint arXiv:1406.4834 (2014)
Davis, D., Yin, W.: A three-operator splitting scheme and its optimization applications. arXiv preprint arXiv:1504.01032 (2015)
Dong, Y.: An LS-free splitting method for composite mappings. Appl. Math. Lett. 18(8), 843–848 (2005)
Dontchev, A.L., Rockafellar, R.T.: Regularity and conditioning of solution mappings in variational analysis. Set Valued Anal. 12(1–2), 79–109 (2004)
Dontchev, A.L., Rockafellar, R.T.: Implicit Functions and Solution Mappings. Springer, New York (2009)
Drori, Y., Sabach, S., Teboulle, M.: A simple algorithm for a class of nonsmooth convex–concave saddle-point problems. Oper. Res. Lett. 43(2), 209–214 (2015)
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992)
Han, D., Yuan, X.: A note on the alternating direction method of multipliers. J. Optim. Theory Appl. 155(1), 227–238 (2012)
He, B., Yuan, X.: Convergence analysis of primal-dual algorithms for a saddle-point problem: from contraction perspective. SIAM J. Imaging Sci. 5(1), 119–149 (2012)
Li, M., Sun, D., Toh, K.C.: A convergent 3-block semi-proximal ADMM for convex minimization problems with one strongly convex block. Asia Pac. J. Oper. Res. 32, 1550024 (2015)
Lin, T., Ma, S., Zhang, S.: Iteration complexity analysis of multi-block ADMM for a family of convex minimization without strong convexity. J. Sci. Comput. 69(1), 52–81 (2016)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)
Moreau, J.J.: Proximité et dualité dans un espace Hilbertien. Bull. Soc. Math. Fr. 93, 273–299 (1965)
Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1, 123–231 (2013)
Pesquet, J.C., Repetti, A.: A class of randomized primal-dual algorithms for distributed optimization. J. Nonlinear Convex Anal. 16(12), 2453–2490 (2015)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1997)
Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, vol. 317. Springer, Berlin (2009)
Solodov, M.V., Tseng, P.: Modified projection-type methods for monotone variational inequalities. SIAM J. Control Optim. 34(5), 1814–1830 (1996)
Tseng, P.: Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM J. Control Optim. 29(1), 119–138 (1991)
Tseng, P.: A modified forward–backward splitting method for maximal monotone mappings. SIAM J. Control Optim. 38(2), 431–446 (2000)
Vũ, B.C.: A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 38(3), 667–681 (2013)
Acknowledgements
Funding was provided by KU Leuven (Grant No. BOF/STG-15-043).
Appendix
The proofs omitted from the main text are given below.
Proof of Proposition 4.1
(i): Let us start by noting that P and S defined in (57) are special cases of (65) and (75) when \(L=\mathrm{Id}\) and \(\gamma _1=\gamma _2^{-1}=\gamma \), and, as such, are strongly positive if \(\theta \in [0,2[\) (see Lemmas 5.1 and 5.3). Next, consider step 3 of Algorithm 1 and substitute the parameters defined in (57) to derive
where D is defined in (5) and is strongly positive by construction. Let \(\lambda _n\) equal \(\rho _n\) times the inverse of the right-hand side of (92), where \((\rho _n)_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) is uniformly bounded in the interval ]0, 2[. This results in \(\alpha _n=\rho _n\) and simplifies the iterations. Our goal is to establish convergence using Theorem 3.1; therefore, we must show that condition (7) is satisfied when the sequence \((\lambda _{n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) is selected as described. We do this by showing that \((\lambda _{n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\subseteq [\nu _1,2-\frac{1}{2\beta }-\nu _2]\) for some \(\nu _1,\nu _2>0\).
From the strong positivity of D, the boundedness of P, and the fact that \((\rho _n)_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) is uniformly bounded away from 0, it follows that \((\lambda _{n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\subseteq [\nu _1,\infty [\) for some positive \(\nu _1\). For C defined in (56c) we have \(\Vert Cz-Cz^{\prime }\Vert ^2_{P^{-1}} = \gamma \left( 1-\tfrac{1}{4}\theta ^2\right) ^{-1}\Vert Cz-Cz^{\prime }\Vert ^2\). Therefore, C is \(\beta \)-cocoercive with respect to \(\Vert \cdot \Vert _P\), with \(\beta =\eta \gamma ^{-1}(1-\frac{1}{4}\theta ^2)\). Let \(\nu _2\) be a positive parameter such that
holds for all \(n\in \mathrm{{I}}\!\mathrm{{N}}\). It is easy to verify that such a \(\nu _2\) exists as long as \((\rho _n)_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) is uniformly bounded in the interval (59). For brevity, we define \(\beta ^\prime \) such that \(\frac{1}{2\beta ^\prime }=\frac{1}{2\beta }+\nu _2\). Additionally, introduce the notation \(\upsilon =\rho _n^{-1}(2-\frac{1}{2\beta ^\prime })\), \(\omega _{1}=\upsilon \theta +2(1-\theta )\), \(\omega _{2}=\theta ^2-3\theta +3\). We proceed by showing that \((\lambda _n)_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) is bounded above by \(2-\frac{1}{2\beta }-\nu _2\). A sufficient condition for this to hold is
Apply the Fenchel–Young inequality for \(\frac{\epsilon }{2}\Vert \cdot \Vert ^2\) to lower bound \(\xi \):
where \(|\cdot |\) denotes the absolute value. It follows from (93) that \(\upsilon >1\) for all \(n\in \mathrm{{I}}\!\mathrm{{N}}\). Let \(\epsilon =\frac{|\omega _{1}|}{2\gamma (\upsilon -1)}\) so that the term involving \(\Vert \tilde{y}_n\Vert ^2\) in (95) vanishes. We therefore obtain that
is sufficient for (94) to hold. Substituting \(\omega _{1}\) and \(\omega _{2}\) and performing some algebraic manipulations shows that (93) is sufficient for the right-hand side of (96) to be positive. Consequently, Theorem 3.1 completes the proof of convergence. We showed that we can set \(\alpha _n=\rho _n\) by choosing \((\lambda _{n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) appropriately. Algorithm 2 follows by setting \(\alpha _n=\rho _n\), performing the change of variables \(s_n=x_n-\gamma y_n\), substituting \(x_{n+1}\), and applying Moreau's identity.
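For reference, the Fenchel–Young inequality invoked above, specialized to the conjugate pair \(\frac{\epsilon }{2}\Vert \cdot \Vert ^2\) and \(\frac{1}{2\epsilon }\Vert \cdot \Vert ^2\), reads:

```latex
\[
  \langle a, b\rangle \;\le\; \frac{\epsilon}{2}\,\Vert a\Vert^2
  + \frac{1}{2\epsilon}\,\Vert b\Vert^2
  \qquad (\forall a, b \in \mathscr{H},\ \epsilon > 0),
\]
% and, replacing a by -a, the matching lower bound
% <a,b> \ge -(\epsilon/2)||a||^2 - (1/(2\epsilon))||b||^2,
% which is the form used to lower bound cross terms in the proofs.
```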
(ii): Mimic the proof of (i), but use Theorem 3.1 with \(C\equiv 0\), by showing that \(\lambda _n\) is uniformly bounded between 0 and 2.
(iii): When \(\theta =2\) we have \(P\in \mathscr {S}(\mathscr {K})\), \(P\succeq 0\). It follows from (56a), (58) and Lemma 3.1 that \((H+A)^{-1}P\) is continuous. Therefore, since \(F\equiv 0\), appealing to Theorem 3.4 and following the same change of variables as in the previous parts proves the assertion. \(\square \)
Proof of Proposition 4.2
Building upon the argument in the proof of Proposition 4.1(iii) and Theorem 3.4(ii), we have
and \(\Vert Pz_{n+1}-Pz_n\Vert ^{2}=o(1/(n+1))\), where \(z_n= \left( x_n,\gamma ^{-1}(x_n-s_n)\right) \). Combine this with the definition of P in (57a) with \(\theta =2\) to derive
Furthermore, a simple calculation shows that \(P^2=(\gamma +\gamma ^{-1})P\). Hence, for all \(z\in \mathscr {K}\)
where we used the Fenchel–Young inequality for \(\frac{\gamma }{2}\Vert \cdot \Vert ^2\). It follows from (99) that \(\Vert P\Vert \le \gamma +\gamma ^{-1}\). Combining this with (97) and (98) completes the proof. \(\square \)
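The norm bound \(\Vert P\Vert \le \gamma +\gamma ^{-1}\) also follows from elementary spectral reasoning: a self-adjoint positive semidefinite operator satisfying \(P^2=cP\) has spectrum contained in \(\{0,c\}\). A minimal numeric sketch, using a hypothetical 2×2 scalar stand-in for the block operator (an assumption for illustration only, not the paper's operator from (57a)):

```python
import math

# Hypothetical 2x2 stand-in: symmetric, PSD, and rank deficient,
# chosen so that the identity P^2 = (gamma + 1/gamma) P holds.
gamma = 0.7
P = [[1.0 / gamma, -1.0],
     [-1.0, gamma]]
c = gamma + 1.0 / gamma

def matmul2(A, B):
    # multiply two 2x2 matrices
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P2 = matmul2(P, P)

# entrywise check of the identity P^2 = c * P
ok = all(abs(P2[i][j] - c * P[i][j]) < 1e-12
         for i in range(2) for j in range(2))

# eigenvalues of a symmetric 2x2 matrix via trace and determinant;
# here det = 0, so the spectrum is {0, c} and the norm equals c
tr = P[0][0] + P[1][1]
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
eigs = ((tr - disc) / 2.0, (tr + disc) / 2.0)
```

Since the largest eigenvalue is \(c=\gamma +\gamma ^{-1}\) and the other is 0, the operator norm of this stand-in is exactly \(\gamma +\gamma ^{-1}\), matching the bound in the proof.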
Proof of Lemma 5.2
For every z and \(z_1\) in \(\mathscr {K}\) we have
It follows from Lemma 5.1 that \(P\in \mathscr {S}_\tau (\mathscr {K})\) with \(\tau \) defined in (69). Therefore, we have \(\Vert P^{-1}\Vert \le \tau ^{-1}\). Using (100), we have
Hence, C is \(\beta \)-cocoercive with respect to \(\Vert \cdot \Vert _P\), where \(\beta =\tau \min \{\beta _{l}^{-1},\beta _{h}^{-1}\}\). When \(l=\iota _{\{0\}}\) we have \(\nabla l^{*}=0\), and it is possible to derive less conservative parameters. Define the linear operator \(Q:(x,y)\mapsto (x,0)\); then,
where we used (65). For \(\beta _{h}\in ]0,+\infty [\), considering the above inequality and the fact that \(Cz=(\nabla h(x),0)\), we obtain
where the last inequality follows from the Baillon–Haddad theorem [1, Corollary 18.16]. This shows that C is \(\beta \)-cocoercive with respect to \(\Vert \cdot \Vert _P\), with \(\beta \) defined in (72). \(\square \)
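For the reader's convenience, the Baillon–Haddad theorem invoked in the last step states (in the form of [1, Corollary 18.16]) that a convex differentiable function with Lipschitz continuous gradient has a cocoercive gradient:

```latex
% If h is convex and differentiable with beta_h-Lipschitz gradient, then
\[
  \langle \nabla h(x) - \nabla h(x'),\, x - x' \rangle
  \;\ge\; \tfrac{1}{\beta_h}\,
  \Vert \nabla h(x) - \nabla h(x') \Vert^2
  \qquad (\forall x,\, x'),
\]
% i.e., \nabla h is (1/beta_h)-cocoercive.
```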
Proof of Lemma 5.3
We start by showing that \(S_2\) is strongly positive. Lemma 5.1 asserts that \(P\in \mathscr {S}_\tau (\mathscr {K})\) with \(\tau \) defined in (69). Simple calculations show that \(\tau \le \min \{\gamma _1^{-1},\gamma _2^{-1}\}\). If \(\theta \in [2,\infty [\), define a constant c such that \(0<c<\tau \); otherwise, if \(\theta \in [0,2[\), set c such that \(0<c<\min \left\{ \tau ,\frac{(2-\theta )\gamma _{2}^{-1}}{4}\right\} \). Then for all \(z=(x,y)\in \mathscr {K}\), \(\langle (P-c\mathrm{Id})z,z \rangle >0\), which implies
It follows from (74b) that
where the last line follows from the Fenchel–Young inequality for \(\frac{\epsilon }{2}\Vert \cdot \Vert ^2\). Let \(\epsilon =\gamma _{2}^{-1}-c\) so that the term involving \(\Vert y\Vert ^2\) vanishes, and use (101) to derive
It follows from the choice of c, for both \(\theta \in [0,2[\) and \(\theta \in [2,\infty [\), that the last term in (102) is nonnegative; hence \(S_{2}\in \mathscr {S}_c(\mathscr {K})\). The analysis for \(S_1\) is similar and is therefore omitted. It follows that \(S_1^{-1}\) and \(S_2^{-1}\) are strongly positive, and so are their convex combination and its inverse, S. Furthermore, it was shown in (6) that if S and P are strongly positive, K is skew-adjoint, and M is monotone, then \(D\in \mathscr {S}_\nu (\mathscr {K})\) for some \(\nu \in ]0,\infty [\), where D was defined in (5). We conclude by noting that D in (76) can be derived by substituting S in (5). \(\square \)
Proof of Proposition 5.2
Algorithm 4 is a special case of Algorithm 3 and consequently of Algorithm 1. We establish the proof by directly showing that Theorem 3.1 is applicable with \((\lambda _n)_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) selected such that \(\alpha _{n}=1\), while satisfying (7). For all of (i)–(iii), the strong positivity of the operators P, S, and D follows from Lemmas 5.1 and 5.3. Set
This choice yields \(\alpha _n=1\). A sufficient condition for (7) to hold is that \((\lambda _n)_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) is uniformly bounded between 0 and \(\delta \). From the strong positivity of D and the boundedness of P it follows that \((\lambda _{n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\subseteq [\nu _1,\infty [\) for some positive \(\nu _1\). Denote by \(\beta \) the cocoercivity constant of C with respect to \(\Vert \cdot \Vert _P\). We proceed by showing that \((\lambda _{n})_{n\in \mathrm{{I}}\!\mathrm{{N}}}\subseteq [\nu _1,2-\frac{1}{2\beta }-\nu _2]\). Assume that
Then, there exists a positive constant \(\nu _2\) such that
It follows from (103) that
where in the second inequality we used (105) to upper bound both \(\gamma _{2}\Vert L\Vert ^2\) and \(\gamma _{1}\Vert L\Vert ^2\). We have shown that (104) is sufficient for (7) to hold; consequently, Theorem 3.1 completes the proof of convergence.
(i): By Lemma 5.2, substitute \(\beta \) in (104) with (71) to derive (83). The algorithm follows by substituting \(\alpha _{n}=1\) in Algorithm 3 when \(\theta =0\) and \(\mu =\frac{1}{2}\).
(ii): Similar to (i): since \(l=\iota _{\{0\}}\), substitute \(\beta \) in (104) with (72), i.e., \(\beta =\beta _{h}^{-1}\gamma _{1}^{-1}\). This choice of \(\beta \) yields (84) and the algorithm follows.
(iii): The proof is similar to the previous parts, but we show that \((\lambda _n)_{n\in \mathrm{{I}}\!\mathrm{{N}}}\) is uniformly bounded between 0 and 2, and then use Theorem 3.1 with \(C\equiv 0\). \(\square \)
Latafat, P., Patrinos, P.: Asymmetric forward–backward–adjoint splitting for solving monotone inclusions involving three operators. Comput. Optim. Appl. 68, 57–93 (2017). https://doi.org/10.1007/s10589-017-9909-6