APPENDIX A
1.1 PROOF OF THEOREM 1
Before proving the theorem we introduce the following lemmas.
Lemma 1 (Lemma 2.4 from [32]). Let \(\mathcal{F} = (\mathcal{F}_{k})_{k \geqslant 0}\) be a filtration and \((u^{k})\) a stochastic process adapted to \(\mathcal{F}\) with \(\mathbb{E}[u^{k + 1}|\mathcal{F}_{k}] = 0\). Then for any \(K \in \mathbb{N}\), \(x^{0} \in X\) and any compact set \(\mathcal{C} \subseteq X\)
$$\begin{gathered} \mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \sum\limits_{k = 0}^{K - 1} \langle {{u}^{{k + 1}}},x\rangle } \right] \\ \leqslant \mathop {\max }\limits_{x \in \mathcal{C}} \frac{1}{2}{\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}} + \frac{1}{2}\sum\limits_{k = 0}^{K - 1} \mathbb{E}{\text{||}}{{u}^{{k + 1}}}{\text{|}}{{{\text{|}}}^{2}}. \\ \end{gathered} $$
Lemma 2. Under Assumption 1 for iterates of Algorithm 1 the following inequality holds:
$$\begin{gathered} \mathbb{E}\left[ {{{{\left\| {{{\Delta }^{k}} - {{\mathbb{E}}_{k}}\left[ {{{\Delta }^{k}}} \right]} \right\|}}^{2}}} \right] \\ \leqslant \frac{{2{{{\overline L }}^{2}}}}{b}\mathbb{E}\left[ {{{{\left\| {{{x}^{k}} - {{w}^{{k - 1}}}} \right\|}}^{2}} + {{{\left\| {{{x}^{k}} - {{x}^{{k - 1}}}} \right\|}}^{2}}} \right], \\ \end{gathered} $$
(8)
where \({{\mathbb{E}}_{k}}[{{\Delta }^{k}}]\) is equal to
$${{\mathbb{E}}_{k}}[{{\Delta }^{k}}] = 2F({{x}^{k}}) - F({{x}^{{k - 1}}}).$$
(9)
Proof. We start from line 6 of Algorithm 1
$${{\mathbb{E}}_{k}}[{\text{||}}{{\Delta }^{k}} - {{\mathbb{E}}_{k}}[{{\Delta }^{k}}]{\text{|}}{{{\text{|}}}^{2}}]$$
$$\begin{gathered} = \mathbb{E}_{k}\left[ \left\| \frac{1}{b}\sum\limits_{j \in S^{k}} \left( F_{j}(x^{k}) - F_{j}(w^{k - 1}) + F_{j}(x^{k}) - F_{j}(x^{k - 1}) \right) \right. \right. \\ \left. \left. + \,F(w^{k - 1}) - (2F(x^{k}) - F(x^{k - 1})) \right\|^{2} \right]. \end{gathered} $$
By the Cauchy–Schwarz inequality, we have
$${{\mathbb{E}}_{k}}[{\text{||}}{{\Delta }^{k}} - {{\mathbb{E}}_{k}}[{{\Delta }^{k}}]{\text{|}}{{{\text{|}}}^{2}}]$$
$$ \leqslant 2{{\mathbb{E}}_{k}}\left[ {{{{\left\| {\frac{1}{b}\sum\limits_{j \in {{S}^{k}}} ({{F}_{j}}({{x}^{k}})\, - \,{{F}_{j}}({{w}^{{k - 1}}}))\, - \,(F({{x}^{k}})\, - \,F({{w}^{{k - 1}}}))} \right\|}}^{2}}} \right]$$
$$ + \,\,2{{\mathbb{E}}_{k}}\left[ {{{{\left\| {\frac{1}{b}\sum\limits_{j \in {{S}^{k}}} ({{F}_{j}}({{x}^{k}})\, - \,{{F}_{j}}({{x}^{{k - 1}}}))\, - \,(F({{x}^{k}})\, - \,F({{x}^{{k - 1}}}))} \right\|}}^{2}}} \right].$$
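The bound above is an instance of the elementary inequality \(\|A + B\|^{2} \leqslant 2\|A\|^{2} + 2\|B\|^{2}\). As a short sketch, let \(A\) and \(B\) (shorthand introduced here for illustration only) denote the two centered minibatch averages:
$$A = \frac{1}{b}\sum\limits_{j \in S^{k}} (F_{j}(x^{k}) - F_{j}(w^{k - 1})) - (F(x^{k}) - F(w^{k - 1})),\quad B = \frac{1}{b}\sum\limits_{j \in S^{k}} (F_{j}(x^{k}) - F_{j}(x^{k - 1})) - (F(x^{k}) - F(x^{k - 1})).$$
Then \(\Delta^{k} - \mathbb{E}_{k}[\Delta^{k}] = A + B\) and
$$\|A + B\|^{2} = \|A\|^{2} + 2\langle A,B\rangle + \|B\|^{2} \leqslant \|A\|^{2} + 2\|A\|\,\|B\| + \|B\|^{2} \leqslant 2\|A\|^{2} + 2\|B\|^{2},$$
where the first inequality is Cauchy–Schwarz and the second uses \(2st \leqslant s^{2} + t^{2}\) for real \(s, t\).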
Since the indices \(j_{1}^{k}, \ldots ,j_{b}^{k}\) in \(S^{k}\) are chosen independently and uniformly, one can note that for \(i \ne l\)
$$\begin{gathered} \mathbb{E}_{k}[\langle (F_{j_{i}^{k}}(x^{k}) - F_{j_{i}^{k}}(w^{k - 1})) - (F(x^{k}) - F(w^{k - 1})), \\ (F_{j_{l}^{k}}(x^{k}) - F_{j_{l}^{k}}(w^{k - 1})) - (F(x^{k}) - F(w^{k - 1}))\rangle ] \end{gathered} $$
$$ = {{\mathbb{E}}_{k}}[\langle {{\mathbb{E}}_{{j_{i}^{k}}}}[({{F}_{{j_{i}^{k}}}}({{x}^{k}}) - {{F}_{{j_{i}^{k}}}}({{w}^{{k - 1}}})) - (F({{x}^{k}}) - F({{w}^{{k - 1}}}))],$$
$${{\mathbb{E}}_{{j_{l}^{k}}}}[({{F}_{{j_{l}^{k}}}}({{x}^{k}}) - {{F}_{{j_{l}^{k}}}}({{w}^{{k - 1}}})) - (F({{x}^{k}}) - F({{w}^{{k - 1}}}))]\rangle ] = 0.$$
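To make explicit how the vanishing cross terms enter, here is a brief sketch. Write \(\zeta_{i}\) (a symbol introduced here only for illustration) for the centered summand associated with index \(j_{i}^{k}\), so that \(\mathbb{E}_{k}[\zeta_{i}] = 0\) and \(\mathbb{E}_{k}\langle \zeta_{i},\zeta_{l}\rangle = 0\) for \(i \ne l\). Expanding the squared norm of the average gives
$$\mathbb{E}_{k}\left\| \frac{1}{b}\sum\limits_{i = 1}^{b} \zeta_{i} \right\|^{2} = \frac{1}{b^{2}}\sum\limits_{i = 1}^{b} \mathbb{E}_{k}\|\zeta_{i}\|^{2} + \frac{1}{b^{2}}\sum\limits_{i \ne l} \mathbb{E}_{k}\langle \zeta_{i},\zeta_{l}\rangle = \frac{1}{b^{2}}\sum\limits_{i = 1}^{b} \mathbb{E}_{k}\|\zeta_{i}\|^{2}.$$
The same computation applies to the term involving \(x^{k - 1}\) in place of \(w^{k - 1}\).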
Hence, we get
$${{\mathbb{E}}_{k}}[{\text{||}}{{\Delta }^{k}} - {{\mathbb{E}}_{k}}[{{\Delta }^{k}}]{\text{|}}{{{\text{|}}}^{2}}]$$
$$ \leqslant \,2{{\mathbb{E}}_{k}}\left[ {\sum\limits_{j \in {{S}^{k}}} \frac{1}{{{{b}^{2}}}}{{{\left\| {({{F}_{j}}({{x}^{k}})\, - \,{{F}_{j}}({{w}^{{k - 1}}}))\, - \,(F({{x}^{k}})\, - \,F({{w}^{{k - 1}}}))} \right\|}}^{2}}} \right]$$
$$ + \,2{{\mathbb{E}}_{k}}\left[ {\sum\limits_{j \in {{S}^{k}}} \frac{1}{{{{b}^{2}}}}{{{\left\| {({{F}_{j}}({{x}^{k}})\, - \,{{F}_{j}}({{x}^{{k - 1}}}))\, - \,(F({{x}^{k}})\, - \,F({{x}^{{k - 1}}}))} \right\|}}^{2}}} \right]$$
$$ = \,\frac{2}{{{{b}^{2}}}}{{\mathbb{E}}_{k}}\left[ {\sum\limits_{j \in {{S}^{k}}} {{{\left\| {({{F}_{j}}({{x}^{k}})\, - \,{{F}_{j}}({{w}^{{k - 1}}}))\, - \,(F({{x}^{k}})\, - \,F({{w}^{{k - 1}}}))} \right\|}}^{2}}} \right]$$
$$ + \,\,\frac{2}{{{{b}^{2}}}}{{\mathbb{E}}_{k}}\left[ {\sum\limits_{j \in {{S}^{k}}} {{{\left\| {({{F}_{j}}({{x}^{k}})\, - \,{{F}_{j}}({{x}^{{k - 1}}}))\, - \,(F({{x}^{k}})\, - \,F({{x}^{{k - 1}}}))} \right\|}}^{2}}} \right]$$
$$\begin{gathered} \leqslant \frac{2}{{{{b}^{2}}}}{{\mathbb{E}}_{k}}\left[ {\sum\limits_{j \in {{S}^{k}}} {{{\left\| {{{F}_{j}}({{x}^{k}}) - {{F}_{j}}({{w}^{{k - 1}}})} \right\|}}^{2}}} \right] \\ + \frac{2}{{{{b}^{2}}}}{{\mathbb{E}}_{k}}\left[ {\sum\limits_{j \in {{S}^{k}}} {{{\left\| {{{F}_{j}}({{x}^{k}}) - {{F}_{j}}({{x}^{{k - 1}}})} \right\|}}^{2}}} \right]. \\ \end{gathered} $$
In the last step, we used the fact that \(\mathbb{E}\|X - \mathbb{E}X\|^{2} = \mathbb{E}\|X\|^{2} - \|\mathbb{E}X\|^{2} \leqslant \mathbb{E}\|X\|^{2}\). Next, we again take into account that \(j_{1}^{k}, \ldots ,j_{b}^{k}\) in \(S^{k}\) are chosen uniformly:
$${{\mathbb{E}}_{k}}[{\text{||}}{{\Delta }^{k}} - {{\mathbb{E}}_{k}}[{{\Delta }^{k}}]{\text{|}}{{{\text{|}}}^{2}}]$$
$$\begin{gathered} \leqslant \frac{2}{b}{{\mathbb{E}}_{k}}[{{\mathbb{E}}_{{j \sim {\text{u}}{\text{.a}}{\text{.r}}.\{ 1, \ldots ,M\} }}}[{\text{||}}{{F}_{j}}({{x}^{k}}) - {{F}_{j}}({{w}^{{k - 1}}}){\text{|}}{{{\text{|}}}^{2}} \\ \, + \,{\text{||}}{{F}_{j}}({{x}^{k}}) - {{F}_{j}}({{x}^{{k - 1}}}){\text{|}}{{{\text{|}}}^{2}}]] \\ \end{gathered} $$
$$ = \frac{2}{{Mb}}\sum\limits_{j = 1}^M \left( {{{{\left\| {{{F}_{j}}({{x}^{k}}) - {{F}_{j}}({{w}^{{k - 1}}})} \right\|}}^{2}} + {{{\left\| {{{F}_{j}}({{x}^{k}}) - {{F}_{j}}({{x}^{{k - 1}}})} \right\|}}^{2}}} \right).$$
Since each operator \(F_{j}\) is \(L_{j}\)-Lipschitz, we can bound the right-hand side as
$$\begin{gathered} \mathbb{E}_{k}[\|\Delta^{k} - \mathbb{E}_{k}[\Delta^{k}]\|^{2}] \\ \leqslant \frac{2}{Mb}\sum\limits_{j = 1}^{M} L_{j}^{2}\left( \|x^{k} - w^{k - 1}\|^{2} + \|x^{k} - x^{k - 1}\|^{2} \right). \end{gathered} $$
Applying the definition of \(\overline L \), we obtain
$$\begin{gathered} {{\mathbb{E}}_{k}}\left[ {{\text{||}}{{\Delta }^{k}} - {{\mathbb{E}}_{k}}[{{\Delta }^{k}}]{\text{|}}{{{\text{|}}}^{2}}} \right] \\ \leqslant \frac{{2{{{\overline L }}^{2}}}}{b}\left( {{\text{||}}{{x}^{k}} - {{w}^{{k - 1}}}{\text{|}}{{{\text{|}}}^{2}} + \,{\text{||}}{{x}^{k}} - {{x}^{{k - 1}}}{\text{|}}{{{\text{|}}}^{2}}} \right). \\ \end{gathered} $$
Taking the full expectation concludes the proof. \(\square \)
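For reference, the step that introduces \(\overline L\) is consistent with the following reading of its definition. Assumption 1 is not restated in this appendix, so the formula below is an assumption inferred from the two displays it connects:
$$\overline L^{2} = \frac{1}{M}\sum\limits_{j = 1}^{M} L_{j}^{2},\qquad \text{so that}\qquad \frac{2}{Mb}\sum\limits_{j = 1}^{M} L_{j}^{2} = \frac{2\overline L^{2}}{b}.$$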
Lemma 3. For iterates of Algorithm 1 with \(\gamma = p\) the following bound holds for any compact set \(\mathcal{C} \subseteq X\):
$$\begin{gathered} \mathbb{E}\left[ \mathop{\max}\limits_{x \in \mathcal{C}} \sum\limits_{k = 0}^{K - 1} e_{1}(k,x) \right] \leqslant 2\mathop{\max}\limits_{x \in \mathcal{C}} \{ \|x - x^{0}\|^{2}\} \\ + \,\,\frac{\gamma (1 - \gamma )}{2}\sum\limits_{k = 0}^{K - 1} \mathbb{E}\|x^{k + 1} - w^{k}\|^{2}, \end{gathered} $$
where \(e_{1}(k,x) = \|w^{k + 1} - x\|^{2} - \gamma \|x^{k + 1} - x\|^{2} - (1 - \gamma )\|w^{k} - x\|^{2}\).
Proof. For brevity, we introduce
$$u^{k + 1} = \gamma x^{k + 1} + (1 - \gamma )w^{k} - w^{k + 1}.$$
With this notation, expanding the squared norms in \(e_{1}(k,x)\) (the \(\|x\|^{2}\) terms cancel), we can rewrite it as
$$e_{1}(k,x) = 2\left\langle u^{k + 1},x \right\rangle - \gamma \|x^{k + 1}\|^{2} - (1 - \gamma )\|w^{k}\|^{2} + \|w^{k + 1}\|^{2}.$$
From line 8 of Algorithm 1 and using that \(\gamma = p\), one can obtain
$$\mathbb{E}[\mathbb{E}_{k}[\|w^{k + 1}\|^{2} - \gamma \|x^{k + 1}\|^{2} - (1 - \gamma )\|w^{k}\|^{2}]] = 0.$$
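As a sketch of why this holds: Algorithm 1 is not reproduced in this appendix, so we assume here that line 8 sets \(w^{k + 1} = x^{k + 1}\) with probability \(p\) and \(w^{k + 1} = w^{k}\) with probability \(1 - p\). Under that assumption, taking the expectation over this random choice only gives
$$\mathbb{E}_{k}\|w^{k + 1}\|^{2} = p\|x^{k + 1}\|^{2} + (1 - p)\|w^{k}\|^{2},\qquad \mathbb{E}_{k}[w^{k + 1}] = px^{k + 1} + (1 - p)w^{k}.$$
With \(\gamma = p\), the first identity is the displayed equality, and the second shows that the conditional mean of \(\gamma x^{k + 1} + (1 - \gamma )w^{k} - w^{k + 1}\) is zero.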
Using the two properties above, we obtain for any compact set \(\mathcal{C} \subseteq X\)
$$\begin{gathered} \mathbb{E}\left[ \mathop{\max}\limits_{x \in \mathcal{C}} \sum\limits_{k = 0}^{K - 1} e_{1}(k,x) \right] = 2\mathbb{E}\left[ \mathop{\max}\limits_{x \in \mathcal{C}} \sum\limits_{k = 0}^{K - 1} \left\langle u^{k + 1},x \right\rangle \right] \\ \, + \mathbb{E}\left[ \sum\limits_{k = 0}^{K - 1} \left( \|w^{k + 1}\|^{2} - \gamma \|x^{k + 1}\|^{2} - (1 - \gamma )\|w^{k}\|^{2} \right) \right] \end{gathered} $$
$$ = 2\mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \sum\limits_{k = 0}^{K - 1} \left\langle {{{u}^{{k + 1}}},x} \right\rangle } \right].$$
With \(\mathcal{F}_{k} = \sigma (\xi_{0}, \ldots ,\xi_{k},x^{k})\) we have \(\mathbb{E}[u^{k + 1}\,|\,\mathcal{F}_{k}] = 0\), which means that we can apply Lemma 1. Thus, we get
$$\begin{gathered} \mathbb{E}\left[ \mathop{\max}\limits_{x \in \mathcal{C}} \sum\limits_{k = 0}^{K - 1} e_{1}(k,x) \right] \\ \leqslant 2\mathop{\max}\limits_{x \in \mathcal{C}} \left\| x^{0} - x \right\|^{2} + \frac{1}{2}\sum\limits_{k = 0}^{K - 1} \mathbb{E}\left\| u^{k + 1} \right\|^{2}. \end{gathered} $$
(10)
We estimate \(\|u^{k + 1}\|^{2}\) using the fact that \(\mathbb{E}\|X - \mathbb{E}X\|^{2} = \mathbb{E}\|X\|^{2} - \|\mathbb{E}X\|^{2}\) and line 8 of Algorithm 1:
$$\begin{gathered} \mathbb{E}\left\| u^{k + 1} \right\|^{2} = \mathbb{E}[\mathbb{E}_{k}\left\| u^{k + 1} \right\|^{2}] \\ = \mathbb{E}[\mathbb{E}_{k}\|\mathbb{E}_{k}\left[ w^{k + 1} \right] - w^{k + 1}\|^{2}] \end{gathered} $$
$$ = \mathbb{E}[\mathbb{E}_{k}\left\| w^{k + 1} \right\|^{2} - \left\| \mathbb{E}_{k}\left[ w^{k + 1} \right] \right\|^{2}]$$
$$ = \mathbb{E}\left[ \gamma \left\| x^{k + 1} \right\|^{2} + (1 - \gamma )\left\| w^{k} \right\|^{2} - \left\| \gamma x^{k + 1} + (1 - \gamma )w^{k} \right\|^{2} \right]$$
$$ = \gamma (1 - \gamma )\mathbb{E}\left\| x^{k + 1} - w^{k} \right\|^{2}.$$
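The last equality is a standard identity for convex combinations. A short derivation, with generic vectors \(y\) and \(z\) standing in for the two iterates (these symbols are introduced here for illustration):
$$\|\gamma y + (1 - \gamma )z\|^{2} = \gamma^{2}\|y\|^{2} + 2\gamma (1 - \gamma )\langle y,z\rangle + (1 - \gamma )^{2}\|z\|^{2},$$
$$\gamma \|y\|^{2} + (1 - \gamma )\|z\|^{2} - \|\gamma y + (1 - \gamma )z\|^{2} = \gamma (1 - \gamma )\left( \|y\|^{2} - 2\langle y,z\rangle + \|z\|^{2} \right) = \gamma (1 - \gamma )\|y - z\|^{2}.$$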
Applying this result to (10), we get
$$\begin{gathered} \mathbb{E}\left[ \mathop{\max}\limits_{x \in \mathcal{C}} \sum\limits_{k = 0}^{K - 1} e_{1}(k,x) \right] \leqslant 2\mathop{\max}\limits_{x \in \mathcal{C}} \left\| x^{0} - x \right\|^{2} \\ + \,\,\frac{\gamma (1 - \gamma )}{2}\sum\limits_{k = 0}^{K - 1} \mathbb{E}\left\| x^{k + 1} - w^{k} \right\|^{2}. \end{gathered} $$
Proof of Theorem 1. We start from
$$\begin{gathered} {\text{||}}{{x}^{{k + 1}}} - x{\text{|}}{{{\text{|}}}^{2}} = {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} \\ \, + 2\langle {{x}^{{k + 1}}} - {{x}^{k}},{{x}^{{k + 1}}} - x\rangle - \,{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} \\ \end{gathered} $$
$$ = {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} + 2\gamma \langle {{w}^{k}} - {{x}^{k}},{{x}^{{k + 1}}} - x\rangle - 2\eta \langle {{\Delta }^{k}},{{x}^{{k + 1}}} - x\rangle $$
$$\begin{gathered} - \,{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} - 2[\langle {{x}^{k}} + \gamma ({{w}^{k}} - {{x}^{k}}) \\ \, - \eta {{\Delta }^{k}} - {{x}^{{k + 1}}},{{x}^{{k + 1}}} - x\rangle ]. \\ \end{gathered} $$
From line 7 of Algorithm 1 and property (5) of the proximal operator, it follows that
$${{x}^{k}} + \gamma ({{w}^{k}} - {{x}^{k}}) - \eta {{\Delta }^{k}} - {{x}^{{k + 1}}} \in \partial (\eta g)({{x}^{{k + 1}}}).$$
From convexity of \(g( \cdot )\), we obtain
$$\begin{gathered} {\text{||}}{{x}^{{k + 1}}} - x{\text{|}}{{{\text{|}}}^{2}} \leqslant {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} \\ \, + 2\gamma \langle {{w}^{k}} - {{x}^{k}},{{x}^{{k + 1}}} - x\rangle - 2\eta \langle {{\Delta }^{k}},{{x}^{{k + 1}}} - x\rangle \\ \end{gathered} $$
$$ - \,{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} + 2\eta g(x) - 2\eta g({{x}^{{k + 1}}}).$$
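The two facts used in this step can be sketched as follows. Property (5) is not restated in this appendix; we assume it is the standard characterization of the proximal operator, namely that for any points \(y, z\),
$$y = {\text{prox}}_{\eta g}(z)\quad \Longleftrightarrow \quad z - y \in \partial (\eta g)(y).$$
For any \(s \in \partial (\eta g)(x^{k + 1})\), convexity of \(g\) gives the subgradient inequality
$$\eta g(x) \geqslant \eta g(x^{k + 1}) + \langle s,x - x^{k + 1}\rangle \quad \Longrightarrow \quad - 2\langle s,x^{k + 1} - x\rangle \leqslant 2\eta g(x) - 2\eta g(x^{k + 1}),$$
which is applied with \(s = x^{k} + \gamma (w^{k} - x^{k}) - \eta \Delta^{k} - x^{k + 1}\).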
Using \(2\gamma \langle w^{k} - x^{k},x^{k + 1} - x\rangle = 2\gamma \langle w^{k} - x,x^{k + 1} - x\rangle - 2\gamma \langle x^{k} - x,x^{k + 1} - x\rangle \) and the following property of the scalar product: \(2\langle a,b\rangle = \|a\|^{2} + \|b\|^{2} - \|a - b\|^{2}\), we get
$$\begin{gathered} {\text{||}}{{x}^{{k + 1}}} - x{\text{|}}{{{\text{|}}}^{2}} \leqslant {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} \\ + \,\gamma \left( {{\text{||}}{{w}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} + \,{\text{||}}{{x}^{{k + 1}}} - x{\text{|}}{{{\text{|}}}^{2}} - \,{\text{||}}{{x}^{{k + 1}}} - {{w}^{k}}{\text{|}}{{{\text{|}}}^{2}}} \right) \\ \end{gathered} $$
$$\begin{gathered} \, - 2\eta \langle {{\Delta }^{k}},{{x}^{{k + 1}}} - x\rangle - \gamma {\text{||}}{{x}^{{k + 1}}} - x{\text{|}}{{{\text{|}}}^{2}} \\ \, - \gamma {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} + \gamma {\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} \\ \end{gathered} $$
$$ - \,{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} + 2\eta g(x) - 2\eta g({{x}^{{k + 1}}})$$
$$ = {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} + \gamma {\text{||}}{{w}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} - \gamma {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} - \gamma {\text{||}}{{x}^{{k + 1}}} - {{w}^{k}}{\text{|}}{{{\text{|}}}^{2}}$$
$$\begin{gathered} \, - 2\eta \langle {{\Delta }^{k}},{{x}^{{k + 1}}} - x\rangle - (1 - \gamma ){\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} \\ \, + 2\eta g(x) - 2\eta g({{x}^{{k + 1}}}). \\ \end{gathered} $$
Applying the properties of \({{\mathbb{E}}_{k}}[{{\Delta }^{k}}]\) specified in Lemma 2, we obtain
$$\begin{gathered} \|x^{k + 1} - x\|^{2} \leqslant \|x^{k} - x\|^{2} + \gamma \|w^{k} - x\|^{2} \\ \, - \gamma \|x^{k} - x\|^{2} - \gamma \|w^{k} - x^{k + 1}\|^{2} \end{gathered} $$
$$ - \,2\eta \langle {{\mathbb{E}}_{k}}[{{\Delta }^{k}}],{{x}^{{k + 1}}} - x\rangle - (1 - \gamma ){\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}$$
$$ + \,2\eta \langle {{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}\rangle + 2\eta \langle {{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x\rangle $$
$$ + \,2\eta g(x) - 2\eta g({{x}^{{k + 1}}})$$
$$ = {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} + \gamma {\text{||}}{{w}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} - \gamma {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} - \gamma {\text{||}}{{w}^{k}} - {{x}^{{k + 1}}}{\text{|}}{{{\text{|}}}^{2}}$$
$$ - \,2\eta \left\langle {F({{x}^{k}}) + F({{x}^{k}}) - F({{x}^{{k - 1}}}),{{x}^{{k + 1}}} - x} \right\rangle $$
$$ - \,(1 - \gamma ){\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} + 2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle $$
$$ + \,2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle $$
$$ + \,2\eta g(x) - 2\eta g({{x}^{{k + 1}}})$$
$$ = {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} + \gamma {\text{||}}{{w}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} - \gamma {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} - \gamma {\text{||}}{{w}^{k}} - {{x}^{{k + 1}}}{\text{|}}{{{\text{|}}}^{2}}$$
$$ - \,2\eta \langle F({{x}^{k}}) - F({{x}^{{k + 1}}}) + F({{x}^{k}}) - F({{x}^{{k - 1}}}),{{x}^{{k + 1}}} - x\rangle $$
$$ - \,2\eta \langle F({{x}^{{k + 1}}}),{{x}^{{k + 1}}} - x\rangle $$
$$ - \,(1 - \gamma ){\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} + 2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle $$
$$ + \,2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle + 2\eta g(x) - 2\eta g({{x}^{{k + 1}}}).$$
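The chain above rests on two purely algebraic splittings, written out here as a sketch. First, adding and subtracting \(\mathbb{E}_{k}[\Delta^{k}]\) and using \(x^{k + 1} - x = (x^{k + 1} - x^{k}) + (x^{k} - x)\):
$$ - 2\eta \langle \Delta^{k},x^{k + 1} - x\rangle = - 2\eta \langle \mathbb{E}_{k}[\Delta^{k}],x^{k + 1} - x\rangle + 2\eta \langle \mathbb{E}_{k}[\Delta^{k}] - \Delta^{k},x^{k + 1} - x^{k}\rangle + 2\eta \langle \mathbb{E}_{k}[\Delta^{k}] - \Delta^{k},x^{k} - x\rangle .$$
Second, by (9), adding and subtracting \(F(x^{k + 1})\):
$$\mathbb{E}_{k}[\Delta^{k}] = 2F(x^{k}) - F(x^{k - 1}) = F(x^{k + 1}) + \left( F(x^{k}) - F(x^{k + 1}) \right) + \left( F(x^{k}) - F(x^{k - 1}) \right).$$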
By a simple rearrangement, we obtain
$$2\eta (g({{x}^{{k + 1}}}) - g(x)) + 2\eta \langle F({{x}^{{k + 1}}}),{{x}^{{k + 1}}} - x\rangle $$
$$\begin{gathered} \leqslant {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} + \gamma {\text{||}}{{w}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} - \gamma {\text{||}}{{x}^{k}} - x{\text{|}}{{{\text{|}}}^{2}} \\ \, - \gamma {\text{||}}{{w}^{k}} - {{x}^{{k + 1}}}{\text{|}}{{{\text{|}}}^{2}} - \,{\text{||}}{{x}^{{k + 1}}} - x{\text{|}}{{{\text{|}}}^{2}} \\ \end{gathered} $$
$$ - \,2\eta \langle F({{x}^{k}}) - F({{x}^{{k + 1}}}) + F({{x}^{k}}) - F({{x}^{{k - 1}}}),{{x}^{{k + 1}}} - x\rangle $$
$$ - \,(1 - \gamma ){\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} + 2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle $$
$$ + \,2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle $$
$$ = (1 - \gamma ){{\left\| {{{x}^{k}} - x} \right\|}^{2}} + {{\left\| {{{w}^{k}} - x} \right\|}^{2}}$$
$$ - \,\,(1 - \gamma ){{\left\| {{{x}^{{k + 1}}} - x} \right\|}^{2}} - {{\left\| {{{w}^{{k + 1}}} - x} \right\|}^{2}}$$
$$\begin{gathered} + {{\left\| {{{w}^{{k + 1}}} - x} \right\|}^{2}} - \gamma {{\left\| {{{x}^{{k + 1}}} - x} \right\|}^{2}} \\ \, - (1 - \gamma ){{\left\| {{{w}^{k}} - x} \right\|}^{2}} - \gamma {{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}^{2}} \\ \end{gathered} $$
$$\begin{gathered} - \,\,2\eta \langle F({{x}^{k}}) - F({{x}^{{k + 1}}}),{{x}^{{k + 1}}} - x\rangle \\ \, + 2\eta \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - x\rangle \\ \end{gathered} $$
$$ - \,\,(1 - \gamma ){\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} + 2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle $$
$$ + \,2\eta \left\langle {{{\mathbb{E}}_{k}}\left[ {{{\Delta }^{k}}} \right] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle $$
$$ = (1 - \gamma ){{\left\| {{{x}^{k}} - x} \right\|}^{2}} + {{\left\| {{{w}^{k}} - x} \right\|}^{2}}$$
$$ - \,(1 - \gamma ){{\left\| {{{x}^{{k + 1}}} - x} \right\|}^{2}} - {{\left\| {{{w}^{{k + 1}}} - x} \right\|}^{2}}$$
$$\begin{gathered} + \,\,{{\left\| {{{w}^{{k + 1}}} - x} \right\|}^{2}} - \gamma {{\left\| {{{x}^{{k + 1}}} - x} \right\|}^{2}} \\ \, - (1 - \gamma ){{\left\| {{{w}^{k}} - x} \right\|}^{2}} - \gamma {{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}^{2}} \\ \end{gathered} $$
$$ - \,2\eta \langle F({{x}^{k}}) - F({{x}^{{k + 1}}}),{{x}^{{k + 1}}} - x\rangle $$
$$\begin{gathered} \, + 2\eta \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{k}} - x\rangle \\ \, + 2\eta \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle \\ \end{gathered} $$
$$ - \,(1 - \gamma ){{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}} + 2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle $$
$$ + \,2\eta \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle .$$
Summing over \(k = 0, \ldots ,K - 1\) and dividing by \(K\), one can get
$$2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ {\langle F({{x}^{{k + 1}}}),{{x}^{{k + 1}}} - x\rangle + g({{x}^{{k + 1}}}) - g(x)} \right]$$
$$ \leqslant \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ {(1 - \gamma ){{{\left\| {{{x}^{k}} - x} \right\|}}^{2}} + {{{\left\| {{{w}^{k}} - x} \right\|}}^{2}}} \right]$$
$$ - \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ {(1 - \gamma ){{{\left\| {{{x}^{{k + 1}}} - x} \right\|}}^{2}} + {{{\left\| {{{w}^{{k + 1}}} - x} \right\|}}^{2}}} \right]$$
$$ - 2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{k}}) - F({{x}^{{k + 1}}}),{{x}^{{k + 1}}} - x\rangle $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{k}} - x\rangle $$
$$ + \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ {{{{\left\| {{{w}^{{k + 1}}} - x} \right\|}}^{2}} - \gamma {{{\left\| {{{x}^{{k + 1}}} - x} \right\|}}^{2}} - (1 - \gamma ){{{\left\| {{{w}^{k}} - x} \right\|}}^{2}}} \right]$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle $$
$$ - \frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}^{2}} - \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}}$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle $$
$$ = \frac{{2 - \gamma }}{K}{{\left\| {{{x}^{0}} - x} \right\|}^{2}} - \frac{{1 - \gamma }}{K}{{\left\| {{{x}^{K}} - x} \right\|}^{2}} - \frac{1}{K}{{\left\| {{{w}^{K}} - x} \right\|}^{2}}$$
$$ - \,2\eta \cdot \frac{1}{K}\langle F({{x}^{{K - 1}}}) - F({{x}^{K}}),{{x}^{K}} - x\rangle $$
$$ + \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ {{{{\left\| {{{w}^{{k + 1}}} - x} \right\|}}^{2}} - \gamma {{{\left\| {{{x}^{{k + 1}}} - x} \right\|}}^{2}} - (1 - \gamma ){{{\left\| {{{w}^{k}} - x} \right\|}}^{2}}} \right]$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle $$
$$ - \frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}^{2}} - \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}}$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle .$$
(11)
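The equality in (11) comes from two telescoping sums, sketched below. For the operator terms, set \(a_{k} = \langle F(x^{k - 1}) - F(x^{k}),x^{k} - x\rangle \) (a shorthand introduced here for illustration). Then
$$ - \sum\limits_{k = 0}^{K - 1} \langle F(x^{k}) - F(x^{k + 1}),x^{k + 1} - x\rangle + \sum\limits_{k = 0}^{K - 1} \langle F(x^{k - 1}) - F(x^{k}),x^{k} - x\rangle = - \sum\limits_{k = 0}^{K - 1} a_{k + 1} + \sum\limits_{k = 0}^{K - 1} a_{k} = a_{0} - a_{K}.$$
For the distance terms,
$$\sum\limits_{k = 0}^{K - 1} \left[ (1 - \gamma )\|x^{k} - x\|^{2} + \|w^{k} - x\|^{2} - (1 - \gamma )\|x^{k + 1} - x\|^{2} - \|w^{k + 1} - x\|^{2} \right] = (1 - \gamma )\|x^{0} - x\|^{2} + \|w^{0} - x\|^{2} - (1 - \gamma )\|x^{K} - x\|^{2} - \|w^{K} - x\|^{2}.$$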
Here we also used the initialization of Algorithm 1 with \(w^{0} = x^{-1} = x^{0}\). Applying Young’s inequality, using the \(L\)-Lipschitzness of \(F\), and taking into account the condition \(\eta \leqslant \frac{1}{8L}\) from the theorem, one can obtain
$$\begin{gathered} - \,2\eta \langle F({{x}^{{K - 1}}}) - F({{x}^{K}}),{{x}^{K}} - x\rangle \\ \leqslant 2{{\eta }^{2}}{{\left\| {F({{x}^{{K - 1}}}) - F({{x}^{K}})} \right\|}^{2}} + \frac{1}{2}{\text{||}}{{x}^{K}} - x{\text{|}}{{{\text{|}}}^{2}} \\ \end{gathered} $$
$$ \leqslant 2{{\eta }^{2}}{{L}^{2}}{{\left\| {{{x}^{{K - 1}}} - {{x}^{K}}} \right\|}^{2}} + \frac{1}{2}{{\left\| {{{x}^{K}} - x} \right\|}^{2}}$$
$$ \leqslant \frac{1}{{32}}{{\left\| {{{x}^{{K - 1}}} - {{x}^{K}}} \right\|}^{2}} + \frac{1}{2}{{\left\| {{{x}^{K}} - x} \right\|}^{2}}.$$
(12)
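For completeness, the form of Young's inequality and the arithmetic behind the constant in (12) can be written out as follows, with generic vectors \(a\) and \(b\):
$$ - 2\eta \langle a,b\rangle \leqslant 2\eta \|a\|\,\|b\| \leqslant 2\eta^{2}\|a\|^{2} + \frac{1}{2}\|b\|^{2},\qquad \text{since}\qquad \left( \sqrt 2 \,\eta \|a\| - \frac{1}{\sqrt 2 }\|b\| \right)^{2} \geqslant 0,$$
$$\eta \leqslant \frac{1}{8L}\quad \Longrightarrow \quad 2\eta^{2}L^{2} \leqslant \frac{2}{64} = \frac{1}{32}.$$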
Combining (11) and (12), we get
$$2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} [\langle F({{x}^{{k + 1}}}),{{x}^{{k + 1}}} - x\rangle + g({{x}^{{k + 1}}}) - g(x)]$$
$$ \leqslant \frac{{2 - \gamma }}{K}{{\left\| {{{x}^{0}} - x} \right\|}^{2}} - \frac{1}{K}\left( {\frac{1}{2} - \gamma } \right){{\left\| {{{x}^{K}} - x} \right\|}^{2}} - \frac{1}{K}{{\left\| {{{w}^{K}} - x} \right\|}^{2}}$$
$$\begin{gathered} + \,\,\frac{1}{{32K}}{\text{||}}{{x}^{{K - 1}}} - {{x}^{K}}{\text{|}}{{{\text{|}}}^{2}} + \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \text{[}{\text{||}}{{w}^{{k + 1}}} - x{\text{|}}{{{\text{|}}}^{2}} \\ \, - \gamma {\text{||}}{{x}^{{k + 1}}} - x{\text{|}}{{{\text{|}}}^{2}} - (1 - \gamma ){\text{||}}{{w}^{k}} - x{\text{|}}{{{\text{|}}}^{2}}] \\ \end{gathered} $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle $$
$$ - \frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}^{2}} - \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}}$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle $$
$$ \leqslant \frac{{2 - \gamma }}{K}{{\left\| {{{x}^{0}} - x} \right\|}^{2}}$$
$$ + \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ {{{{\left\| {{{w}^{{k + 1}}} - x} \right\|}}^{2}} - \gamma {{{\left\| {{{x}^{{k + 1}}} - x} \right\|}}^{2}} - (1 - \gamma ){{{\left\| {{{w}^{k}} - x} \right\|}}^{2}}} \right]$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle $$
$$\begin{gathered} - \frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {\kern 1pt} {\text{||}}{{w}^{k}} - {{x}^{{k + 1}}}{\text{|}}{{{\text{|}}}^{2}} - \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} \\ \, + \frac{1}{{32K}}{\text{||}}{{x}^{{K - 1}}} - {{x}^{K}}{\text{|}}{{{\text{|}}}^{2}} \\ \end{gathered} $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle .$$
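The last inequality drops two terms from the right-hand side. A sufficient condition for this, sketched here under the assumption that the theorem's choice of \(\gamma = p\) satisfies \(\gamma \leqslant \frac{1}{2}\) (the theorem statement is not reproduced in this appendix), is
$$ - \frac{1}{K}\left( \frac{1}{2} - \gamma \right)\|x^{K} - x\|^{2} - \frac{1}{K}\|w^{K} - x\|^{2} \leqslant 0\qquad \text{whenever}\qquad \gamma \leqslant \frac{1}{2}.$$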
Next, we use monotonicity of F, apply Jensen’s inequality for the convex function g and obtain
$$\begin{gathered} 2\eta \left[ {\left\langle {F(x),\frac{1}{K}\sum\limits_{k = 0}^{K - 1} {{x}^{{k + 1}}} - x} \right\rangle + g\left( {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} {{x}^{{k + 1}}}} \right) - g(x)} \right] \\ \leqslant \frac{{2 - \gamma }}{K}{{\left\| {{{x}^{0}} - x} \right\|}^{2}} \\ \end{gathered} $$
$$ + \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ {{{{\left\| {{{w}^{{k + 1}}} - x} \right\|}}^{2}} - \gamma {{{\left\| {{{x}^{{k + 1}}} - x} \right\|}}^{2}} - (1 - \gamma ){{{\left\| {{{w}^{k}} - x} \right\|}}^{2}}} \right]$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle $$
$$\begin{gathered} - \,\,\frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {\kern 1pt} {\text{||}}{{w}^{k}} - {{x}^{{k + 1}}}{\text{|}}{{{\text{|}}}^{2}} - \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {\kern 1pt} {\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}} \\ + \frac{1}{{32K}}{\text{||}}{{x}^{{K - 1}}} - {{x}^{K}}{\text{|}}{{{\text{|}}}^{2}} \\ \end{gathered} $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle .$$
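The left-hand side above is obtained from the two lower bounds below, stated as a sketch. Monotonicity of \(F\) gives, for every \(k\),
$$\langle F(x^{k + 1}) - F(x),x^{k + 1} - x\rangle \geqslant 0\quad \Longrightarrow \quad \langle F(x^{k + 1}),x^{k + 1} - x\rangle \geqslant \langle F(x),x^{k + 1} - x\rangle ,$$
and Jensen's inequality for the convex function \(g\) gives
$$\frac{1}{K}\sum\limits_{k = 0}^{K - 1} g(x^{k + 1}) \geqslant g\left( \frac{1}{K}\sum\limits_{k = 0}^{K - 1} x^{k + 1} \right).$$
Averaging the first bound over \(k\) and adding the second replaces the averaged left-hand side by its value at the averaged iterate.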
Introducing the notation \(\bar{x}^{K} = \frac{1}{K}\sum\limits_{k = 0}^{K - 1} x^{k + 1}\) and taking the maximum over \(\mathcal{C}\), we obtain
$$2\eta {\text{Gap}}(\bar{x}^{K}) \leqslant \mathop{\max}\limits_{x \in \mathcal{C}} \left\{ \frac{2 - \gamma }{K}\left\| x^{0} - x \right\|^{2} \right.$$
$$ + \,\,\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ {{{{\left\| {{{w}^{{k + 1}}} - x} \right\|}}^{2}} - \gamma {{{\left\| {{{x}^{{k + 1}}} - x} \right\|}}^{2}} - (1 - \gamma ){{{\left\| {{{w}^{k}} - x} \right\|}}^{2}}} \right]\,$$
$$\left. { + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle } \right\} + \frac{1}{{32K}}{{\left\| {{{x}^{{K - 1}}} - {{x}^{K}}} \right\|}^{2}}$$
$$ - \frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}^{2}} - \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}}$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle $$
$$ \leqslant \mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {\frac{{2 - \gamma }}{K}{{{\left\| {{{x}^{0}} - x} \right\|}}^{2}}} \right\}$$
$$\begin{gathered} + \mathop{\max}\limits_{x \in \mathcal{C}} \left\{ \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ \|w^{k + 1} - x\|^{2}\, - \,\gamma \|x^{k + 1} - x\|^{2} \right. \right. \\ \left. \left. - \,(1 - \gamma )\|w^{k} - x\|^{2} \right] \right\} \end{gathered} $$
$$\begin{gathered} + \,2\eta \mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle } \right\} \\ + \frac{1}{{32K}}{{\left\| {{{x}^{{K - 1}}} - {{x}^{K}}} \right\|}^{2}} \\ \end{gathered} $$
$$ - \frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}^{2}} - \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}}$$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle $$
$$ + \,2\eta \cdot \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle .\,$$
Here we also used that the maximum of a sum is not greater than the sum of the maxima. We then take the expectation and get
$$2\eta \mathbb{E}[{\text{Gap}}({{\bar {x}}^{K}})] \leqslant \mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {\frac{{2 - \gamma }}{K}{{{\left\| {{{x}^{0}} - x} \right\|}}^{2}}} \right\}} \right]$$
$$\begin{gathered} + \,\mathbb{E}\left[ \mathop{\max}\limits_{x \in \mathcal{C}} \left\{ \frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left[ \|w^{k + 1} - x\|^{2} - \gamma \|x^{k + 1} - x\|^{2} \right. \right. \right. \\ \left. \left. \left. - \,(1 - \gamma )\|w^{k} - x\|^{2} \right] \right\} \right] \end{gathered} $$
$$\begin{gathered} \, + 2\eta \mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{k}} - x} \right\rangle } \right\}} \right] \\ \, + \frac{1}{{32K}}\mathbb{E}[{\text{||}}{{x}^{{K - 1}}} - {{x}^{K}}{\text{|}}{{{\text{|}}}^{2}}] \\ \end{gathered} $$
$$\, - \mathbb{E}\left[ {\frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right]$$
$$ + \,2\eta \mathbb{E}\left[ {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle } \right]$$
$$ + \,2\eta \mathbb{E}\left[ {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle } \right].$$
With Lemma 3 for the second line of the previous estimate and Lemma 1 for the third line, we get
$$2\eta \mathbb{E}[{\text{Gap}}({{\bar {x}}^{K}})] \leqslant \mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {\frac{{2 - \gamma }}{K}{{{\left\| {{{x}^{0}} - x} \right\|}}^{2}}} \right\}} \right]$$
$$ + \mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {\frac{2}{K}{\text{||}}x - {{x}^{0}}{\text{|}}{{{\text{|}}}^{2}}} \right\} + \frac{{\gamma (1 - \gamma )}}{{2K}}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{x}^{{k + 1}}} - {{w}^{k}}{\text{|}}{{{\text{|}}}^{2}}]$$
$$ + \mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {\frac{1}{K}{\text{||}}x - {{x}^{0}}{\text{|}}{{{\text{|}}}^{2}}} \right\} + \frac{{{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}]$$
$$\begin{gathered} - \,\,\mathbb{E}\left[ {\frac{\gamma }{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right] \\ + \frac{1}{{32K}}\mathbb{E}[{\text{||}}{{x}^{{K - 1}}} - {{x}^{K}}{\text{|}}{{{\text{|}}}^{2}}] \\ \end{gathered} $$
$$ + \,2\eta \mathbb{E}\left[ {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle } \right]$$
$$ + \,2\eta \mathbb{E}\left[ {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle } \right]$$
$$ \leqslant \frac{4}{K}\mathbb{E}[\mathop {\max }\limits_{x \in \mathcal{C}} \{ {\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}\} ] + \frac{{{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}]$$
$$\begin{gathered} \, - \mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right] \\ + \frac{1}{{32K}}\mathbb{E}[{\text{||}}{{x}^{{K - 1}}} - {{x}^{K}}{\text{|}}{{{\text{|}}}^{2}}] \\ \end{gathered} $$
$$ + \,2\eta \mathbb{E}\left[ {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle } \right]$$
$$ + \,2\eta \mathbb{E}\left[ {\frac{1}{K}\sum\limits_{k = 0}^{K - 1} \left\langle {{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}} \right\rangle } \right].$$
(13)
According to Young’s inequality,
$$\begin{gathered} \mathbb{E}[2\eta \langle {{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}},{{x}^{{k + 1}}} - {{x}^{k}}\rangle ] \\ \leqslant 4{{\eta }^{2}}\mathbb{E}[{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}] + \frac{1}{4}\mathbb{E}[{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}], \\ \end{gathered} $$
(14)
and
$$\begin{gathered} \mathbb{E}[2\eta \langle F({{x}^{{k - 1}}}) - F({{x}^{k}}),{{x}^{{k + 1}}} - {{x}^{k}}\rangle ] \\ \leqslant 4{{\eta }^{2}}\mathbb{E}[{\text{||}}F({{x}^{{k - 1}}}) - F({{x}^{k}}){\text{|}}{{{\text{|}}}^{2}}] + \frac{1}{4}\mathbb{E}[{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}]. \\ \end{gathered} $$
(15)
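As a reminder of where the constants in (14) and (15) come from, both bounds are instances of Young's inequality with weight 4. The vectors \(a\) and \(b\) below are generic placeholders introduced only for this sketch, not notation from the paper:
$$2\langle \eta a,b\rangle \leqslant 4{{\left\| {\eta a} \right\|}^{2}} + \frac{1}{4}{{\left\| b \right\|}^{2}} = 4{{\eta }^{2}}{{\left\| a \right\|}^{2}} + \frac{1}{4}{{\left\| b \right\|}^{2}},$$
applied with \(b = {{x}^{{k + 1}}} - {{x}^{k}}\) in both cases, \(a = {{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}\) for (14), and \(a = F({{x}^{{k - 1}}}) - F({{x}^{k}})\) for (15).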
Combining (14), (15) with (13), we obtain
$$\begin{gathered} 2\eta \mathbb{E}[{\text{Gap}}({{{\bar {x}}}^{K}})] \leqslant \frac{4}{K}\mathbb{E}[\mathop {\max }\limits_{x \in \mathcal{C}} \{ {\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}\} ] \\ + \frac{{{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}] \\ \end{gathered} $$
$$\begin{gathered} - \,\,\mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right] \\ + \frac{1}{{32K}}\mathbb{E}[{\text{||}}{{x}^{{K - 1}}} - {{x}^{K}}{\text{|}}{{{\text{|}}}^{2}}] \\ \end{gathered} $$
$$\begin{gathered} + \frac{{4{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}F({{x}^{{k - 1}}}) - F({{x}^{k}}){\text{|}}{{{\text{|}}}^{2}}] \\ + \frac{1}{{4K}}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}] \\ \end{gathered} $$
$$\begin{gathered} + \frac{{4{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}] \\ + \frac{1}{{4K}}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}] \\ \end{gathered} $$
$$ \leqslant \frac{4}{K}\mathbb{E}[\mathop {\max }\limits_{x \in \mathcal{C}} \{ {\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}\} ] + \frac{{5{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}]$$
$$\begin{gathered} - \,\,\mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1{\text{/}}2 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right] \\ + \frac{1}{{32K}}\mathbb{E}\left[ {{{{\left\| {{{x}^{{K - 1}}} - {{x}^{K}}} \right\|}}^{2}}} \right] \\ \end{gathered} $$
$$ + \frac{{4{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}F({{x}^{{k - 1}}}) - F({{x}^{k}}){\text{|}}{{{\text{|}}}^{2}}].$$
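For readers checking the constants, the coefficients in the last bound can be traced as follows (a bookkeeping sketch only). The first identity collects the terms with \({\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}\), the second those with \({\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}\):
$$\frac{{{{\eta }^{2}}}}{K} + \frac{{4{{\eta }^{2}}}}{K} = \frac{{5{{\eta }^{2}}}}{K},\qquad - \frac{{1 - \gamma }}{K} + \frac{1}{{4K}} + \frac{1}{{4K}} = - \frac{{1{\text{/}}2 - \gamma }}{K}.$$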
L-Lipschitzness of F (Assumption 1) and a step size satisfying \(4{{\eta }^{2}}{{L}^{2}} \leqslant \frac{1}{4}\) (that is, \(\eta \leqslant \frac{1}{{4L}}\)) give
$$\begin{gathered} 2\eta \mathbb{E}[{\text{Gap}}({{{\bar {x}}}^{K}})] \leqslant \frac{4}{K}\mathbb{E}[\mathop {\max }\limits_{x \in \mathcal{C}} \{ {\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}\} ] \\ + \frac{{5{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}\left[ {{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}} \right] \\ \end{gathered} $$
$$\begin{gathered} \, - \mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1{\text{/}}2 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right] \\ + \frac{1}{{32K}}\mathbb{E}\left[ {{{{\left\| {{{x}^{{K - 1}}} - {{x}^{K}}} \right\|}}^{2}}} \right] \\ \end{gathered} $$
$$ + \,\,\frac{{4{{\eta }^{2}}{{L}^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{x}^{{k - 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}]$$
$$ \leqslant \frac{4}{K}\mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \{ {\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}\} } \right] + \frac{{5{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}\left[ {{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}} \right]$$
$$\begin{gathered} - \,\,\mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1{\text{/}}2 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right] \\ + \frac{1}{{32K}}\mathbb{E}\left[ {{{{\left\| {{{x}^{{K - 1}}} - {{x}^{K}}} \right\|}}^{2}}} \right] \\ \end{gathered} $$
$$ + \frac{1}{{4K}}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{x}^{{k - 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}]$$
$$ \leqslant \frac{4}{K}\mathbb{E}[\mathop {\max }\limits_{x \in \mathcal{C}} \{ {\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}\} ] + \frac{{5{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}\left[ {{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}} \right]$$
$$ - \,\,\mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1{\text{/}}2 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right]$$
$$ + \,\,\frac{1}{{4K}}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{x}^{{k + 1}}} - {{x}^{k}}{\text{|}}{{{\text{|}}}^{2}}]$$
$$ = \frac{4}{K}\mathbb{E}[\mathop {\max }\limits_{x \in \mathcal{C}} \{ {\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}\} ] + \frac{{5{{\eta }^{2}}}}{K}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{\mathbb{E}}_{k}}[{{\Delta }^{k}}] - {{\Delta }^{k}}{\text{|}}{{{\text{|}}}^{2}}]$$
$$ - \,\,\mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1{\text{/}}4 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right].$$
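The last two steps of this chain can be unpacked as follows (a sketch of the index bookkeeping). The \(k = 0\) term of the shifted sum vanishes, so
$$\sum\limits_{k = 0}^{K - 1} {{\left\| {{{x}^{{k - 1}}} - {{x}^{k}}} \right\|}^{2}} = \sum\limits_{k = 0}^{K - 2} {{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}},$$
and, since \(\frac{1}{{32}} \leqslant \frac{1}{4}\), the boundary term is absorbed into the sum:
$$\frac{1}{{4K}}\sum\limits_{k = 0}^{K - 2} \mathbb{E}{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}} + \frac{1}{{32K}}\mathbb{E}{{\left\| {{{x}^{{K - 1}}} - {{x}^{K}}} \right\|}^{2}} \leqslant \frac{1}{{4K}}\sum\limits_{k = 0}^{K - 1} \mathbb{E}{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}^{2}}.$$
The final equality then uses \(\left( {\frac{1}{2} - \gamma } \right) - \frac{1}{4} = \frac{1}{4} - \gamma \).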
Here we also used the initialization of Algorithm 1 with \({{x}^{{ - 1}}} = {{x}^{0}}\). Applying Lemma 2, we obtain
$$2\eta \mathbb{E}[{\text{Gap}}({{\bar {x}}^{K}})] \leqslant \frac{4}{K}\mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {{\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}} \right\}} \right]$$
$$ + \,\,\frac{{10{{\eta }^{2}}{{{\overline L }}^{2}}}}{{bK}}\sum\limits_{k = 0}^{K - 1} \mathbb{E}[{\text{||}}{{x}^{k}} - {{w}^{{k - 1}}}{\text{|}}{{{\text{|}}}^{2}} + {{\left\| {{{x}^{k}} - {{x}^{{k - 1}}}} \right\|}^{2}}]$$
$$ - \,\,\mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1{\text{/}}4 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right]$$
$$ \leqslant \frac{4}{K}\mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {{\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}} \right\}} \right]$$
$$ + \,\,\frac{{10{{\eta }^{2}}{{{\overline L }}^{2}}}}{{bK}}\sum\limits_{k = 0}^{K - 1} \mathbb{E}\left[ {{\text{||}}{{x}^{{k + 1}}} - {{w}^{k}}{\text{|}}{{{\text{|}}}^{2}} + {{{\left\| {{{x}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}}} \right]$$
$$ - \,\,\mathbb{E}\left[ {\frac{\gamma }{{2K}}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}} + \frac{{1{\text{/}}4 - \gamma }}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right]$$
$$ \leqslant \frac{4}{K}\mathbb{E}\left[ {\mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {{\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}} \right\}} \right]$$
$$\begin{gathered} - \,\,\mathbb{E}\left[ {\left( {\frac{\gamma }{2} - \frac{{10{{\eta }^{2}}{{{\overline L }}^{2}}}}{b}} \right)\frac{1}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{w}^{k}} - {{x}^{{k + 1}}}} \right\|}}^{2}}} \right. \\ \left. { + \left( {\frac{1}{4} - \gamma - \frac{{10{{\eta }^{2}}{{{\overline L }}^{2}}}}{b}} \right)\frac{1}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right]. \\ \end{gathered} $$
Here we again used the initialization of Algorithm 1 with \({{w}^{{ - 1}}} = {{x}^{{ - 1}}} = {{x}^{0}}\). The choice of \(\eta \leqslant \frac{{\sqrt {\gamma b} }}{{8\bar {L}}}\) and \(0 < \gamma \leqslant \frac{1}{{16}}\) gives
$$2\eta \mathbb{E}[{\text{Gap}}({{\bar {x}}^{K}})] \leqslant \frac{4}{K}\mathbb{E}[\mathop {\max }\limits_{x \in \mathcal{C}} \{ {\text{||}}{{x}^{0}} - x{\text{|}}{{{\text{|}}}^{2}}\} ]$$
$$ - \,\,\mathbb{E}\left[ {\left( {\frac{1}{{12}} - \gamma } \right)\frac{1}{K}\sum\limits_{k = 0}^{K - 1} {{{\left\| {{{x}^{{k + 1}}} - {{x}^{k}}} \right\|}}^{2}}} \right]$$
$$ \leqslant \frac{4}{K}\mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {{{{\left\| {{{x}^{0}} - x} \right\|}}^{2}}} \right\}.$$
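The arithmetic behind this last step can be sketched as follows, assuming only the stated bounds \(\eta \leqslant \frac{{\sqrt {\gamma b} }}{{8\overline L }}\) and \(0 < \gamma \leqslant \frac{1}{{16}}\). The first coefficient is non-negative, so its term can be dropped:
$$\frac{{10{{\eta }^{2}}{{{\overline L }}^{2}}}}{b} \leqslant \frac{{10}}{b} \cdot \frac{{\gamma b}}{{64}} = \frac{{5\gamma }}{{32}},\qquad \frac{\gamma }{2} - \frac{{5\gamma }}{{32}} = \frac{{11\gamma }}{{32}} \geqslant 0.$$
The second coefficient is bounded below by a positive constant, so the remaining subtracted term is non-negative and can be dropped as well:
$$\frac{1}{4} - \gamma - \frac{{5\gamma }}{{32}} \geqslant \frac{1}{4} - \frac{5}{{512}} - \gamma \geqslant \frac{1}{{12}} - \gamma \geqslant \frac{1}{{12}} - \frac{1}{{16}} = \frac{1}{{48}} > 0.$$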
Dividing both sides by \(2\eta \), we have
$$\mathbb{E}[{\text{Gap}}({{\bar {x}}^{K}})] \leqslant \frac{2}{{\eta K}}\mathop {\max }\limits_{x \in \mathcal{C}} \left\{ {{{{\left\| {{{x}^{0}} - x} \right\|}}^{2}}} \right\}.$$
Substituting \(\eta \) from the conditions of the theorem and setting \(\gamma = p\) completes the proof.