
Federated Multi-task Graph Learning

Published: 11 June 2022

Abstract

Distributed processing and analysis of large-scale graph data remain challenging because of the high-level discrepancy among graphs. This study investigates a novel subproblem: distributed multi-task learning on graphs, which jointly learns multiple analysis tasks from decentralized graphs. We propose a federated multi-task graph learning (FMTGL) framework to solve the problem within a privacy-preserving and scalable scheme. Its core is an innovative data-fusion mechanism and a low-latency distributed optimization method. The former captures multi-source data relatedness and generates universal task representations for local task analysis. The latter enables quick updates of our framework through gradient sparsification and tree-based aggregation. As a theoretical result, the proposed optimization method has a convergence rate that interpolates between \(\mathcal {O}(1/T)\) and \(\mathcal {O}(1/\sqrt {T})\), up to logarithmic terms. Unlike previous studies, our work analyzes the convergence behavior under adaptive stepsize selection and a non-convex assumption. Experimental results on three graph datasets verify the effectiveness and scalability of FMTGL.

A Appendix

A.2 Parameter Study

We investigate the model configuration on the DBLP dataset in this section. The first two hyperparameters are the number of D2TGNN layers \(L\) and the node representation dimension of D2TGNN \(d\). We summarize the results in Table 6, which indicates that the configuration \(L=3,d=256\) is a good setting: FMTGL performs well under this configuration while remaining lightweight.
d     Task                MAP(%)                  MF1(%)
                          L=1    L=3    L=5      L=1    L=3    L=5
128   AI                  88.99  88.91  89.34    83.81  84.42  84.10
128   System              91.10  90.97  91.16    85.93  85.99  86.48
128   Theory              90.44  90.45  91.10    88.31  87.92  88.92
128   Interdisciplinary   91.23  90.94  91.20    85.26  85.38  85.60
256   AI                  90.60  90.49  90.49    85.01  85.09  85.17
256   System              92.56  92.72  92.67    87.41  87.62  87.34
256   Theory              91.72  91.91  91.61    89.24  89.28  89.05
256   Interdisciplinary   92.54  92.27  92.81    86.49  86.16  86.75
512   AI                  89.66  89.81  90.03    84.63  84.47  84.35
512   System              92.00  91.71  91.85    87.09  87.05  87.05
512   Theory              91.28  91.09  91.36    88.73  88.50  88.76
512   Interdisciplinary   91.70  92.09  91.88    86.22  86.24  85.95
Table 6. Results for Selecting the Number of D2TGNN Layers \(L\) and the Representation Dimension \(d\)
We then discuss the selection of the number of support vectors \(h\) used to construct the fusion space and the reduction dimension \(c\) of the node representation used to compute the adjacency correction term. The results are summarized in Table 7, and we find that our method is robust to the selection of \(h\) and \(c\). Moreover, the results indicate that the optimal configuration is \(h=8,c=10\).
h     Task                MAP(%)                  MF1(%)
                          c=5    c=10   c=20     c=5    c=10   c=20
4     AI                  90.02  90.19  89.84    84.82  85.00  84.60
4     System              91.96  92.05  92.19    86.97  87.31  87.31
4     Theory              91.63  91.44  91.31    89.01  88.79  89.13
4     Interdisciplinary   92.01  91.95  91.78    86.03  86.35  85.92
8     AI                  90.06  89.82  89.83    84.60  84.52  84.71
8     System              91.92  92.33  91.90    87.07  87.33  87.28
8     Theory              91.21  91.37  91.45    88.65  88.71  89.12
8     Interdisciplinary   91.88  92.20  91.83    86.14  86.44  86.04
16    AI                  90.12  89.96  89.44    84.64  84.86  84.01
16    System              91.92  91.71  91.77    86.95  86.56  86.61
16    Theory              91.52  91.62  91.30    89.03  89.04  88.88
16    Interdisciplinary   91.42  92.10  91.76    86.10  86.36  85.84
Table 7. Results for Selecting the Reduction Dimension \(c\) and the Number of Support Vectors \(h\)

A.3 Theoretical Analysis

A.3.1 Proof of Lemma 1.

As \(f(\mathrm{W})=\frac{1}{m}\Sigma _{i=1}^ml_i(\mathrm{w}_i)\), we have:
\begin{equation} \begin{array}{ll} \displaystyle ||\nabla f(\mathrm{W}_1)-\nabla f(\mathrm{W}_2)||^2 &=\ \frac{1}{m^2}\Bigg \Vert \sum _{i=1}^m\nabla l_i(\mathrm{w}_i^1)-\nabla l_i(\mathrm{w}_i^2)\Bigg \Vert ^2\\ &\displaystyle \le \ \frac{1}{m}\sum _{i=1}^m||\nabla l_i(\mathrm{w}_i^1)-\nabla l_i(\mathrm{w}_i^2)||^2\\ &\displaystyle \le \ \frac{L^2}{m}\sum _{i=1}^m||\mathrm{w}_i^1-\mathrm{w}_i^2||^2=\frac{L^2}{m}||\mathrm{W}_1-\mathrm{W}_2||^2. \end{array} \end{equation}
(23)
Thus, we derive Lemma 1.
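The only nontrivial step above is \(||\frac{1}{m}\Sigma _{i=1}^m v_i||^2\le \frac{1}{m}\Sigma _{i=1}^m||v_i||^2\), i.e., convexity of the squared norm. As an illustrative numerical check (our own sketch, not part of the proof):

```python
import numpy as np

# Illustrative check of ||(1/m) sum_i v_i||^2 <= (1/m) sum_i ||v_i||^2,
# the step that turns (1/m^2)||sum_i (grad l_i(w_i^1) - grad l_i(w_i^2))||^2
# into the averaged per-task bound used in Equation (23).
rng = np.random.default_rng(0)
m, d = 8, 100
V = rng.normal(size=(m, d))            # stand-ins for per-task gradient differences
lhs = np.linalg.norm(V.sum(axis=0) / m) ** 2
rhs = np.mean(np.linalg.norm(V, axis=1) ** 2)
assert lhs <= rhs                      # holds for any choice of vectors
```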

A.3.2 Proof of Lemma 2.

Since \(f(\mathrm{W})=\frac{1}{m}\Sigma _{i=1}^ml_i(\mathrm{w}_i)\), we have:
\begin{equation} \begin{array}{ll} \displaystyle ||\nabla f(\mathrm{W})||^2&=\ \Bigg \Vert \frac{1}{m}\sum _{i=1}^m\nabla l_i(w_i)\Bigg \Vert ^2\\ &=\ \displaystyle \Bigg \Vert \frac{1}{m}\sum _{i=1}^m\Bigg (\frac{\partial l_i}{\partial w}(w^i),\dots ,\frac{\partial l_i}{\partial w^i}(w),\dots \Bigg)\Bigg \Vert ^2\\ \displaystyle &=\ \frac{1}{m^2}\Bigg \Vert \Bigg (\sum _{i=1}^m\frac{\partial l_i}{\partial w}(w^i),\dots ,\frac{\partial l_i}{\partial w^i}(w),\dots \Bigg)\Bigg \Vert ^2\\ \displaystyle &=\ \frac{1}{m^2}\Bigg \Vert \sum _{i=1}^m\frac{\partial l_i}{\partial w}(w^i)\Bigg \Vert ^2+\frac{1}{m^2}\sum _{i=1}^m\Bigg \Vert \frac{\partial l_i}{\partial w^i}(w)\Bigg \Vert ^2\\ \displaystyle &=\ ||\nabla f(w;w^1,\dots ,w^m)||^2+\sum _{i=1}^m\Bigg \Vert \frac{1}{m}\nabla l_i(w^i;w)\Bigg \Vert ^2. \end{array} \end{equation}
(24)

A.3.3 Proof of Lemma 3.

According to H(1), we have:
\begin{equation} \begin{array}{ll} & ||\nabla l_i((\mathrm{w},\mathrm{w}_x^i))-\nabla l_i((\mathrm{w},\mathrm{w}_y^i))||^2\\ &\le \ L^2||(\mathrm{w},\mathrm{w}_x^i)-(\mathrm{w},\mathrm{w}_y^i)||^2\\ &=\ L^2||\mathrm{w}_x^i-\mathrm{w}_y^i||^2. \end{array} \end{equation}
(25)
Thus, we derive the first inequality. Since the second equation of Lemma 3 is trivial, we prove the third inequality here:
\begin{equation} \begin{aligned}&||\nabla l_i (\mathrm{w},\mathrm{w}^i)-G^i(\mathrm{w},\mathrm{w}^i,\xi)||^2\\ &=\ ||\nabla l_i(\mathrm{w};\mathrm{w}^i)-G^i(\mathrm{w},\xi ;\mathrm{w}^i)||^2 +||\nabla l_i(\mathrm{w}^i;\mathrm{w})-G^i(\mathrm{w}^i,\xi ;\mathrm{w})||^2\\ &\ge \ ||\nabla l_i(\mathrm{w}^i;\mathrm{w})-G^i(\mathrm{w}^i,\xi ;\mathrm{w})||^2. \end{aligned} \end{equation}
(26)
Because \(e^{\frac{x^2}{\sigma ^2}}\) is an increasing function, we derive the third inequality with the expectation operator and H(3).

A.3.4 Corollary of Hypothesis 4.

With H(4), we further derive the following lemma.
Lemma 4.
\(\forall x^i\in \mathrm{R}^d,i=1,\dots ,m\), and \(0\lt k\le d\), it holds that:
\begin{equation} \Bigg \Vert \sum _{i=1}^m x^i-gTopK_{i=1}^mx^i\Bigg \Vert ^2\le \left(1-\frac{k}{d}\right)\Bigg \Vert \sum _{i=1}^m x^i\Bigg \Vert ^2. \end{equation}
(27)
Proof. We have \(\mathrm{E}_{\xi }||RandomK(x,\xi)||^2=\frac{k}{d}||x||^2\); according to H(4), we obtain Equation (27).
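As an illustrative check of Lemma 4 (our own sketch, not the authors' code): keeping the \(k\) largest-magnitude coordinates of the summed vector loses at most a \(1-\frac{k}{d}\) fraction of its squared norm.

```python
import numpy as np

# Illustrative check of Equation (27): the global top-k of the summed vector
# retains at least a k/d fraction of its squared norm.
rng = np.random.default_rng(0)
m, d, k = 8, 100, 10
s = rng.normal(size=(m, d)).sum(axis=0)   # sum_i x^i
gtopk = np.zeros(d)
idx = np.argsort(np.abs(s))[-k:]          # indices of the k largest |entries|
gtopk[idx] = s[idx]                       # gTopK applied to the sum
assert np.linalg.norm(s - gtopk) ** 2 <= (1 - k / d) * np.linalg.norm(s) ** 2
```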

A.3.5 Proof of Theorem 2.

First, we define an auxiliary random variable \(x_t\) at iteration \(t\):
\begin{equation} x_{t+1}=x_t-\alpha _tG_t(\mathrm{w}_t,\xi _t), \end{equation}
(28)
where \(G_t(\mathrm{w}_t,\xi _t)=\frac{1}{m}\Sigma _{i=1}^mG^i_t(\mathrm{w}_t,\xi _t^i)\) and \(x_0=0\). The difference between \(x_t\) and the model parameter \(\mathrm{w}_t\) is defined as \(\epsilon _t = \mathrm{w}_t-x_t\). According to Algorithm 1, we can verify the following equation by induction:
\begin{equation} \epsilon _t = \frac{1}{m}\sum _{i=1}^m\epsilon _t^i. \end{equation}
(29)
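Algorithm 1 is not reproduced in this appendix. As an illustrative sketch of the update it describes (our own NumPy code, assuming gTopK keeps the \(k\) globally largest-magnitude coordinates of the aggregated vector; all names are ours):

```python
import numpy as np

def gtopk_sgd_step(w, grads, residuals, alpha, k):
    """One sparsified step (sketch): apply only the k globally largest
    coordinates of the aggregated corrected update; feed the rest back
    into the per-worker residuals eps_t^i."""
    m, d = grads.shape
    v = alpha * grads + residuals            # per-worker corrected updates
    s = v.sum(axis=0)                        # aggregation (tree-based in FMTGL)
    mask = np.zeros(d, dtype=bool)
    mask[np.argsort(np.abs(s))[-k:]] = True  # global top-k coordinate selection
    w_new = w - (s * mask) / m               # sparsified averaged update
    new_residuals = v * ~mask                # unsent coordinates stay local
    return w_new, new_residuals
```

With this bookkeeping, \(w_{t+1}-x_{t+1}\) equals the average of the returned per-worker residuals, which is exactly Equation (29).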
The following lemma gives an estimate about the second moment of \(\epsilon _t\):
Lemma 5.
For any iteration \(t\),
\begin{equation} \mathrm{E}[||\epsilon _t||^2]\le \left(1+\frac{1}{\eta }\right)\gamma \mathrm{E}[||\alpha _tG_t(\mathrm{w}_t,\xi _t)||^2]+\gamma (1+\eta)\mathrm{E}[||\epsilon _{t-1}||^2], \end{equation}
(30)
where \(\gamma =1-\frac{k}{d},0\lt k\le d\). \(\eta \gt 0\) is an arbitrarily selected constant.
Lemma 6.
If \(x\ge 0\) and \(x\le C(A+Bx)^{\epsilon +1/2}\), then,
\begin{equation} x\le \max \big (\big [C(2B)^{1/2+\epsilon }\big ]^{\frac{1}{1/2-\epsilon }},C(2A)^{1/2+\epsilon }\big). \end{equation}
(31)
Our analysis is mainly based on Lemma 7. However, there is a large gap between Theorem 2 and Lemma 7; in Lemmas 8–10, we derive several sharper estimates to bridge it.
Lemma 7.
Assuming H(1), H(2), then for any \(T\ge 1\), the iterates of MTgTop-k S-SGD satisfy the following estimate:
\begin{equation} \begin{aligned}\frac{3}{4}\mathrm{E}\left[\sum _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2\right] \le f(w_1)-f^*+\sum _{t=1}^T\alpha _t\tilde{L}^2\mathrm{E}[||\epsilon _t||^2]+\frac{\tilde{L}}{2}\mathrm{E}\big [||\alpha _tG_t(\mathrm{w}_t,\xi _t)||^2\big ]. \end{aligned} \end{equation}
(32)
Lemma 8.
Assuming H(3), for any iteration \(T\), we have the following estimate:
\begin{equation} \mathrm{E}\left[\max _{1\le t\le T}\Bigg \Vert \frac{1}{m}\sum _{i=1}^m\nabla l_i(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _t)\Bigg \Vert ^2\right]\le \sigma ^2\left(\frac{1}{m}+\ln T\right). \end{equation}
(33)
Lemma 9.
Let \(\alpha _t\) be defined as in Equation (10); then the following bounds hold for the gradients up to iteration \(T\):
\begin{equation} \begin{aligned}\mathrm{E}\left[\sum _{t=1}^T\alpha _t^2||G_t(\mathrm{w}_t,\xi _t)||^2\right] &\le \frac{\alpha ^2}{2\epsilon \beta ^{2\epsilon }}+2(\alpha _1-\alpha _{T+1})\mathrm{E}[\max _{1\le t\le T}\alpha _t||G_t(\mathrm{w}_t,\xi _t)||^2]\\ \mathrm{E}\left[\sum _{t=1}^T\alpha _t^3||G_t(\mathrm{w}_t,\xi _t)||^2\right] &\le \frac{2\alpha ^3}{(6\epsilon +1)\beta ^{3\epsilon +1/2}}+(\alpha _1-\alpha _{T+1})\frac{\alpha ^2}{2\epsilon \beta ^{2\epsilon }}\\ &\quad \ +\ (\alpha _1^2-\alpha _{T+1}^2)\mathrm{E}[\max _{1\le t\le T}\alpha _t||G_t(\mathrm{w}_t,\xi _t)||^2]. \end{aligned} \end{equation}
(34)
For simplicity, we denote \(\mathrm{E}[\Sigma _{t=1}^T\alpha _t^2||G_t(\mathrm{w}_t,\xi _t)||^2]\) by \(Q^2_T\) and \(\mathrm{E}[\Sigma _{t=1}^T\alpha _t^3||G_t(\mathrm{w}_t,\xi _t)||^2]\) by \(Q^3_T\). Another essential bound for our analysis is given by the following lemma.
Lemma 10.
For any iteration \(T\):
\begin{equation} \mathrm{E}\left[\sum _{t=1}^T\alpha _t||\epsilon _t||^2\right]\le \frac{1}{\eta }Q_T^3\sum _{t=1}^{T-1}((1+\eta)\gamma)^t. \end{equation}
(35)
As before, \(\gamma = 1-\frac{k}{d}\) with \(0\lt k\le d\), and \(\eta \gt 0\). We can further simplify the r.h.s.: define \(\gamma (1+\eta)=\tau\) and select \(\eta\) such that \(\tau \lt 1\); then:
\begin{equation} \mathrm{E}\left[\sum _{t=1}^T\alpha _t||\epsilon _t||^2\right]\le \frac{\tau Q_T^3}{\eta (1-\tau)}. \end{equation}
(36)
Proof. According to Lemma 7:
\begin{equation} \begin{aligned}\frac{3}{4} \mathrm{E}\Bigg [\sum _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2\Bigg ]\le f(w_1)-f^*+\sum _{t=1}^T\alpha _t\tilde{L}^2\mathrm{E}[||\epsilon _t||^2] +\frac{\tilde{L}}{2}\mathrm{E}[||\alpha _tG_t(\mathrm{w}_t,\xi _t)||^2]. \end{aligned} \end{equation}
(37)
Using Lemma 10, it holds that:
\begin{equation} \frac{3}{4}\mathrm{E}\Bigg [\sum _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2\Bigg ]\le f(w_1)-f^*+\frac{\tau \tilde{L}^2Q_T^3}{\eta (1-\tau)}+\frac{\tilde{L}Q_T^2}{2}. \end{equation}
(38)
Expanding the r.h.s. with Lemma 9, we derive:
\begin{equation} \begin{aligned} \frac{\tau \tilde{L}^2Q_T^3}{\eta (1-\tau)}+\frac{\tilde{L}Q_T^2}{2}\le & \left(\frac{\tau \alpha _1^2\tilde{L}^2}{\eta (1-\tau)}+\alpha _1\tilde{L}\right)\mathrm{E}[\max _{1\le t \le T}\alpha _t||G_t(\mathrm{w}_t,\xi _t)||^2] +K_1. \end{aligned} \end{equation}
(39)
Further, according to Lemma 8:
\begin{equation} \begin{aligned}\mathrm{E}\left[\max _{1\le t\le T}\alpha _t||G_t(\mathrm{w}_t,\xi _t)||^2\right]&\le \mathrm{E}[\max _{1\le t\le T}\alpha _t(2||G_t(\mathrm{w}_t,\xi _t)-\nabla f(\mathrm{w}_t)||^2 +2||\nabla f(\mathrm{w}_t)||^2)]\\ &\le \alpha _1 \sigma ^2\left(\frac{1}{m}+\ln T\right) +\mathrm{E}\left[\max _{1\le t\le T}\alpha _t||\nabla f(\mathrm{w}_t)||^2\right]\\ &\le \alpha _1 \sigma ^2\left(\frac{1}{m}+\ln T\right) +\mathrm{E}\left[\sum _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2\right]. \end{aligned} \end{equation}
(40)
Combining Equations (38), (39), and (40), we obtain:
\begin{equation} \begin{aligned} &\frac{3}{4}\mathrm{E}\Bigg [\sum _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2\Bigg ]\le f(w_1)-f^*+ \Bigg (\frac{\tau \alpha _1^2\tilde{L}^2}{\eta (1-\tau)}+\alpha _1\tilde{L}\Bigg)\Bigg (\alpha _1\sigma ^2\Bigg (\frac{1}{m}+\ln T\Bigg)+\\ &\mathrm{E}\left[\sum _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2\right]\Bigg)+K_1. \end{aligned} \end{equation}
(41)
where \(K_1\) equals \(\frac{2\tau \tilde{L}^2\alpha ^3}{\eta (1-\tau)(6\epsilon +1)\beta ^{3\epsilon +1/2}}+\frac{\tau \alpha _1\alpha ^2\tilde{L}^2}{2\eta (1-\tau)\epsilon \beta ^{2\epsilon }}+\frac{\alpha ^2\tilde{L}}{4\epsilon \beta ^{2\epsilon }}\) and \(\alpha _1 = \frac{\alpha }{\beta ^{\epsilon +1/2}}\). We rearrange Equation (41) by moving \(\mathrm{E}[\Sigma _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2]\) to the l.h.s. and let \(\kappa\) denote the r.h.s. of the result:
\begin{equation} \mathrm{E}\Bigg [\sum _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2\Bigg ]\le \kappa . \end{equation}
(42)
For simplicity, we define \(\Delta = \Sigma _{t=1}^T||\nabla f(\mathrm{w}_t)||^2\). According to Hölder's inequality, we can lower-bound the l.h.s. of Equation (42):
\begin{equation} \begin{aligned}\mathrm{E}\Bigg [\sum _{t=1}^T\alpha _t||\nabla f(\mathrm{w}_t)||^2\Bigg ]&\ge \mathrm{E}[\alpha _T\Delta ]\\ &\ge \frac{(\mathrm{E}[\Delta ^{1/2-\epsilon }])^{\frac{1}{1/2-\epsilon }}}{(\mathrm{E}[(\frac{1}{\alpha _T})^{\frac{1/2-\epsilon }{1/2+\epsilon }}])^{\frac{1/2+\epsilon }{1/2-\epsilon }}}. \end{aligned} \end{equation}
(43)
For \(\frac{1}{\alpha _T}\), it holds that:
\begin{equation} \begin{aligned}\frac{1}{\alpha _T} &= \frac{1}{\alpha }\Bigg (\beta +\sum _{t=1}^{T-1}||G_t(\mathrm{w}_t)||^2\Bigg)^{1/2+\epsilon }\\ &\le \frac{1}{\alpha }\Bigg (\beta +2\sum _{t=1}^{T-1}(||\nabla f(\mathrm{w}_t)-G_t(\mathrm{w}_t)||^2+||\nabla f(\mathrm{w}_t)||^2)\Bigg)^{1/2+\epsilon }. \end{aligned} \end{equation}
(44)
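Equation (44) shows that the stepsize of Equation (10) has an AdaGrad-Norm form, \(\alpha _t=\alpha /(\beta +\Sigma _{i=1}^{t-1}||G_i(\mathrm{w}_i,\xi _i)||^2)^{1/2+\epsilon }\). A minimal sketch of this schedule under that reading (our own illustration, not the authors' code):

```python
import numpy as np

def adaptive_stepsizes(grad_norms_sq, alpha=0.1, beta=1.0, eps=0.1):
    """Stepsize schedule implied by Equation (44):
    alpha_t = alpha / (beta + sum_{i<t} ||G_i||^2)^(1/2 + eps)."""
    acc = beta + np.concatenate(([0.0], np.cumsum(grad_norms_sq)[:-1]))
    return alpha / acc ** (0.5 + eps)

# The schedule is non-increasing, the monotonicity used around Equation (64).
steps = adaptive_stepsizes(np.random.default_rng(0).exponential(size=50))
assert np.all(np.diff(steps) <= 0)
```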
Combining Equations (42), (43), and (44) with H(3), we derive:
\begin{equation} \begin{aligned}(\mathrm{E}[\Delta ^{1/2-\epsilon }])^{\frac{1}{1/2-\epsilon }}&\le \kappa \Bigg (\mathrm{E}\Bigg [(\frac{1}{\alpha _T})^{\frac{1/2-\epsilon }{1/2+\epsilon }}\Bigg ]\Bigg)^{\frac{1/2+\epsilon }{1/2-\epsilon }} \le \kappa \Bigg (\mathrm{E}\Bigg [\Bigg (\beta +2\sum _{t=1}^{T-1}||\nabla f(w_t)-G(w_t)||^2\Bigg)^{1/2-\epsilon }\Bigg ]\\ &\quad \ +\ 2\mathrm{E}\Bigg [\Bigg (\sum _{t=1}^{T-1}||\nabla f(w_t)||^2\Bigg)^{1/2-\epsilon }\Bigg ]\Bigg)^{\frac{1/2+\epsilon }{1/2-\epsilon }}\\ &\le \kappa ((\beta +2T\sigma ^2)^{1/2-\epsilon }+2\mathrm{E}[\Delta ^{1/2-\epsilon }])^{\frac{1/2+\epsilon }{1/2-\epsilon }}. \end{aligned} \end{equation}
(45)
Noticing the following lower bound for \(\mathrm{E}[\Delta ^{1/2-\epsilon }]\):
\begin{equation} T^{1/2-\epsilon }\mathrm{E}[\min _{1\le t\le T}||\nabla f(\mathrm{w}_t)||^{1-2\epsilon }]\le \mathrm{E}[\Delta ^{1/2-\epsilon }], \end{equation}
(46)
and applying Lemma 6, we derive Theorem 2.

A.3.6 Proof of Lemma 5.

Note that \(\epsilon _t = \mathrm{w}_t-x_t\):
\begin{equation} \begin{aligned}\mathrm{E}[||w_{t+1}-x_{t+1}||^2] &=\ \mathrm{E}\Bigg [\Bigg \Vert \frac{1}{m}\sum _{i}^m(\alpha _tG_t^i(\mathrm{w}_t,\xi _t)+\epsilon _t^i)+\mathrm{w}_t-x_t-\epsilon _t\\ &\quad \ -\ \frac{1}{m}gTopK_{i=1}^m(\alpha _tG_t^i(\mathrm{w}_t,\xi _t)+\epsilon _t^i)\Bigg \Vert ^2\Bigg ]\\ &=\ \mathrm{E}\Bigg [\Bigg \Vert \frac{1}{m}\sum _{i}^m(\alpha _tG_t^i(\mathrm{w}_t,\xi _t)+\epsilon _t^i)\\ &\quad \ -\ \frac{1}{m}gTopK_{i=1}^m(\alpha _tG_t^i(\mathrm{w}_t,\xi _t)+\epsilon _t^i)\Bigg \Vert ^2\Bigg ]\\ &\le \ \gamma \mathrm{E}\Bigg [\Bigg \Vert \frac{1}{m}\sum _{i}^m(\alpha _tG_t^i(\mathrm{w}_t,\xi _t)+\epsilon _t^i)\Bigg \Vert ^2\Bigg ]\\ &=\ \gamma \mathrm{E}\Bigg [\Bigg \Vert \alpha _tG_t(\mathrm{w}_t,\xi _t)+\mathrm{w}_t-x_t\Bigg \Vert ^2\Bigg ]\\ &\le \ \gamma \left(1+\frac{1}{\eta }\right)\mathrm{E}[||\alpha _tG_t(\mathrm{w}_t,\xi _t)||^2]+\gamma (1+\eta)\mathrm{E}[||\mathrm{w}_t-x_t||^2], \end{aligned} \end{equation}
(47)
and substituting \(\epsilon _{t+1}\) for \(w_{t+1}-x_{t+1}\), we get the result.

A.3.7 Proof of Lemma 6.

If \(A\le Bx\), then \(x\le C(2Bx)^{\frac{1}{2}+\epsilon }\); moving \(x\) to the l.h.s., we have \(x\le [C(2B)^{\frac{1}{2}+\epsilon }]^{\frac{1}{1/2-\epsilon }}\).
If \(A\gt Bx\), then \(x\lt C(2A)^{\frac{1}{2}+\epsilon }\). The proof is finished by taking the maximum of the two estimates.
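As an illustrative numerical check of Lemma 6 (our own sketch, not part of the proof), one can scan the feasible set of the hypothesis and compare its maximum with the claimed bound:

```python
import numpy as np

# Check of Lemma 6: the largest x satisfying x <= C(A + Bx)^(1/2+eps) stays
# below max([C(2B)^(1/2+eps)]^(1/(1/2-eps)), C(2A)^(1/2+eps)).
C, A, B, eps = 1.3, 2.0, 0.7, 0.1
x = np.linspace(0.0, 1e3, 2_000_000)
feasible = x[x <= C * (A + B * x) ** (0.5 + eps)]
bound = max((C * (2 * B) ** (0.5 + eps)) ** (1 / (0.5 - eps)),
            C * (2 * A) ** (0.5 + eps))
assert feasible.max() <= bound
```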

A.3.8 Proof of Lemma 7.

Under assumption H(1), we have
\begin{equation} f(x_{t+1})\le f(x_t)+\nabla f(x_t)^\mathrm{T}(x_{t+1}-x_t) +\frac{\tilde{L}}{2}||x_{t+1}-x_t||^2, \end{equation}
(48)
since \(x_{t+1}-x_t=-\alpha _tG_t(\mathrm{w}_t)\), we have
\begin{equation} \begin{aligned}&f(x_{t+1})\le f(x_t) +\alpha _t \nabla f(x_t)^\mathrm{T}(\nabla f(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _t)) -\alpha _t\nabla f(x_t)^\mathrm{T}\nabla f(\mathrm{w}_t)+\frac{\tilde{L}}{2}||\alpha _t G_t(\mathrm{w}_t)||^2, \end{aligned} \end{equation}
(49)
according to assumption H(2), we have
\begin{equation} \mathrm{E}[\alpha _t \nabla f(x_t)^\mathrm{T}(\nabla f(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _t))]=0, \end{equation}
(50)
besides, it holds that:
\begin{equation} \begin{aligned}-\alpha _t\nabla f(x_t)^\mathrm{T}\nabla f(\mathrm{w}_t) &= -\frac{\alpha _t}{2}||\nabla f(x_t)||^2-\frac{\alpha _t}{2}||\nabla f(\mathrm{w}_t)||^2 +\frac{\alpha _t}{2}||\nabla f(x_t)-\nabla f(\mathrm{w}_t)||^2. \end{aligned} \end{equation}
(51)
According to H(1):
\begin{equation} \begin{aligned}-\alpha _t\nabla f(x_t)^\mathrm{T}\nabla f(\mathrm{w}_t) & \le -\frac{\alpha _t}{2}||\nabla f(x_t)||^2-\frac{\alpha _t}{2}||\nabla f(\mathrm{w}_t)||^2 +\frac{\alpha _t\tilde{L}^2}{2}||\mathrm{w}_t-x_t||^2. \end{aligned} \end{equation}
(52)
Taking expectations in Equation (49) and applying Equations (50) and (52), we obtain:
\begin{equation} \begin{aligned}\mathrm{E}\Bigg [\frac{\alpha _t}{2}(||\nabla f(x_t)||^2+\tilde{L}^2||\mathrm{w}_t-x_t||^2)\Bigg ]&\le \mathrm{E}\Bigg [f(x_t)-f(x_{t+1}) -\frac{\alpha _t}{2}||\nabla f(\mathrm{w}_t)||^2+\alpha _t\tilde{L}^2||\mathrm{w}_t-x_t||^2\\ &\quad \ +\ \frac{\tilde{L}}{2}||\alpha _tG_t(\mathrm{w}_t,\xi _t)||^2\Bigg ]. \end{aligned} \end{equation}
(53)
Furthermore, we apply H(1) again and derive the following inequality:
\begin{equation} \frac{1}{2}||\nabla f(\mathrm{w}_t)||^2\le 2||\nabla f(x_t)||^2+2\tilde{L}^2||\mathrm{w}_t-x_t||^2. \end{equation}
(54)
After simplification, we have
\begin{equation} \begin{aligned}\frac{3}{4}\mathrm{E}[\alpha _t||\nabla f(\mathrm{w}_t)||^2]&\le \mathrm{E}[f(x_t)]-\mathrm{E}[f(x_{t+1})]+ \mathrm{E}[\alpha _t\tilde{L}^2||\mathrm{w}_t-x_t||^2]+\frac{\tilde{L}}{2}\mathrm{E}[||\alpha _tG_t(\mathrm{w}_t,\xi _t)||^2], \end{aligned} \end{equation}
(55)
Summing the above inequality over \(t=1,\ldots ,T\), we then prove Lemma 7 using the fact that \(f^*\le f(x)\) for all \(x\).

A.3.9 Proof of Lemma 8.

According to Jensen's inequality, we have
\begin{equation} \begin{aligned}&\exp \Bigg (\frac{\mathrm{E}[\max _{1\le t\le T}||\frac{1}{m}\sum _i^m\nabla l_i(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _t)||^2]}{\sigma ^2}\Bigg)\\ &\quad \ \le \ \mathrm{E}\Bigg [\exp \Bigg (\frac{\max _{1\le t\le T}\frac{1}{m^2}\sum _i^m||\nabla l_i(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _t)||^2}{\sigma ^2}\Bigg)\Bigg ]. \end{aligned} \end{equation}
(56)
Since \(\exp (\cdot)\) is an increasing function, the r.h.s. equals:
\begin{equation} \mathrm{E}\Bigg [\max _{1\le t\le T}\exp \Bigg (\frac{\sum _i^m||\nabla l_i(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _t)||^2}{m^2\sigma ^2}\Bigg)\Bigg ]. \end{equation}
(57)
Applying Jensen's inequality again, we find that Equation (57) is less than or equal to:
\begin{equation} \mathrm{E}\Bigg [\max _{1\le t\le T}\frac{1}{m}\sum _i^m\exp \Bigg (\frac{||\nabla l_i(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _t)||^2}{m\sigma ^2}\Bigg)\Bigg ]. \end{equation}
(58)
Equation (58) is less than or equal to:
\begin{equation} \begin{aligned} &\frac{1}{m}\sum _i^m\sum _t^T\mathrm{E}\Bigg [\exp \Bigg (\frac{||\nabla l_i(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _{t})||^2}{m\sigma ^2}\Bigg)\Bigg ]\\ &\quad \ =\ \frac{1}{m}\sum _i^m\sum _{t=1}^T\mathrm{E}\Bigg [\mathrm{E}_i\Bigg [\exp \Bigg (\frac{||\nabla l_i(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _{t})||^2}{m\sigma ^2}\Bigg)\Bigg ]\Bigg ]. \end{aligned} \end{equation}
(59)
Note that \(x^{\frac{1}{m}}\) is a concave function of \(x\); applying Jensen's inequality again, we find that Equation (59) is less than or equal to:
\begin{equation} \frac{1}{m}\sum _i^m\sum _{t=1}^T\mathrm{E}\Bigg [\Bigg (\mathrm{E}_i\Bigg [\exp \Bigg (\frac{||\nabla l_i(\mathrm{w}_t)-G_t(\mathrm{w}_t,\xi _t)||^2}{\sigma ^2}\Bigg)\Bigg ]\Bigg)^{\frac{1}{m}}\Bigg ]. \end{equation}
(60)
According to H(3), Equation (60) is less than or equal to \(Te^{\frac{1}{m}}\); taking logarithms in Equation (56) then yields the bound \(\sigma ^2(\frac{1}{m}+\ln T)\), which finishes the proof.

A.3.10 Proof of Lemma 9.

To prove Lemma 9, we first introduce another lemma:
Lemma 11.
Let \(a_0\gt 0, a_i\ge 0,i=1,\ldots ,T\) and \(\beta \gt 1\). Then,
\begin{equation} \sum _{t=1}^T\frac{a_t}{(a_0+\sum _{i=1}^{t}a_i)^\beta }\le \frac{1}{(\beta -1)a_0^{\beta -1}}. \end{equation}
(61)
We now turn to prove Lemma 11. Assuming that \(f:[0,+\infty)\rightarrow [0,+\infty)\) is a non-increasing function, we have:
\begin{equation} \sum _{t=1}^Ta_tf(a_0+\sum _{i=1}^t a_i)\le \int ^{\Sigma _{t=0}^Ta_t}_{a_0}f(x)dx. \end{equation}
(62)
Let \(s_t=\Sigma _{i=0}^ta_i\), and we obtain the following inequality:
\begin{equation} a_if(s_i)=\int _{s_{i-1}}^{s_i} f(s_i)dx\le \int _{s_{i-1}}^{s_i} f(x)dx. \end{equation}
(63)
Summing over \(i\) from 1 to \(T\), we obtain Equation (62). The proof of Lemma 11 is finished by applying Equation (62) with \(f(x)=x^{-\beta }\).
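As an illustrative numerical check of Lemma 11 (not part of the proof), the bound holds for an arbitrary nonnegative sequence:

```python
import numpy as np

# Check: sum_t a_t / (a_0 + sum_{i<=t} a_i)^beta <= 1 / ((beta-1) * a_0^(beta-1)).
rng = np.random.default_rng(0)
a0, beta = 0.5, 1.5
a = rng.exponential(size=1000)         # arbitrary nonnegative sequence a_1..a_T
s = a0 + np.cumsum(a)                  # a_0 + sum_{i=1}^t a_i
lhs = np.sum(a / s ** beta)
rhs = 1.0 / ((beta - 1) * a0 ** (beta - 1))
assert lhs <= rhs
```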
Proof. We now prove the first estimate of Lemma 9:
\begin{equation} \begin{aligned} \mathrm{E}\Bigg [\sum _{t=1}^T\alpha _t^2||G_t(\mathrm{w}_t,\xi _t)||^2\Bigg ]&=\mathrm{E}\Bigg [\sum _{t=1}^T\alpha _{t+1}^2||G_t(\mathrm{w}_t,\xi _t)||^2\\ &\quad +\ \sum _{t=1}^T||G_t(\mathrm{w}_t,\xi _t)||^2(\alpha _t^2-\alpha _{t+1}^2)\Bigg ]. \end{aligned} \end{equation}
(64)
Notice that \((\alpha _t^2-\alpha _{t+1}^2) = (\alpha _t+\alpha _{t+1})(\alpha _t-\alpha _{t+1})\) and that \(\lbrace \alpha _t\rbrace\) is decreasing in \(t\). Hence, Equation (64) is less than or equal to:
\begin{equation} \mathrm{E}\Bigg [\sum _{t=1}^T\alpha _{t+1}^2||G_t(\mathrm{w}_t,\xi _t)||^2+\sum _{t=1}^T2\alpha _t||G_t(\mathrm{w}_t,\xi _t)||^2(\alpha _t-\alpha _{t+1})\Bigg ]. \end{equation}
(65)
According to the definition of \(\alpha _t\),
\begin{equation} \sum _{t=1}^T\alpha _{t+1}^2||G_t(\mathrm{w}_t,\xi _t)||^2 = \sum _{t=1}^T\frac{\alpha ^2 ||G_t(\mathrm{w}_t,\xi _t)||^2}{(\beta +\sum _{i=1}^{t}||G_i(w_i,\xi _i)||^2)^{1+2\epsilon }}. \end{equation}
(66)
Applying Lemma 11, we find that Equation (66) is less than or equal to \(\frac{\alpha ^2}{2\epsilon \beta ^{2\epsilon }}\). Furthermore, we have:
\begin{equation} \begin{aligned}&\sum _{t=1}^T2\alpha _t||G_t(\mathrm{w}_t,\xi _t)||^2(\alpha _t-\alpha _{t+1})\le 2\max _{1\le t\le T}\alpha _t||G_t(\mathrm{w}_t,\xi _t)||^2\sum _{t=1}^T(\alpha _t-\alpha _{t+1}). \end{aligned} \end{equation}
(67)
Combining the above results, we obtain the first estimate. For the second estimate:
\begin{equation} \begin{aligned}\sum _{t=1}^T\alpha _t^3||G_t(\mathrm{w}_t,\xi _t)||^2&=\sum _{t=1}^T\alpha _{t+1}^3||G_t(\mathrm{w}_t,\xi _{t})||^2 +\sum _{t=1}^T(\alpha _t^3-\alpha _{t+1}^3)||G_t(\mathrm{w}_t,\xi _{t})||^2. \end{aligned} \end{equation}
(68)
Similarly, according to the definition of \(\alpha _t\):
\begin{equation} \begin{aligned}\sum _{t=1}^T\alpha _{t+1}^3||G_t(\mathrm{w}_t,\xi _{t})||^2 = \sum _{t=1}^T\frac{\alpha ^3 ||G_t(\mathrm{w}_t,\xi _{t})||^2}{(\beta +\sum _{i=1}^{t}||G_i(w_i,\xi _{i})||^2)^{3/2+3\epsilon }}. \end{aligned} \end{equation}
(69)
Applying Lemma 11 again, we find that Equation (69) is less than or equal to \(\frac{2\alpha ^3}{(6\epsilon +1)\beta ^{3\epsilon +1/2}}\). Besides, we have:
\begin{equation} \begin{aligned}\sum _{t=1}^T(\alpha _t^3-\alpha _{t+1}^3)||G_t(\mathrm{w}_t,\xi _{t})||^2 &= \sum _{t=1}^T(\alpha _t-\alpha _{t+1})(\alpha _t^2+\alpha _t\alpha _{t+1}+\alpha _{t+1}^2)||G_t(\mathrm{w}_t,\xi _{t})||^2\\ &=\ \sum _{t=1}^T(\alpha _t-\alpha _{t+1})(\alpha _t^2+\alpha _t\alpha _{t+1})||G_t(\mathrm{w}_t,\xi _{t})||^2\\ &\quad \ +\ (\alpha _t-\alpha _{t+1})\alpha _{t+1}^2||G_t(\mathrm{w}_t,\xi _{t})||^2\\ &\le \ \sum _{t=1}^T\alpha _t(\alpha _t^2-\alpha _{t+1}^2)||G_t(\mathrm{w}_t,\xi _{t})||^2\\ &\quad \ +\ \max _{1\le t\le T}(\alpha _t-\alpha _{t+1})\sum _{t=1}^T\alpha _{t+1}^2||G_t(\mathrm{w}_t,\xi _{t})||^2\\ &\le \ \max _{1\le t\le T}\alpha _t||G_t(\mathrm{w}_t,\xi _{t})||^2(\alpha _1^2-\alpha _{T+1}^2)+\max _{1\le t\le T}(\alpha _t-\alpha _{t+1})\frac{\alpha ^2}{2\epsilon \beta ^{2\epsilon }}. \end{aligned} \end{equation}
(70)
Combining the above results, we obtain the second estimate of Lemma 9.

A.3.11 Proof of Lemma 10.

Let \(S_T=\Sigma _{t=1}^T\alpha _t||\epsilon _t||^2\); according to Lemma 5:
\begin{equation} \begin{aligned}&\sum _{t=1}^T\alpha _t||\epsilon _t||^2\le \left(1+\frac{1}{\eta }\right)\gamma \sum _{t=1}^T\alpha _t^3||G_t(\mathrm{w}_t,\xi _t)||^2 +(1+\eta)\gamma \sum _{t=1}^T\alpha _{t-1}||\epsilon _{t-1}||^2. \end{aligned} \end{equation}
(71)
After substitution, we derive:
\begin{equation} S_T \le \Bigg (1+\frac{1}{\eta }\Bigg)\gamma Q_T^3+(1+\eta)\gamma S_{T-1}. \end{equation}
(72)
Using induction and noticing that \(S_1 = 0\):
\begin{equation} \begin{aligned}S_T&\le \frac{1}{\eta }\sum _{t=1}^{T-1}((1+\eta)\gamma)^tQ_{T-t+1}^3\\ &\le \frac{1}{\eta }\max _{2\le t\le T}Q_{t}^3\sum _{t=1}^{T-1}((1+\eta)\gamma)^t. \end{aligned} \end{equation}
(73)
\(Q_t^3\) is a non-decreasing function of \(t\); thus, \(\max _{2\le t\le T}Q_{t}^3=Q_T^3\). Finally, we derive the following estimate:
\begin{equation} S_T \le \frac{1}{\eta }Q_T^3\sum _{t=1}^{T-1}((1+\eta)\gamma)^t. \end{equation}
(74)

