Abstract
Training a machine learning model with federated edge learning (FEEL) is typically time-consuming because of the constrained computation power of edge devices and the limited wireless resources in edge networks. In this study, the training time minimization problem is investigated in a quantized FEEL system, where heterogeneous edge devices send quantized gradients to the edge server via orthogonal channels. In particular, a stochastic quantization scheme is adopted to compress the uploaded gradients, which reduces the per-round communication burden but may increase the number of communication rounds. The training time is modeled by taking into account the communication time, the computation time, and the number of communication rounds. Based on the proposed training time model, the intrinsic trade-off between the number of communication rounds and the per-round latency is characterized. Specifically, we analyze the convergence behavior of quantized FEEL in terms of the optimality gap. Furthermore, a joint data-and-model-driven fitting method is proposed to obtain the exact optimality gap, based on which closed-form expressions for the number of communication rounds and the total training time are obtained. Under a total bandwidth constraint, the training time minimization problem is formulated as a joint quantization level and bandwidth allocation optimization problem. To solve it, an algorithm based on alternating optimization is proposed, which alternately solves the quantization optimization subproblem through successive convex approximation and the bandwidth allocation subproblem by bisection search. Simulation results with different learning tasks and models demonstrate the validity of our analysis and the near-optimal performance of the proposed optimization algorithm.
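Abstract (translated from Chinese)
Because of the limited computing power of edge devices and the limited wireless resources in edge networks, training a machine learning model with federated edge learning (FEEL) is usually very time-consuming. This paper studies the training time minimization problem in a quantized FEEL system, where heterogeneous edge devices send quantized gradients to the edge server over orthogonal channels. Stochastic quantization is adopted to compress the uploaded gradients, which reduces the per-round communication overhead but may increase the number of communication rounds. The training time is modeled by jointly considering the communication time, the computation time, and the number of communication rounds. Based on the proposed training time model, the intrinsic trade-off between the number of communication rounds and the per-round latency is characterized. Specifically, the convergence of quantized FEEL is analyzed. A fitting method driven jointly by data and model is proposed to obtain the exact optimality gap, from which closed-form expressions for the number of communication rounds and the total training time are derived. Under a total bandwidth constraint, the training time minimization problem is formulated as an optimization problem over the quantization levels and the bandwidth allocation. The problem is solved by alternately solving the quantization optimization subproblem (via successive convex approximation) and the bandwidth allocation subproblem (via bisection search). Under different learning tasks and models, simulation results demonstrate the validity of the analysis and the near-optimal performance of the proposed optimization algorithm.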
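To make the compression step concrete, the following is a minimal sketch of an unbiased stochastic uniform quantizer of the kind commonly used for gradient compression. The abstract does not specify the exact scheme, so the grid construction (uniform levels between the minimum and maximum entry), the function names `stochastic_quantize` and `bits_per_round`, and the 64-bit overhead for sending the range are all illustrative assumptions rather than the authors' design.

```python
import numpy as np


def stochastic_quantize(g, levels, rng):
    """Randomly round each entry of g to one of `levels` uniform grid points.

    Assumption: the grid spans [min(g), max(g)]. Each entry moves to its upper
    neighbor with probability equal to its relative position between the two
    neighbors, so the quantized value equals the input in expectation.
    """
    g = np.asarray(g, dtype=np.float64)
    lo, hi = g.min(), g.max()
    if hi == lo or levels < 2:
        return g.copy()  # nothing to quantize
    step = (hi - lo) / (levels - 1)
    scaled = (g - lo) / step              # position on the grid, in [0, levels - 1]
    lower = np.floor(scaled)
    prob_up = scaled - lower              # probability of rounding up
    idx = lower + (rng.random(g.shape) < prob_up)
    idx = np.minimum(idx, levels - 1)     # guard against floating-point overshoot
    return lo + idx * step


def bits_per_round(dim, levels, overhead_bits=64):
    """Hypothetical payload size: ceil(log2(levels)) bits per entry plus the range."""
    return dim * (levels - 1).bit_length() + overhead_bits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.standard_normal(8)
    for q in (2, 4, 16):
        err = np.abs(stochastic_quantize(grad, q, rng) - grad).max()
        print(q, bits_per_round(grad.size, q), err)
```

Fewer levels shrink the per-round payload but enlarge the quantization error, which is the source of the trade-off between per-round latency and the number of communication rounds described above.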
Author information
Contributions
Peixi LIU, Jiamo JIANG, Guangxu ZHU, Wei JIANG, and Wu LUO designed the research. Guangxu ZHU, Wei JIANG, and Wu LUO supervised the research. Peixi LIU and Guangxu ZHU implemented the simulations. Peixi LIU drafted the paper. Jiamo JIANG and Guangxu ZHU helped organize the paper. Lei CHENG, Ying DU, and Zhiqin WANG revised and finalized the paper.
Ethics declarations
Peixi LIU, Jiamo JIANG, Guangxu ZHU, Lei CHENG, Wei JIANG, Wu LUO, Ying DU, and Zhiqin WANG declare that they have no conflict of interest.
Additional information
Project supported by the National Key R&D Program of China (No. 2020YFB1807100), the National Natural Science Foundation of China (No. 62001310), and the Guangdong Basic and Applied Basic Research Foundation, China (No. 2022A1515010109).
List of supplementary materials
Proof S1 Proof of Theorem 1
Proof S2 Proof of Lemma 1
Table S1 Simulation parameters
Fig. S1 Optimality gap and test accuracy in simulation 1
Fig. S2 Optimality gap and test accuracy in simulation 2
About this article
Cite this article
Liu, P., Jiang, J., Zhu, G. et al. Training time minimization for federated edge learning with optimized gradient quantization and bandwidth allocation. Front Inform Technol Electron Eng 23, 1247–1263 (2022). https://doi.org/10.1631/FITEE.2100538
DOI: https://doi.org/10.1631/FITEE.2100538