Training time minimization for federated edge learning with optimized gradient quantization and bandwidth allocation

Liu, Peixi; Jiang, Jiamo; Zhu, Guangxu; Cheng, Lei; Jiang, Wei; Luo, Wu; Du, Ying; Wang, Zhiqin

doi:10.1631/FITEE.2100538

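Optimization strategy for gradient quantization and bandwidth allocation based on federated edge learning (translated from the Chinese title)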

Review Article
Published: 24 August 2022

Volume 23, pages 1247–1263 (2022)

Frontiers of Information Technology & Electronic Engineering

Peixi Liu (刘沛西) ORCID: orcid.org/0000-0002-9047-8889 ^1,3,
Jiamo Jiang (江甲沫) ORCID: orcid.org/0000-0002-4986-7081 ^2,
Guangxu Zhu (朱光旭) ORCID: orcid.org/0000-0001-9532-9201 ^3,
Lei Cheng (程磊) ^4,5,
Wei Jiang (蒋伟) ^1,
Wu Luo (罗武) ^1,
Ying Du (杜滢) ^2 &
…
Zhiqin Wang (王志勤) ^2

Abstract

Training a machine learning model with federated edge learning (FEEL) is typically time-consuming because of the constrained computation power of edge devices and the limited wireless resources in edge networks. In this study, we investigate the training time minimization problem in a quantized FEEL system, where heterogeneous edge devices send quantized gradients to the edge server via orthogonal channels. In particular, a stochastic quantization scheme is adopted to compress the uploaded gradients, which reduces the per-round communication burden but may increase the number of communication rounds. The training time is modeled by taking into account the communication time, the computation time, and the number of communication rounds. Based on the proposed training time model, we characterize the intrinsic trade-off between the number of communication rounds and the per-round latency. Specifically, we analyze the convergence behavior of quantized FEEL in terms of the optimality gap. Furthermore, a joint data-and-model-driven fitting method is proposed to obtain the exact optimality gap, from which closed-form expressions for the number of communication rounds and the total training time are derived. Under a total bandwidth constraint, training time minimization is formulated as a joint optimization of the quantization level and the bandwidth allocation. To solve this problem, we propose an algorithm based on alternating optimization, which alternately solves the quantization optimization subproblem through successive convex approximation and the bandwidth allocation subproblem by bisection search. Simulation results with different learning tasks and models demonstrate the validity of our analysis and the near-optimal performance of the proposed optimization algorithm.
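The abstract does not spell out the quantizer, so the following is only a minimal sketch of one common unbiased stochastic quantizer, in the style of QSGD, to illustrate the trade-off it describes. The function names `stochastic_quantize` and `bits_per_gradient`, the norm-based scaling, and the 32-bit norm payload are assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def stochastic_quantize(grad, levels, rng=random):
    """Map each entry of grad onto `levels` uniform steps of its Euclidean norm.

    Each entry is rounded up or down at random so that the expected value of
    the output equals the input (unbiased quantization). Illustrative only.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    out = []
    for g in grad:
        scaled = abs(g) / norm * levels        # position in [0, levels]
        lower = math.floor(scaled)
        # Round up with probability equal to the fractional part.
        step = lower + (1 if rng.random() < scaled - lower else 0)
        out.append(math.copysign(norm * step / levels, g))
    return out


def bits_per_gradient(dim, levels, norm_bits=32):
    """Rough payload: one float for the norm, plus a sign bit and a level index per entry."""
    return norm_bits + dim * (1 + math.ceil(math.log2(levels + 1)))


if __name__ == "__main__":
    rng = random.Random(0)
    g = [0.3, -1.2, 0.05, 0.8]
    for q in (1, 4, 16):
        print(q, bits_per_gradient(len(g), q), stochastic_quantize(g, q, rng))
```

Fewer levels shrink the per-round payload but make each quantized gradient noisier, which is the mechanism behind the possible increase in communication rounds.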

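Abstract (translated from the Chinese)

Because of the limited computing power of edge devices and the limited wireless resources of edge networks, training a machine learning model with federated edge learning (FEEL) is usually very time-consuming. This paper studies the training time minimization problem in a quantized FEEL system, where heterogeneous edge devices send quantized gradients to the edge server over orthogonal channels. Stochastic quantization is used to compress the uploaded gradients, which reduces the per-round communication overhead but may increase the number of communication rounds. The training time is modeled by jointly considering the communication time, the computation time, and the number of communication rounds. Based on the proposed training time model, the intrinsic trade-off between the number of communication rounds and the per-round latency is characterized. Specifically, the convergence of quantized FEEL is analyzed. A fitting method driven jointly by data and model is proposed to obtain the exact optimality gap, from which closed-form expressions for the number of communication rounds and the total training time are derived. Under a total bandwidth constraint, the training time minimization problem is formulated as an optimization over the quantization level and the bandwidth allocation. The problem is solved by alternately solving the quantization optimization subproblem (via successive convex approximation) and the bandwidth allocation subproblem (via bisection search). Simulation results under different learning tasks and models confirm the validity of the analysis and the near-optimal performance of the proposed algorithm.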
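As a rough illustration of how a bisection search can allocate bandwidth among devices, the sketch below finds the smallest common per-round deadline that fits within the total bandwidth. It rests on simplifying assumptions that are not taken from the paper: each device's upload rate is taken to be linear in its bandwidth (a fixed spectral efficiency in bit/s/Hz), the round is synchronous so its latency is the slowest device's computation plus upload time, and the function name `min_round_latency` and all parameter names are hypothetical.

```python
def min_round_latency(bits, comp_time, spectral_eff, total_bw, iters=100):
    """Bisection on the common per-round deadline T (illustrative model only).

    Device k needs bandwidth bits[k] / (spectral_eff[k] * (T - comp_time[k]))
    to finish by T. Total demand decreases as T grows, so the smallest
    feasible T can be located by bisection.
    """
    def demand(T):
        return sum(s / (r * (T - c))
                   for s, r, c in zip(bits, spectral_eff, comp_time))

    lo = max(comp_time)              # infeasible: the slowest device has no time to upload
    hi = lo + 1.0
    while demand(hi) > total_bw:     # widen the bracket until hi is feasible
        hi = lo + 2.0 * (hi - lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:   # bracket has collapsed to float precision
            break
        if demand(mid) > total_bw:
            lo = mid
        else:
            hi = mid
    # Bandwidth each device needs to meet the deadline hi.
    alloc = [s / (r * (hi - c))
             for s, r, c in zip(bits, spectral_eff, comp_time)]
    return hi, alloc


if __name__ == "__main__":
    T, alloc = min_round_latency(
        bits=[1e6, 2e6], comp_time=[0.2, 0.1],
        spectral_eff=[2.0, 1.0], total_bw=5e6)
    print(T, alloc)
```

Under these assumptions every device finishes at the same deadline, so no bandwidth is left idle on a device that would otherwise finish early.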

Author information

Authors and Affiliations

State Key Laboratory of Advanced Optical Communication Systems and Networks, Department of Electronics, Peking University, Beijing, 100871, China
Peixi Liu (刘沛西), Wei Jiang (蒋伟) & Wu Luo (罗武)
China Academy of Information and Communications Technology, Beijing, 100191, China
Jiamo Jiang (江甲沫), Ying Du (杜滢) & Zhiqin Wang (王志勤)
Shenzhen Research Institute of Big Data, Shenzhen, 518172, China
Peixi Liu (刘沛西) & Guangxu Zhu (朱光旭)
College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, 310027, China
Lei Cheng (程磊)
Zhejiang Provincial Key Laboratory of Information Processing, Communication and Networking, Hangzhou, 310027, China
Lei Cheng (程磊)

Contributions

Peixi LIU, Jiamo JIANG, Guangxu ZHU, Wei JIANG, and Wu LUO designed the research. Guangxu ZHU, Wei JIANG, and Wu LUO supervised the research. Peixi LIU and Guangxu ZHU implemented the simulations. Peixi LIU drafted the paper. Jiamo JIANG and Guangxu ZHU helped organize the paper. Lei CHENG, Ying DU, and Zhiqin WANG revised and finalized the paper.

Corresponding authors

Correspondence to Jiamo Jiang (江甲沫) or Guangxu Zhu (朱光旭).

Ethics declarations

Peixi LIU, Jiamo JIANG, Guangxu ZHU, Lei CHENG, Wei JIANG, Wu LUO, Ying DU, and Zhiqin WANG declare that they have no conflict of interest.

Additional information

Project supported by the National Key R&D Program of China (No. 2020YFB1807100), the National Natural Science Foundation of China (No. 62001310), and the Guangdong Basic and Applied Basic Research Foundation, China (No. 2022A1515010109).

List of supplementary materials

Proof S1 Proof of Theorem 1

Proof S2 Proof of Lemma 1

Table S1 Simulation parameters

Fig. S1 Optimality gap and test accuracy in simulation 1

Fig. S2 Optimality gap and test accuracy in simulation 2

Supplementary materials for