
QFL: Federated Learning Acceleration Based on QAT Hardware Accelerator

Published: 20 June 2024

Abstract

Federated Learning (FL) enables geographically dispersed organizations to collaboratively train a machine learning model. In this process, a parameter server performs global updating and synchronization of the model by receiving and aggregating model data from multiple clients. To secure this process, clients use homomorphic encryption (HE) algorithms to preserve data privacy. However, HE incurs huge computational overhead (the cost of encrypting and decrypting data) and communication overhead (FL requires multiple communication rounds, with more than 150× ciphertext expansion in each round), and ultimately becomes the performance bottleneck of the entire FL system. In this paper, we present QFL, a system solution for FL based on the Intel QAT (QuickAssist Technology) hardware accelerator that substantially reduces the computation and communication overhead caused by HE. Building on an optimized HE algorithm, we leverage coroutines to concurrently and asynchronously offload HE modular exponentiation operations to the QAT, and use an event-driven mechanism to retrieve QAT results promptly, reducing computational overhead. By combining an error-feedback gradient compression algorithm with QAT-accelerated Huffman coding, we greatly reduce the communication overhead, accelerate server-side gradient aggregation, and reduce system complexity. Our solution improves encryption throughput by 16× compared with the open-source Python encryption library python-paillier [1]. Compared with the state-of-the-art FL framework with HE [32], our solution shrinks training time by 3× when reaching the same test accuracy.
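The additive homomorphism the abstract relies on is that of the Paillier cryptosystem [23], the scheme implemented by python-paillier [1]: a product of ciphertexts decrypts to the sum of the plaintexts, which lets the server aggregate encrypted gradients without seeing them. A minimal textbook sketch follows (toy primes, not secure, and not the paper's implementation); note that the two `pow(..., n_sq)` calls per encryption are exactly the modular exponentiations QFL offloads to QAT.

```python
# Minimal textbook Paillier [23] sketch with toy primes (NOT secure; real
# deployments use 2048-bit+ keys). Illustrates the additive homomorphism used
# for encrypted gradient aggregation, and the modular exponentiations that
# dominate encryption cost.
import math
import secrets

def keygen(p, q):
    n = p * q
    lam = math.lcm(p - 1, q - 1)               # Carmichael function of n
    g = n + 1                                  # standard generator choice
    # L(x) = (x - 1) // n;  mu = L(g^lam mod n^2)^-1 mod n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    n_sq = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:                 # r must be a unit mod n
        r = secrets.randbelow(n - 1) + 1
    # Two modular exponentiations: the hot path a QAT accelerator can absorb.
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n) * mu % n

pub, priv = keygen(104723, 104729)             # toy primes for illustration
a, b = 1234, 5678
c_sum = (encrypt(pub, a) * encrypt(pub, b)) % (pub[0] ** 2)
assert decrypt(pub, priv, c_sum) == a + b      # Enc(a)*Enc(b) decrypts to a+b
```

The sketch also makes the ciphertext-expansion point concrete: a plaintext smaller than n is encrypted into a ciphertext modulo n², roughly doubling the bit length even at toy key sizes, and far worse when small gradient values are packed into 2048-bit moduli.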
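The communication side can be illustrated with a generic error-feedback sign-compression sketch in the spirit of [16, 24]: each client keeps a residual of what quantization discarded and folds it back in before the next round, so the compression error does not accumulate. The function name and the mean-magnitude scaling below are illustrative assumptions, not QFL's exact algorithm.

```python
# Hedged sketch of error-feedback 1-bit gradient compression (cf. [16, 24]).
# The client transmits only signs plus one scale per tensor; the residual
# carries the quantization error into the next round.
def ef_sign_compress(grad, residual):
    # fold the error carried over from the previous round back in
    corrected = [g + r for g, r in zip(grad, residual)]
    # 1-bit quantization: send only signs, scaled by the mean magnitude
    scale = sum(abs(c) for c in corrected) / len(corrected)
    compressed = [scale if c >= 0 else -scale for c in corrected]
    # remember what the quantizer threw away for the next round
    new_residual = [c - q for c, q in zip(corrected, compressed)]
    return compressed, new_residual

grad = [0.5, -1.2, 0.1, -0.05]
q, residual = ef_sign_compress(grad, [0.0] * len(grad))
# compressed + residual reconstructs the corrected gradient by construction
assert all(abs((qi + ri) - gi) < 1e-12 for qi, ri, gi in zip(q, residual, grad))
```

The heavily repeated sign values produced by such a quantizer are exactly the kind of low-entropy stream that Huffman coding (which QAT also accelerates in hardware) compresses well.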

References

[1]
2013. Python Paillier Library. https://github.com/data61/python-paillier
[2]
2021. Pyfhel: Python For Homomorphic Encryption Libraries. https://github.com/ibarrond/Pyfhel
[3]
2021. SEAL. https://github.com/Microsoft/SEAL
[4]
Nitesh Aggarwal, CP Gupta, and Iti Sharma. 2014. Fully Homomorphic symmetric scheme without bootstrapping. In Proceedings of 2014 International Conference on Cloud Computing and Internet of Things. 14–17. https://doi.org/10.1109/CCIOT.2014.7062497
[5]
Dan Alistarh, Demjan Grubic, Jerry Z. Li, Ryota Tomioka, and Milan Vojnovic. 2017. QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding. In Proceedings of the 31st International Conference on Neural Information Processing Systems (Long Beach, California, USA) (NIPS’17). Curran Associates Inc., Red Hook, NY, USA, 1707–1718.
[6]
Ayoub Benaissa, Bilal Retiat, Bogdan Cebere, and Alaa Eddine Belfedhal. 2021. TenSEAL: A library for encrypted tensor operations using homomorphic encryption. arXiv preprint arXiv:2104.03152 (2021).
[7]
Xuanyu Cao, Tamer Başar, Suhas Diggavi, Yonina C. Eldar, Khaled B. Letaief, H. Vincent Poor, and Junshan Zhang. 2023. Communication-Efficient Distributed Learning: An Overview. IEEE Journal on Selected Areas in Communications 41, 4 (2023), 851–873. https://doi.org/10.1109/JSAC.2023.3242710
[8]
Xiaodian Cheng, Wanhang Lu, Xinyang Huang, Shuihai Hu, and Kai Chen. 2021. HAFLO: GPU-Based Acceleration for Federated Logistic Regression. arxiv:2107.13797 [cs.LG]
[9]
Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song. 2017. Homomorphic Encryption for Arithmetic of Approximate Numbers. In Advances in Cryptology – ASIACRYPT 2017, Tsuyoshi Takagi and Thomas Peyrin (Eds.). Springer International Publishing, Cham, 409–437.
[10]
Xiaojie Feng and Haizhou Du. 2021. FLZip: An Efficient and Privacy-Preserving Framework for Cross-Silo Federated Learning. In 2021 IEEE International Conferences on Internet of Things (iThings) and IEEE Green Computing & Communications (GreenCom) and IEEE Cyber, Physical & Social Computing (CPSCom) and IEEE Smart Data (SmartData) and IEEE Congress on Cybermatics (Cybermatics). 209–216. https://doi.org/10.1109/iThings-GreenCom-CPSCom-SmartData-Cybermatics53846.2021.00044
[11]
Craig Gentry. 2009. Fully Homomorphic Encryption Using Ideal Lattices. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing (Bethesda, MD, USA) (STOC ’09). Association for Computing Machinery, New York, NY, USA, 169–178. https://doi.org/10.1145/1536414.1536440
[12]
Song Han, Huizi Mao, and William J Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 (2015).
[13]
Intel. 2019. Intel quickassist technology (intel QAT). Retrieved August, 2019 from https://www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html
[14]
Zhifeng Jiang, Wei Wang, and Yang Liu. 2021. FLASHE: Additively Symmetric Homomorphic Encryption for Cross-Silo Federated Learning. arxiv:2109.00675 [cs.CR]
[15]
Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, et al. 2021. Advances and Open Problems in Federated Learning. Foundations and Trends in Machine Learning 14, 1–2 (2021), 1–210.
[16]
Sai Praneeth Karimireddy, Quentin Rebjock, Sebastian Stich, and Martin Jaggi. 2019. Error Feedback Fixes SignSGD and other Gradient Compression Schemes. In Proceedings of the 36th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 97), Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.). PMLR, 3252–3261. https://proceedings.mlr.press/v97/karimireddy19a.html
[17]
Alex Krizhevsky, Geoffrey Hinton, et al. 2009. Learning multiple layers of features from tiny images. Technical report, University of Toronto (2009).
[18]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25 (2012).
[19]
Yujun Lin, Song Han, Huizi Mao, Yu Wang, and William J Dally. 2018. Deep Gradient Compression: Reducing the communication bandwidth for distributed training. In The International Conference on Learning Representations.
[20]
Changchang Liu, Supriyo Chakraborty, and Dinesh Verma. 2019. Secure Model Fusion for Distributed Learning Using Partial Homomorphic Encryption. Springer International Publishing, Cham, 154–179. https://doi.org/10.1007/978-3-030-17277-0_9
[21]
Yang Liu, Tao Fan, Tianjian Chen, Qian Xu, and Qiang Yang. 2021. FATE: An Industrial Grade Platform for Collaborative Learning with Data Protection. J. Mach. Learn. Res. 22, 1, Article 226 (jan 2021), 6 pages.
[22]
Payman Mohassel and Peter Rindal. 2018. ABY3: A Mixed Protocol Framework for Machine Learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (Toronto, Canada) (CCS ’18). Association for Computing Machinery, New York, NY, USA, 35–52. https://doi.org/10.1145/3243734.3243760
[23]
Pascal Paillier. 1999. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In Proceedings of the 17th International Conference on Theory and Application of Cryptographic Techniques (Prague, Czech Republic) (EUROCRYPT’99). Springer-Verlag, Berlin, Heidelberg, 223–238.
[24]
Frank Seide, Hao Fu, Jasha Droppo, Gang Li, and Dong Yu. 2014. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs. In Proc. Interspeech 2014. 1058–1062. https://doi.org/10.21437/Interspeech.2014-274
[25]
Shaohuai Shi, Xianhao Zhou, Shutao Song, Xingyao Wang, Zilin Zhu, Xue Huang, Xinan Jiang, Feihu Zhou, Zhenyu Guo, Liqiang Xie, et al. 2021. Towards scalable distributed training of deep learning on public cloud clusters. Proceedings of Machine Learning and Systems 3 (2021), 401–412.
[26]
Reza Shokri and Vitaly Shmatikov. 2015. Privacy-Preserving Deep Learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (Denver, Colorado, USA) (CCS ’15). Association for Computing Machinery, New York, NY, USA, 1310–1321. https://doi.org/10.1145/2810103.2813687
[27]
N. P. Smart and F. Vercauteren. 2010. Fully Homomorphic Encryption with Relatively Small Key and Ciphertext Sizes. In Public Key Cryptography – PKC 2010, Phong Q. Nguyen and David Pointcheval (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 420–443.
[28]
N. P. Smart and F. Vercauteren. 2014. Fully Homomorphic SIMD Operations. Des. Codes Cryptography 71, 1 (apr 2014), 57–81. https://doi.org/10.1007/s10623-012-9720-4
[29]
Marten van Dijk, Craig Gentry, Shai Halevi, and Vinod Vaikuntanathan. 2010. Fully Homomorphic Encryption over the Integers. In Proceedings of the 29th Annual International Conference on Theory and Applications of Cryptographic Techniques (French Riviera, France) (EUROCRYPT’10). Springer-Verlag, Berlin, Heidelberg, 24–43. https://doi.org/10.1007/978-3-642-13190-5_2
[30]
Qiang Yang, Yang Liu, Tianjian Chen, and Yongxin Tong. 2019. Federated Machine Learning: Concept and Applications. ACM Trans. Intell. Syst. Technol. 10, 2, Article 12 (jan 2019), 19 pages. https://doi.org/10.1145/3298981
[31]
Zhaoxiong Yang, Shuihai Hu, and Kai Chen. 2020. FPGA-Based Hardware Accelerator of Homomorphic Encryption for Efficient Federated Learning. arxiv:2007.10560 [cs.CR]
[32]
Chengliang Zhang, Suyi Li, Junzhe Xia, Wei Wang, Feng Yan, and Yang Liu. 2020. BatchCrypt: Efficient Homomorphic Encryption for Cross-Silo Federated Learning. In Proceedings of the 2020 USENIX Conference on Usenix Annual Technical Conference (USENIX ATC ’20). USENIX Association, USA, Article 33, 14 pages.

Published In

CMLDS '24: Proceedings of the International Conference on Computing, Machine Learning and Data Science
April 2024
381 pages
ISBN: 9798400716393
DOI: 10.1145/3661725

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Federated Learning
  2. Gradient Compression
  3. Hardware Acceleration
  4. Homomorphic Encryption

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CMLDS 2024
