research-article

Falkor: Federated Learning Secure Aggregation Powered by AESCTR GPU Implementation

Authors:

Mariya Georgieva Belorgey,

Sofia Dandjee,

Nicolas Gama,

Dimitar Jetchev,

Dmitry MikushinAuthors Info & Claims

WAHC '23: Proceedings of the 11th Workshop on Encrypted Computing & Applied Homomorphic Cryptography

Pages 11 - 22

https://doi.org/10.1145/3605759.3625261

Published: 26 November 2023 Publication History

Get Access

Abstract

We propose a novel protocol, Falkor, for secure aggregation for Federated Learning in the multi-server scenario based on masking of local models via a stream cipher based on AES in counter mode and accelerated by GPUs running on the aggregating servers. The protocol is resilient to client dropout and has reduced clients/servers communication cost by a factor equal to the number of aggregating servers (compared to the naïve baseline method). It scales simultaneously in the two major complexity aspects: 1) large number of clients; 2) highly complex machine learning models such as CNNs, RNNs, Transformers, etc. The AES-CTR-based masking function in our aggregation protocol is built on the concept of counterbased cryptographically-secure pseudorandom number generators (csPRNGs) as described in [32] and subsequently used by Facebook for their torchcsprng csPRNG. We improve upon torchcsprng by careful use of shared memory on the GPU device, a recent idea of Cihangir Tezcan [38] and obtain 100x speedup in the masking function compared to a single CPU core.

Finally, we demonstrate scalability of our protocol in two realworld Federated Learning scenarios: 1) efficient training of large logistic regression models with 50 features and 50M data points distributed across 1000 clients that can dropout and securely aggregated via three servers (running secure multi-party computation (SMPC)); 2) training a recurrent neural network (RNN) model for sentiment analysis of Twitter feeds coming from a large number of Twitter users (more than 250,000 users). In case 1), our secure aggregation algorithm runs in less than a minute compared to a pure MPC computation (on 3 parties) that takes 27 hours and uses 400GB RAM machines as well as 1 gigabit-per-second network. In case 2), the total training is around 10 minutes using our GPU powered secure aggregation versus 10 hours using a single CPU core.

References

[1]

M. Abadi, A. Chu, I. Goodfellow, H. Brendan McMahan, I. Mironov, K. Talwar, and L. Zhang. 2016. Deep Learning with Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24--28, 2016, Edgar R. Weippl, Stefan Katzenbeisser, Christopher Kruegel, Andrew C. Myers, and Shai Halevi (Eds.). ACM, 308--318. https://doi.org/10.1145/2976749.2978318

Abstract

References

Index Terms

Recommendations

Verifiable Secure Aggregation Protocol Under Federated Learning

DealSecAgg: Efficient Dealer-Assisted Secure Aggregation for Federated Learning

FedShare: Secure Aggregation based on Additive Secret Sharing in Federated Learning

Comments

Information

Published In

Sponsors

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Get Access

Login options

Full Access

View options

PDF

eReader

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations