Optimising Communication Overhead in Federated Learning Using NSGA-II
Authors:
José Ángel Morell,
Zakaria Abdelmoiz Dahi,
Francisco Chicano,
Gabriel Luque,
Enrique Alba
Abstract:
Federated learning is a training paradigm according to which a server-based model is cooperatively trained using local models running on edge devices and ensuring data privacy. These devices exchange information that induces a substantial communication load, which jeopardises the functioning efficiency. The difficulty of reducing this overhead stands in achieving this without decreasing the model'…
▽ More
Federated learning is a training paradigm according to which a server-based model is cooperatively trained using local models running on edge devices and ensuring data privacy. These devices exchange information that induces a substantial communication load, which jeopardises the functioning efficiency. The difficulty of reducing this overhead stands in achieving this without decreasing the model's efficiency (contradictory relation). To do so, many works investigated the compression of the pre/mid/post-trained models and the communication rounds, separately, although they jointly contribute to the communication overload. Our work aims at optimising communication overhead in federated learning by (I) modelling it as a multi-objective problem and (II) applying a multi-objective optimization algorithm (NSGA-II) to solve it. To the best of the author's knowledge, this is the first work that \texttt{(I)} explores the add-in that evolutionary computation could bring for solving such a problem, and \texttt{(II)} considers both the neuron and devices features together. We perform the experimentation by simulating a server/client architecture with 4 slaves. We investigate both convolutional and fully-connected neural networks with 12 and 3 layers, 887,530 and 33,400 weights, respectively. We conducted the validation on the \texttt{MNIST} dataset containing 70,000 images. The experiments have shown that our proposal could reduce communication by 99% and maintain an accuracy equal to the one obtained by the FedAvg Algorithm that uses 100% of communications.
△ Less
Submitted 1 April, 2022;
originally announced April 2022.
JSDoop and TensorFlow.js: Volunteer Distributed Web Browser-Based Neural Network Training
Authors:
José Á. Morell,
Andrés Camero,
Enrique Alba
Abstract:
In 2019, around 57\% of the population of the world has broadband access to the Internet. Moreover, there are 5.9 billion mobile broadband subscriptions, i.e., 1.3 subscriptions per user. So there is an enormous interconnected computational power held by users all around the world. Also, it is estimated that Internet users spend more than six and a half hours online every day. But in spite of bein…
▽ More
In 2019, around 57\% of the population of the world has broadband access to the Internet. Moreover, there are 5.9 billion mobile broadband subscriptions, i.e., 1.3 subscriptions per user. So there is an enormous interconnected computational power held by users all around the world. Also, it is estimated that Internet users spend more than six and a half hours online every day. But in spite of being a great amount of time, those resources are idle most of the day. Therefore, taking advantage of them presents an interesting opportunity. In this study, we introduce JSDoop, a prototype implementation to profit from this opportunity. In particular, we propose a volunteer web browser-based high-performance computing library. JSdoop divides a problem into tasks and uses different queues to distribute the computation. Then, volunteers access the web page of the problem and start processing the tasks in their web browsers. We conducted a proof-of-concept using our proposal and TensorFlow.js to train a recurrent neural network that predicts text. We tested it in a computer cluster and with up to 32 volunteers. The experimental results show that training a neural network in distributed web browsers is feasible and accurate, has a high scalability, and it is an interesting area for research.
△ Less
Submitted 12 October, 2019;
originally announced October 2019.