Decentralized Learning Made Easy with DecentralizePy

Published: 08 May 2023
    Abstract

    Decentralized learning (DL) has gained prominence for its potential benefits in terms of scalability, privacy, and fault tolerance. It consists of many nodes that coordinate without a central server and exchange millions of parameters in the inherently iterative process of machine learning (ML) training. In addition, these nodes are connected in complex and potentially dynamic topologies. Assessing the intricate dynamics of such networks is no easy task. In the literature, researchers often resort to simulated environments that do not scale and fail to capture practical and crucial behaviors, including those associated with parallelism, data transfer, network delays, and wall-clock time. In this paper, we propose decentralizepy, a distributed framework for decentralized ML, which allows for the emulation of large-scale learning networks in arbitrary topologies. We demonstrate the capabilities of decentralizepy by deploying techniques such as sparsification and secure aggregation on top of several topologies, including dynamic networks with more than one thousand nodes.
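
    To make the setting concrete, below is a minimal, self-contained sketch of one round of gossip averaging with top-k sparsification, two of the ingredients the abstract mentions. It is not the decentralizepy API: every name in it (topk, gossip_round, the hard-coded ring topology) is hypothetical and for illustration only.

        # Illustrative sketch only -- NOT the decentralizepy API.
        import numpy as np

        def topk(vec, k):
            # Top-k sparsification: keep only the k largest-magnitude entries.
            idx = np.argpartition(np.abs(vec), -k)[-k:]
            sparse = np.zeros_like(vec)
            sparse[idx] = vec[idx]
            return sparse

        def gossip_round(models, neighbors, k):
            # One synchronous round: every node sends a sparsified copy of its
            # parameters to its neighbors, then averages what it received
            # together with its own (full) parameters.
            shared = {i: topk(m, k) for i, m in models.items()}
            averaged = {}
            for i, m in models.items():
                msgs = [shared[j] for j in neighbors[i]]
                averaged[i] = (m + sum(msgs)) / (1 + len(msgs))
            return averaged

        # Toy run: 4 nodes on a ring, each holding a 10-parameter "model".
        rng = np.random.default_rng(0)
        models = {i: rng.standard_normal(10) for i in range(4)}
        neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
        for _ in range(5):
            models = gossip_round(models, neighbors, k=3)
        print(models[0])  # node 0's parameters, drifting toward the network mean

    In an actual DL deployment, each node would also take local training steps between rounds, and secure aggregation would mask individual contributions inside the neighborhood averages; the sketch only shows the communicate-and-average skeleton.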

    Cited By

    • (2023) Get More for Less in Decentralized Learning Systems. 2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS), pages 463-474, July 2023. DOI: 10.1109/ICDCS57875.2023.00067

    Published In

    EuroMLSys '23: Proceedings of the 3rd Workshop on Machine Learning and Systems
    May 2023, 176 pages
    ISBN: 9798400700842
    DOI: 10.1145/3578356
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. decentralized learning
    2. middleware
    3. machine learning
    4. distributed systems
    5. peer-to-peer
    6. network topology

    Qualifiers

    • Research-article

    Conference

    EuroMLSys '23

    Acceptance Rates

    Overall acceptance rate: 18 of 26 submissions (69%)
