DOI: 10.1145/3605573.3605610
Research article

PSRA-HGADMM: A Communication Efficient Distributed ADMM Algorithm

Published: 13 September 2023

Abstract

Among distributed machine learning algorithms, the global consensus alternating direction method of multipliers (ADMM) has attracted much attention because it can effectively solve large-scale optimization problems. However, its high communication cost slows convergence and limits scalability. To address this problem, we propose a hierarchical grouping ADMM algorithm (PSRA-HGADMM) with a novel Ring-Allreduce communication model. First, we optimize the parameter exchange of the ADMM algorithm and implement the global consensus ADMM algorithm in a decentralized architecture. Second, to improve the communication efficiency of the distributed system, we propose a novel Ring-Allreduce communication model (PSR-Allreduce) based on the idea of the parameter server architecture. Finally, a Worker-Leader-Group generator (WLG) framework is designed to handle inconsistency among cluster nodes; it combines hierarchical parameter aggregation with a grouping strategy to improve the scalability of the distributed system. Experiments show that PSRA-HGADMM achieves better convergence performance and scalability than ADMMLib and AD-ADMM, and reduces the overall communication cost by 32% compared with ADMMLib.
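
For context, the "global consensus ADMM" that the abstract refers to is the standard formulation of Boyd et al. [2]; a sketch of its per-iteration updates, in conventional notation (x_i for the local model on worker i, z for the shared consensus variable, y_i for the dual variable, rho for the penalty parameter, N workers), is:

    % Global consensus ADMM iterations (conventional form, after Boyd et al. [2]);
    % the notation is generic and not necessarily the paper's.
    \begin{aligned}
    x_i^{k+1} &= \arg\min_{x_i}\; f_i(x_i) + (y_i^{k})^{\top}\bigl(x_i - z^{k}\bigr) + \tfrac{\rho}{2}\,\lVert x_i - z^{k}\rVert_2^2,\\
    z^{k+1}   &= \frac{1}{N}\sum_{i=1}^{N}\Bigl(x_i^{k+1} + \tfrac{1}{\rho}\, y_i^{k}\Bigr),\\
    y_i^{k+1} &= y_i^{k} + \rho\,\bigl(x_i^{k+1} - z^{k+1}\bigr).
    \end{aligned}

The z-update is a global average and is the step that must be communicated every iteration; the abstract's PSR-Allreduce and WLG grouping target exactly this aggregation. As a rough illustration of the baseline communication pattern such a design builds on (not the paper's PSR-Allreduce itself), the following single-process sketch simulates a conventional ring-allreduce, i.e. a reduce-scatter phase followed by an allgather phase over a logical ring; the function name and simulation setup are illustrative assumptions:

    # Minimal single-process simulation of a conventional ring-allreduce
    # (reduce-scatter + allgather). Illustrative only: this is NOT the
    # paper's PSR-Allreduce, just the baseline pattern it builds on.
    import numpy as np

    def ring_allreduce(buffers):
        """Sum-allreduce `buffers` (one equal-length 1-D array per node)."""
        n = len(buffers)
        # Each node splits its vector into n chunks; chunk c has the same
        # size on every node.
        chunks = [np.array_split(b.astype(float), n) for b in buffers]

        # Phase 1: reduce-scatter. At step s, node i forwards chunk (i - s) mod n
        # to node i + 1, which accumulates it. After n - 1 steps, node i holds
        # the fully reduced chunk (i + 1) mod n.
        for step in range(n - 1):
            for src in range(n):
                dst = (src + 1) % n
                c = (src - step) % n
                chunks[dst][c] = chunks[dst][c] + chunks[src][c]

        # Phase 2: allgather. Each node circulates its fully reduced chunk
        # around the ring until every node has every chunk.
        for step in range(n - 1):
            for src in range(n):
                dst = (src + 1) % n
                c = (src + 1 - step) % n
                chunks[dst][c] = chunks[src][c]

        return [np.concatenate(ch) for ch in chunks]

    if __name__ == "__main__":
        # Three simulated workers, each holding a local vector
        # (e.g. x_i + y_i / rho in the z-update above).
        rng = np.random.default_rng(0)
        local = [rng.standard_normal(8) for _ in range(3)]
        reduced = ring_allreduce(local)
        assert all(np.allclose(r, sum(local)) for r in reduced)

Per node and per iteration, this pattern moves roughly 2(n-1)/n of the vector size regardless of the number of nodes, which is why ring-style aggregation scales better than having every worker send its full vector to a single parameter server.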

References

[1]
Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, et al. 2016. Deep Speech 2: End-to-end speech recognition in English and Mandarin. In International Conference on Machine Learning. PMLR, 173–182.
[2]
Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning 3, 1 (2011), 1–122.
[3]
Anis Elgabli, Jihong Park, Amrit S Bedi, Mehdi Bennis, and Vaneet Aggarwal. 2020. GADMM: Fast and communication efficient framework for distributed machine learning. J. Mach. Learn. Res. 21, 76 (2020), 1–39.
[4]
Anis Elgabli, Jihong Park, Amrit Singh Bedi, Chaouki Ben Issaid, Mehdi Bennis, and Vaneet Aggarwal. 2020. Q-GADMM: Quantized group ADMM for communication efficient decentralized machine learning. IEEE Transactions on Communications 69, 1 (2020), 164–181.
[5]
Andrew Gibiansky. 2017. Bringing HPC techniques to deep learning. Baidu Research, Tech. Rep. (2017).
[6]
William Gropp, Ewing Lusk, and Anthony Skjellum. 1999. Using MPI: Portable Parallel Programming with the Message-Passing Interface. Vol. 1. MIT Press.
[7]
Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Mostofa Ali Patwary, Yang Yang, and Yanqi Zhou. 2017. Deep learning scaling is predictable, empirically. arXiv preprint arXiv:1712.00409 (2017).
[8]
Qirong Ho, James Cipar, Henggang Cui, Jin Kyu Kim, and Eric P Xing. 2013. More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server. Advances in Neural Information Processing Systems 26 (2013), 1223.
[9]
Xin Huang, Guozheng Wang, and Yongmei Lei. 2021. GR-ADMM: A Communication Efficient Algorithm Based on ADMM. In 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom). IEEE, 220–227.
[10]
Melvin Johnson, Mike Schuster, Quoc V Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, 2017. Google’s multilingual neural machine translation system: Enabling zero-shot translation. Transactions of the Association for Computational Linguistics 5 (2017), 339–351.
[11]
Arun Kumar, Matthias Boehm, and Jun Yang. 2017. Data management in machine learning: Challenges, techniques, and systems. In Proceedings of the 2017 ACM International Conference on Management of Data. 1717–1722.
[12]
Mu Li, David G Andersen, Alexander J Smola, and Kai Yu. 2014. Communication efficient distributed machine learning with the parameter server. Advances in Neural Information Processing Systems 27 (2014).
[13]
Weiyu Li, Yaohua Liu, Zhi Tian, and Qing Ling. 2019. Communication-censored linearized ADMM for decentralized consensus optimization. IEEE Transactions on Signal and Information Processing over Networks 6 (2019), 18–34.
[14]
Chih-Jen Lin, Ruby C Weng, and S Sathiya Keerthi. 2007. Trust region Newton methods for large-scale logistic regression. In Proceedings of the 24th International Conference on Machine Learning. 561–568.
[15]
Qinyi Luo, Jinkun Lin, Youwei Zhuo, and Xuehai Qian. 2019. Hop: Heterogeneity-aware decentralized training. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems. 893–907.
[16]
David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529, 7587 (2016), 484–489.
[17]
Kihyuk Sohn, Honglak Lee, and Xinchen Yan. 2015. Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems 28. Curran Associates, Inc., 3483–3491.
[18]
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1–9.
[19]
Zhuojun Tian, Zhaoyang Zhang, Jue Wang, Xiaoming Chen, Wei Wang, and Huaiyu Dai. 2020. Distributed ADMM with synergetic communication and computation. IEEE Transactions on Communications 69, 1 (2020), 501–517.
[20]
Leslie G Valiant. 1990. A bridging model for parallel computation. Commun. ACM 33, 8 (1990), 103–111.
[21]
Dongxia Wang, Yongmei Lei, Jinyang Xie, and Guozheng Wang. 2021. HSAC-ALADMM: an asynchronous lazy ADMM algorithm based on hierarchical sparse allreduce communication. The Journal of Supercomputing 77, 8 (2021), 8111–8134.
[22]
Jinyang Xie and Yongmei Lei. 2019. ADMMLIB: A library of communication-efficient AD-ADMM for distributed machine learning. In IFIP International Conference on Network and Parallel Computing. Springer, 322–326.
[23]
Zheng Xu, Mario Figueiredo, and Tom Goldstein. 2017. Adaptive ADMM with spectral penalty parameter selection. In Artificial Intelligence and Statistics. PMLR, 718–727.
[24]
Kun-Hsing Yu, Andrew L Beam, and Isaac S Kohane. 2018. Artificial intelligence in healthcare. Nature Biomedical Engineering 2, 10 (2018), 719–731.
[25]
Hao Zhang, Zeyu Zheng, Shizhen Xu, Wei Dai, Qirong Ho, Xiaodan Liang, Zhiting Hu, Jinliang Wei, Pengtao Xie, and Eric P Xing. 2017. Poseidon: An efficient communication architecture for distributed deep learning on GPU clusters. In 2017 USENIX Annual Technical Conference (USENIX ATC 17). 181–193.
[26]
Ruiliang Zhang and James Kwok. 2014. Asynchronous distributed ADMM for consensus optimization. In International conference on machine learning. PMLR, 1701–1709.

Cited By

  • (2025) The Fast Inertial ADMM optimization framework for distributed machine learning. Future Generation Computer Systems 164 (March 2025), 107575. https://doi.org/10.1016/j.future.2024.107575

Published In

ICPP '23: Proceedings of the 52nd International Conference on Parallel Processing
August 2023
858 pages
ISBN: 9798400708435
DOI: 10.1145/3605573
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 September 2023

Author Tags

  1. Hierarchical grouping strategy
  2. PSRA-HGADMM
  3. Ring Allreduce
  4. The global consensus ADMM algorithm
  5. Worker-Leader-Group generator (WLG) framework

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • National Natural Science Foundation of China

Conference

ICPP 2023
ICPP 2023: 52nd International Conference on Parallel Processing
August 7 - 10, 2023
Salt Lake City, UT, USA

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%
