Research article
Just move it!: dynamic parameter allocation in action

Published: 01 July 2021

Abstract

Parameter servers (PSs) ease the implementation of distributed machine learning systems, but their performance can fall behind that of single-machine baselines due to communication overhead. We demonstrate Lapse, an open-source PS with dynamic parameter allocation. Previous work has shown that dynamic parameter allocation can improve PS performance by up to two orders of magnitude and lead to near-linear speed-ups over single-machine baselines. This demonstration illustrates how Lapse is used and why it can provide order-of-magnitude speed-ups over other PSs. To do so, this demonstration interactively analyzes and visualizes what dynamic parameter allocation looks like in action.
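The idea behind dynamic parameter allocation can be illustrated with a toy sketch: a parameter server partitions the model's parameters across nodes, workers access them with pull/push operations, and a relocation primitive moves a parameter's ownership to the node that is about to access it intensively, turning expensive remote accesses into cheap local ones. This is a minimal, hypothetical sketch for intuition only — the class and method names (`ParameterServer`, `pull`, `push`, `localize`) are illustrative assumptions, not Lapse's actual API.

```python
# Toy sketch of a parameter server with dynamic parameter allocation.
# All names here are hypothetical; this is not Lapse's real interface.

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.store = {}          # parameters this node currently owns

class ParameterServer:
    def __init__(self, num_nodes):
        self.nodes = [Node(i) for i in range(num_nodes)]
        self.owner = {}          # parameter key -> owning node id

    def init_param(self, key, value, node_id=0):
        self.owner[key] = node_id
        self.nodes[node_id].store[key] = value

    def pull(self, node_id, key):
        # A remote pull models a network round trip; a local pull is cheap.
        remote = self.owner[key] != node_id
        return self.nodes[self.owner[key]].store[key], remote

    def push(self, node_id, key, delta):
        remote = self.owner[key] != node_id
        self.nodes[self.owner[key]].store[key] += delta
        return remote

    def localize(self, node_id, key):
        # Dynamic parameter allocation: move ownership of `key` to
        # `node_id`, so that subsequent pulls/pushes become local.
        old = self.owner[key]
        if old != node_id:
            self.nodes[node_id].store[key] = self.nodes[old].store.pop(key)
            self.owner[key] = node_id

ps = ParameterServer(num_nodes=2)
ps.init_param("w", 1.0, node_id=0)
_, remote = ps.pull(1, "w")      # node 1 accesses a parameter owned by node 0
ps.localize(1, "w")              # relocate it before a burst of accesses
_, local = ps.pull(1, "w")       # the same access is now local
```

In a static-allocation PS, node 1's repeated accesses to `"w"` would each pay the remote cost; relocating the parameter first is what enables the near-linear speed-ups the abstract refers to.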


Cited By

  • HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training. Proceedings of the 2022 International Conference on Management of Data (SIGMOD '22), pp. 470--480. doi:10.1145/3514221.3517902
  • NuPS: A Parameter Server for Machine Learning with Non-Uniform Parameter Access. Proceedings of the 2022 International Conference on Management of Data (SIGMOD '22), pp. 481--495. doi:10.1145/3514221.3517860


Published In

Proceedings of the VLDB Endowment, Volume 14, Issue 12, July 2021, 587 pages. ISSN 2150-8097.

Publisher

VLDB Endowment
