DOI: 10.1145/1996130.1996134
research-article

Juggle: proactive load balancing on multicore computers

Published: 08 June 2011 Publication History

Abstract

We investigate proactive dynamic load balancing on multicore systems, in which threads are continually migrated to reduce the impact of processor/thread mismatches. The goals are to enhance the flexibility of the SPMD-style programming model and to enable SPMD applications to run efficiently in multiprogrammed environments. We present Juggle, a practical, decentralized, user-space implementation of a proactive load balancer that emphasizes portability and usability. Juggle shows performance improvements of up to 80% over static balancing for UPC, OpenMP, and pthreads benchmarks. We analyze the impact of Juggle on parallel applications and derive lower bounds and approximations for thread completion times. We show that results from Juggle closely match theoretical predictions across a variety of architectures, including NUMA and hyper-threaded systems. We also show that Juggle is effective in multiprogrammed environments with unpredictable interference from unrelated external applications.
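
The intuition behind a processor/thread mismatch can be illustrated with a toy model. The sketch below is not Juggle's algorithm and does not reproduce the paper's bounds; it is a minimal simulation under stated assumptions: identical cores, identical work per thread, fair time-sharing on each core, zero migration cost, and a final barrier so that the job ends when the slowest thread ends. The function names (`ideal_makespan`, `makespan`) and all parameters are hypothetical.

```python
def ideal_makespan(n_threads, n_cores, work):
    """Idealized completion time if total work were spread perfectly (assumption)."""
    return n_threads * work / n_cores


def makespan(n_threads, n_cores, work, interval, proactive):
    """Simulate completion time of equal-work threads on equal-speed cores.

    Static mode pins each thread to its initial core. Proactive mode re-places
    threads every `interval`, giving the least-advanced threads the
    least-loaded cores. Migration is assumed to be free.
    """
    progress = [0.0] * n_threads
    # Slot counts per core, lightest cores first (e.g. 3 threads, 2 cores -> [1, 2]).
    base, extra = divmod(n_threads, n_cores)
    loads = [base] * (n_cores - extra) + [base + 1] * extra
    order = list(range(n_threads))
    elapsed = 0.0
    while min(progress) < work:
        if proactive:
            # Threads that are furthest behind go to the lightest cores.
            order.sort(key=lambda t: progress[t])
        start = 0
        for load in loads:
            group = order[start:start + load]
            start += load
            # Finished threads wait at the barrier and consume no CPU time.
            running = [t for t in group if progress[t] < work]
            for t in running:
                progress[t] = min(work, progress[t] + interval / len(running))
        elapsed += interval
    return elapsed


if __name__ == "__main__":
    # Three threads on two cores: one core must host two threads.
    print(makespan(3, 2, 6.0, 1.0, proactive=False))  # static pinning
    print(makespan(3, 2, 6.0, 1.0, proactive=True))   # periodic rotation
    print(ideal_makespan(3, 2, 6.0))
```

In this model, three threads pinned to two cores finish only when the two threads sharing a core finish, whereas rotating the threads lets each one progress at roughly the average rate. Real systems add migration cost, cache effects, and NUMA penalties that the sketch ignores.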

Published In

HPDC '11: Proceedings of the 20th international symposium on High performance distributed computing
June 2011
296 pages
ISBN: 9781450305525
DOI: 10.1145/1996130

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. operating systems
  2. parallel programming
  3. proactive load balancing

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%
