DOI: 10.1145/2822332.2822335
Research article

Towards efficient scheduling of data intensive high energy physics workflows

Published: 15 November 2015

Abstract

Data intensive high energy physics workflows executed on geographically distributed resources pose a tremendous challenge for efficient use of computing resources. In this early-work paper, we present a hierarchical framework for efficient allocation of resources and energy-efficient assignment of tasks for a representative high energy physics application, the Belle II experiment. With an expected data rate of 25 petabytes per year from experimental data and Monte Carlo simulations, the Belle II experiment provides an ideal platform for algorithmic development. Building on an analogy with the unit commitment problem in electric power grids, we present a novel cost-efficient method for resource allocation that feeds into energy-efficient assignment of tasks to resources using a novel semi-matching-based algorithm. We demonstrate that this approach is both computationally efficient and effective. We expect the methods developed in this work to benefit Belle II and other complex workflows executed on distributed resources.
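To make the task-assignment stage concrete, here is a minimal sketch of a greedy semi-matching heuristic: every task is assigned to exactly one eligible resource, and the least-loaded resource is preferred, with ties broken by lower energy cost. This is not the algorithm from the paper, whose details the abstract does not give. The function name `greedy_semi_matching`, the site names, and the energy figures are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def greedy_semi_matching(eligible, energy_cost):
    """Assign each task to exactly one eligible resource (a semi-matching).

    eligible:    dict mapping task -> list of resources able to run it
    energy_cost: dict mapping resource -> energy per task (hypothetical units)
    Returns a dict mapping task -> chosen resource.
    """
    load = defaultdict(int)  # number of tasks currently placed on each resource
    assignment = {}
    # Place the most constrained tasks first (fewest eligible resources).
    for task in sorted(eligible, key=lambda t: (len(eligible[t]), t)):
        # Prefer the least-loaded resource; break ties by lower energy cost,
        # then by name so the result is deterministic.
        best = min(eligible[task], key=lambda r: (load[r], energy_cost[r], r))
        assignment[task] = best
        load[best] += 1
    return assignment


# Hypothetical example: four tasks, three sites.
eligible = {
    "t1": ["siteA"],
    "t2": ["siteA", "siteB"],
    "t3": ["siteA", "siteB"],
    "t4": ["siteB", "siteC"],
}
energy_cost = {"siteA": 3.0, "siteB": 1.0, "siteC": 2.0}
assignment = greedy_semi_matching(eligible, energy_cost)
print(assignment)
```

A greedy pass like this is fast but not guaranteed to be optimal. An exact semi-matching method would additionally reassign tasks along alternating paths to reduce the load imbalance.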

Published In

WORKS '15: Proceedings of the 10th Workshop on Workflows in Support of Large-Scale Science
November 2015
98 pages
ISBN:9781450339896
DOI:10.1145/2822332
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bi-objective optimization
  2. energy-efficient scheduling
  3. high energy physics
  4. large scale workflows
  5. modeling and simulation
  6. task assignment

Qualifiers

  • Research-article

Funding Sources

  • U. S. Department of Energy

Conference

SC15
Acceptance Rates

WORKS '15 Paper Acceptance Rate 9 of 13 submissions, 69%;
Overall Acceptance Rate 30 of 54 submissions, 56%

Article Metrics

  • Downloads (Last 12 months): 8
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 27 Jan 2025

Cited By

  • (2023) Scaling Optimal Allocation of Cloud Resources Using Lagrange Relaxation. Job Scheduling Strategies for Parallel Processing, pp. 173-192. DOI: 10.1007/978-3-031-43943-8_9. Online publication date: 15-Sep-2023
  • (2021) GDSim: Benchmarking Geo-Distributed Data Center Schedulers. 2021 IEEE 10th International Conference on Cloud Networking (CloudNet), pp. 148-156. DOI: 10.1109/CloudNet53349.2021.9657143. Online publication date: 8-Nov-2021
  • (2019) Stochastic Programming Approach for Resource Selection Under Demand Uncertainty. Job Scheduling Strategies for Parallel Processing, pp. 107-126. DOI: 10.1007/978-3-030-10632-4_6. Online publication date: 13-Jan-2019
  • (2018) Optimizing Distributed Data-Intensive Workflows. 2018 IEEE International Conference on Cluster Computing (CLUSTER), pp. 279-289. DOI: 10.1109/CLUSTER.2018.00045. Online publication date: Sep-2018
  • (2018) Towards Efficient Resource Allocation for Distributed Workflows Under Demand Uncertainties. Job Scheduling Strategies for Parallel Processing, pp. 103-121. DOI: 10.1007/978-3-319-77398-8_6. Online publication date: 28-Feb-2018
  • (2017) Integrating prediction, provenance, and optimization into high energy workflows. Journal of Physics: Conference Series, vol. 898, article 062052. DOI: 10.1088/1742-6596/898/6/062052. Online publication date: 21-Nov-2017
  • (2016) Efficient Genetic Algorithm Encoding for Large-Scale Multi-objective Resource Allocation. 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1360-1369. DOI: 10.1109/IPDPSW.2016.36. Online publication date: May-2016
