Author retrospective for PYRROS: static task scheduling and code generation for message passing multiprocessors

Authors:

Apostolos Gerasoulis

ACM International Conference on Supercomputing 25th Anniversary Volume

Pages 18 - 20

https://doi.org/10.1145/2591635.2591647

Published: 10 June 2014

Abstract

Given a program with annotated task parallelism represented as a directed acyclic graph (DAG), the PYRROS project focused on fast DAG scheduling, code generation, and runtime execution on distributed-memory architectures. PYRROS scheduling proceeds through several processing stages, including clustering of tasks, cluster mapping, and task execution ordering. Since the publication of the PYRROS project, there have been advances in DAG scheduling algorithms, in the use of DAG scheduling for irregular and large-scale computation, and in software systems that support annotated task parallelism on modern parallel and cloud architectures. This retrospective describes our experience from the project and its follow-up work, and reviews representative papers on DAG scheduling published in the last decade.
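
The three scheduling stages named in the abstract (clustering, cluster mapping, task ordering) can be illustrated with a minimal Python sketch. The heuristics below are deliberately simple stand-ins chosen for illustration and are not the algorithms PYRROS uses. All function names, the `threshold` parameter, and the toy graph are assumptions introduced here.

```python
from collections import defaultdict

def topo_order(tasks, edges):
    """Return the tasks in one valid topological order (Kahn's algorithm)."""
    indeg = {t: 0 for t in tasks}
    succ = defaultdict(list)
    for (u, v) in edges:
        succ[u].append(v)
        indeg[v] += 1
    ready = [t for t in tasks if indeg[t] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(order) != len(indeg):
        raise ValueError("task graph contains a cycle")
    return order

def cluster_tasks(tasks, edges, threshold):
    """Stage 1 (toy heuristic): merge the endpoints of every edge whose
    communication cost is at least `threshold`, using union-find."""
    parent = {t: t for t in tasks}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for (u, v), cost in edges.items():
        if cost >= threshold:
            parent[find(u)] = find(v)
    clusters = defaultdict(list)
    for t in tasks:
        clusters[find(t)].append(t)
    return list(clusters.values())

def map_clusters(clusters, weights, num_procs):
    """Stage 2 (toy heuristic): place clusters, heaviest first, on the
    currently least-loaded processor."""
    load = [0] * num_procs
    proc_of = {}
    for c in sorted(clusters, key=lambda c: -sum(weights[t] for t in c)):
        p = min(range(num_procs), key=lambda i: load[i])
        load[p] += sum(weights[t] for t in c)
        for t in c:
            proc_of[t] = p
    return proc_of

def order_tasks(tasks, edges, proc_of, num_procs):
    """Stage 3 (toy heuristic): each processor runs its tasks in the order
    they appear in a single global topological order."""
    per_proc = [[] for _ in range(num_procs)]
    for t in topo_order(tasks, edges):
        per_proc[proc_of[t]].append(t)
    return per_proc

# Hypothetical toy DAG: task -> computation weight, edge -> communication cost.
weights = {"a": 2, "b": 3, "c": 3, "d": 1}
edges = {("a", "b"): 5, ("a", "c"): 1, ("b", "d"): 4, ("c", "d"): 1}

clusters = cluster_tasks(weights, edges, threshold=4)
proc_of = map_clusters(clusters, weights, num_procs=2)
schedule = order_tasks(weights, edges, proc_of, num_procs=2)
```

Restricting one global topological order to each processor keeps every dependence pointing forward in time, so the per-processor orders cannot deadlock. A production scheduler would instead choose the ordering with priorities that account for communication delays.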

Published In

ACM International Conference on Supercomputing 25th Anniversary Volume

June 2014

94 pages

ISBN: 9781450328401

DOI: 10.1145/2591635

Editor:
Utpal Banerjee

Copyright © 2014 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%
