Space and time efficient execution of parallel irregular computations

Authors:

Tao Yang

PPOPP '97: Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming

Pages 57 - 68

https://doi.org/10.1145/263764.263773

Published: 21 June 1997

Abstract

Solving large problems is an important goal for parallel machines with multiple CPU and memory resources. This paper addresses the efficient execution of overhead-sensitive parallel irregular computations under memory constraints. Irregular parallelism is modeled by task dependence graphs with mixed granularities, and the trade-off between time efficiency and space efficiency is investigated. The main difficulty in designing efficient run-time system support stems from the use of the fast communication primitives available on modern parallel architectures. A run-time active memory management scheme and new scheduling techniques are proposed to improve memory utilization while retaining good time efficiency, and a theoretical analysis of correctness and performance is provided. This work is implemented in the context of the RAPID system [5], which provides run-time support for parallelizing irregular code on distributed memory machines. The effectiveness of the proposed techniques is verified on sparse Cholesky factorization and sparse LU factorization with partial pivoting. Experimental results on the Cray-T3D show that solvable problem sizes can be increased substantially under limited memory capacities, and that the loss of execution efficiency caused by the extra memory management overhead is reasonable.
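To make the time/space trade-off mentioned in the abstract concrete, the following is a minimal sketch and not the paper's algorithm or the RAPID implementation. It assumes a simplified single-processor model in which each task produces one data object, and that object can be freed once its last consumer has finished. The names `peak_memory`, `deps`, `size`, and `order` are hypothetical and chosen only for illustration. The sketch shows that two valid execution orders of the same task dependence graph can have different peak memory requirements.

```python
from collections import defaultdict


def peak_memory(deps, size, order):
    """Return peak live memory when tasks run sequentially in `order`.

    Assumptions (illustrative only, not taken from the paper):
      - deps[t] lists the tasks whose output t consumes (no duplicates);
      - size[t] is the memory taken by the output of task t;
      - an output is freed as soon as its last consumer has finished;
      - outputs with no consumers stay live until the end.
    """
    # Count how many consumers each output still has.
    remaining = defaultdict(int)
    for parents in deps.values():
        for p in parents:
            remaining[p] += 1

    done = set()
    live = 0
    peak = 0
    for task in order:
        parents = deps.get(task, [])
        # The order must respect the dependence graph.
        if any(p not in done for p in parents):
            raise ValueError(f"task {task!r} scheduled before its inputs")
        # Allocate the output while the inputs are still live.
        live += size[task]
        peak = max(peak, live)
        done.add(task)
        # Release inputs that no later task needs.
        for p in parents:
            remaining[p] -= 1
            if remaining[p] == 0:
                live -= size[p]
    return peak


if __name__ == "__main__":
    # Hypothetical graph: c needs a, d needs b, e needs c and d.
    deps = {"a": [], "b": [], "c": ["a"], "d": ["b"], "e": ["c", "d"]}
    size = {"a": 4, "b": 4, "c": 1, "d": 1, "e": 1}
    print(peak_memory(deps, size, ["a", "b", "c", "d", "e"]))  # 9
    print(peak_memory(deps, size, ["a", "c", "b", "d", "e"]))  # 6
```

In this toy model, running the consumer `c` right after its producer `a` lets the large output of `a` be released before `b` is allocated, which lowers the peak from 9 to 6 units. A real distributed-memory scheduler would also have to weigh such orderings against parallel time and communication cost, which this sketch ignores.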

References

[1]

G. E. Blelloch, P. B. Gibbons, and Y. Matias. Provably Efficient Scheduling for Languages with Fine-Grained Parallelism. In Proceedings of 7th ACM Symposium on Parallel Algorithms and Architectures, pages 1-12, July 1995.

[2]

R. Blumofe, C. Joerg, B. Kuszmaul, C. Leiserson, K. Randall, and Y. Zhou. Cilk: An Efficient Multithreaded Runtime System. In Proceedings of Fifth ACM Symposium on Principles and Practice of Parallel Programming, pages 207-216, July 1995.

[3]

S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the Benefits of Mixed Data and Task Parallelism. In Proceedings of 7th ACM Symposium on Parallel Algorithms and Architectures, pages 74-83, July 1995.

[4]

R. Cytron and J. Ferrante. What's in a Name? The Value of Renaming for Parallelism Detection and Storage Allocation. In Proceedings of International Conference on Parallel Processing, pages 19-27, February 1987.

[5]

C. Fu and T. Yang. Run-time Compilation for Parallel Sparse Matrix Computations. In Proceedings of ACM International Conference on Supercomputing, pages 237-244, Philadelphia, May 1996.

[6]

C. Fu and T. Yang. Sparse LU Factorization with Partial Pivoting on Distributed Memory Machines. In Proceedings of ACM/IEEE Supercomputing'96, Pittsburgh, November 1996.

[7]

C. Fu and T. Yang. Run-time Techniques for Exploiting Irregular Task Parallelism on Distributed Memory Architectures. Journal of Parallel and Distributed Computing, 1997. Accepted for publication. Also as UCSB technical report TRCS97-03.

[8]

A. Gerasoulis, J. Jiao, and T. Yang. Scheduling of Structured and Unstructured Computation. In D. Hsu, A. Rosenberg, and D. Sotteau, editors, Interconnection Networks and Mapping and Scheduling Parallel Computations, pages 139-172. American Math. Society, 1995.

[9]

M. Girkar and C. Polychronopoulos. Automatic Extraction of Functional Parallelism from Ordinary Programs. IEEE Transactions on Parallel and Distributed Systems, 3(2):166-178, 1992.

[10]

M. Ibel, K. E. Schauser, C. J. Scheiman, and M. Weis. Implementing Active Messages and Split-C for SCI Clusters and Some Architectural Implications. In Sixth International Workshop on SCI-based Low-cost/High-performance Computing, September 1996.

[11]

X. Li. Sparse Gaussian Elimination on High Performance Computers. PhD thesis, CS, UC Berkeley, 1996.

[12]

C. D. Polychronopoulos. Parallel Programming and Compilers. Kluwer Academic Publishers, 1988.

[13]

S. Ramaswamy, S. Sapatnekar, and P. Banerjee. A Convex Programming Approach for Exploiting Data and Functional Parallelism. In Proceedings of International Conference on Parallel Processing, pages 116-125, 1994.

[14]

E. Rothberg and R. Schreiber. Improved Load Distribution in Parallel Sparse Cholesky Factorization. In Proceedings of ACM/IEEE Supercomputing, pages 783-792, November 1994.

[15]

J. Saltz, K. Crowley, R. Mirchandaney, and H. Berryman. Run-Time Scheduling and Execution of Loops on Message Passing Machines. Journal of Parallel and Distributed Computing, 8:303-312, 1990.

[16]

V. Sarkar. Partitioning and Scheduling Parallel Programs for Execution on Multiprocessors. MIT Press, 1989.

[17]

R. Schreiber. Scalability of Sparse Direct Solvers. In Alan George, John R. Gilbert, and Joseph W.H. Liu, editors, Graph Theory and Sparse Matrix Computation, volume 56, pages 191-209. Springer-Verlag, New York, 1993.

[18]

T. Stricker, J. Stichnoth, D. O'Hallaron, S. Hinrichs, and T. Gross. Decoupling Synchronization and Data Transfer in Message Passing Systems of Parallel Computers. In Proceedings of ACM International Conference on Supercomputing, pages 1-10, Barcelona, July 1995.

[19]

R. Wolski and J. Feo. Program Partitioning for NUMA Multiprocessor Computer Systems. Journal of Parallel and Distributed Computing, 1993.

[20]

T. Yang and A. Gerasoulis. List Scheduling with and without Communication Delays. Parallel Computing, 19:1321-1344, 1992.

[21]

T. Yang and A. Gerasoulis. DSC: Scheduling Parallel Tasks on an Unbounded Number of Processors. IEEE Transactions on Parallel and Distributed Systems, 5(9):951-967, 1994. A short version is in Proceedings of Supercomputing'91.

Published In

PPOPP '97: Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming

June 1997

287 pages

ISBN:0897919068

DOI:10.1145/263764

Chairmen: Rob Schreiber (Hewlett-Packard Labs, Palo Alto, CA), Keshav Pingali (Cornell Univ., Ithaca, NY)
Editor: Michael A. Berman

ACM SIGPLAN Notices, Volume 32, Issue 7
July 1997
287 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/263767
Editor: A. Michael Berman

Copyright © 1997 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

PPoPP97

Sponsor:

SIGPLAN

PPoPP97: Principles & Practices of Parallel Programming

June 18 - 21, 1997

Las Vegas, Nevada, USA

Acceptance Rates

PPOPP '97 Paper Acceptance Rate 26 of 86 submissions, 30%;

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Bibliometrics

Total Citations: 9
Total Downloads: 375
Downloads (Last 12 months): 49
Downloads (Last 6 weeks): 16

Reflects downloads up to 12 Sep 2024

Citations

Cited By

Cosnard M., Jeannot E. (2019). Compact DAG Representation and Its Dynamic Scheduling. Journal of Parallel and Distributed Computing, 58:3, 487-514. Online publication date: 4-Jan-2019.
https://dl.acm.org/doi/10.1006/jpdc.1999.1566
Yang T., Gerasoulis A. (2014). Author retrospective for PYRROS. ACM International Conference on Supercomputing 25th Anniversary Volume, 18-20. Online publication date: 10-Jun-2014.
https://dl.acm.org/doi/10.1145/2591635.2591647
Tang H., Shen K., Yang T. (1999). Compile/run-time support for threaded MPI execution on multiprogrammed shared memory machines. ACM SIGPLAN Notices, 34:8, 107-118. Online publication date: 1-May-1999.
https://dl.acm.org/doi/10.1145/329366.301114
Tang H., Shen K., Yang T., Snir M., Chien A. (1999). Compile/run-time support for threaded MPI execution on multiprogrammed shared memory machines. Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming, 107-118. Online publication date: 1-May-1999.
https://dl.acm.org/doi/10.1145/301104.301114
Shen K., Jiao X., Yang T., Miller G., Gibbons P. (1998). Elimination forest guided 2D sparse LU factorization. Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, 5-15. Online publication date: 1-Jun-1998.
https://dl.acm.org/doi/10.1145/277651.277658
Cosnard M., Jeannot E., Rougeot L. (1998). Low memory cost dynamic scheduling of large coarse grain task graphs. Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing, 524-530. Online publication date: 1998.
https://doi.org/10.1109/IPPS.1998.669966
Cosnard M., Jeannot E., Yang T. (1998). Symbolic partitioning and scheduling of parameterized task graphs. Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250), 428-434. Online publication date: 1998.
https://doi.org/10.1109/ICPADS.1998.741109
Fu C., Jiao X., Yang T. (1998). Efficient Sparse LU Factorization with Partial Pivoting on Distributed Memory Architectures. IEEE Transactions on Parallel and Distributed Systems, 9:2, 109-125. Online publication date: 1-Feb-1998.
https://dl.acm.org/doi/10.1109/71.663864
Lee C., Wang Y., Yang T. (1997). Global Optimization for Mapping Parallel Image Processing Tasks on Distributed Memory Machines. Journal of Parallel and Distributed Computing, 45:1, 29-45. Online publication date: 25-Aug-1997.
https://dl.acm.org/doi/10.1006/jpdc.1997.1360
