
Chores: enhanced run-time support for shared-memory parallel computing

Published: 01 February 1993

Abstract

Parallel computing is increasingly important in the solution of large-scale numerical problems. The difficulty of efficiently hand-coding parallelism, and the limitations of parallelizing compilers, have nonetheless restricted its use by scientific programmers.
In this paper we propose a new paradigm, chores, for the run-time support of parallel computing on shared-memory multiprocessors. We consider specifically uniform memory access shared-memory environments, although the chore paradigm should also be appropriate for use within the clusters of a large-scale nonuniform memory access machine.
We argue that chore systems attain both the high efficiency of compiler approaches for the common case of data parallelism, and the flexibility and performance of user-level thread approaches for functional parallelism. These benefits are achieved within a single, simple conceptual model that almost entirely relieves the programmer and compiler from concerns of granularity, scheduling, and enforcement of synchronization constraints. Measurements of a prototype implementation demonstrate that the chore model can be supported more efficiently than can traditional approaches to either data or functional parallelism alone.
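The paper's actual interface is not reproduced on this page, but the division of labor the abstract describes (the programmer supplies only the parallel body, while the runtime handles granularity, scheduling, and worker management) can be illustrated with a minimal self-scheduling sketch in C++. The names here (`run_chore`, `chunk`) are hypothetical and are not the paper's API.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical sketch, not the paper's API: the caller supplies only the
// body of a data-parallel chore; the runtime spawns workers that hand out
// index chunks from a shared counter (simple self-scheduling), so the
// programmer never assigns iterations to threads or picks a schedule.
void run_chore(std::size_t n, std::size_t chunk,
               const std::function<void(std::size_t)>& body) {
    std::atomic<std::size_t> next{0};
    auto worker = [&] {
        for (;;) {
            std::size_t start = next.fetch_add(chunk);
            if (start >= n) break;
            std::size_t end = std::min(start + chunk, n);
            for (std::size_t i = start; i < end; ++i) body(i);
        }
    };
    unsigned p = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < p; ++t) workers.emplace_back(worker);
    for (auto& w : workers) w.join();
}
```

Under this arrangement a data-parallel loop such as vector addition reduces to `run_chore(n, 64, [&](std::size_t i){ c[i] = a[i] + b[i]; });` with no explicit thread management in user code.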

References

[1]
ALVERSON, G. A., AND NOTKIN, D. Program structuring for effective parallel portability. IEEE Trans. Parallel Distrib. Syst. To appear.
[2]
ANDERSON, T. E., LAZOWSKA, E. D., AND LEVY, H. M. The performance implications of thread management alternatives for shared-memory multiprocessors. IEEE Trans. Comput. C-38, 12 (Dec. 1989), 1631-1644.
[3]
ANDERSON, T. E., BERSHAD, B. N., LAZOWSKA, E. D., AND LEVY, H. M. Scheduler activations: Effective kernel support for the user-level management of parallelism. ACM Trans. Comput. Syst. 10, 1 (Feb. 1992), 53-79.
[4]
BERSHAD, B. N., LAZOWSKA, E. D., AND LEVY, H. M. PRESTO: A system for object-oriented parallel programming. Softw. Pract. Exper. 18, 8 (Aug. 1988), 713-732.
[5]
CARRIERO, N., AND GELERNTER, D. How to write parallel programs: A guide to the perplexed. ACM Comput. Surv. 21, 3 (Sept. 1989), 323-357.
[6]
DOEPPNER, T. W., JR. Threads: A system for the support of concurrent programming. Tech. Rep. CS-87-11, Dept. of Computer Science, Brown Univ., 1987.
[7]
DRAVES, R., AND COOPER, E. C threads. Tech. Rep. CMU-CS-88-154, School of Computer Science, Carnegie Mellon Univ., June 1988.
[8]
EAGER, D. L., AND ZAHORJAN, J. Adaptive guided self-scheduling. Tech. Rep. 92-01-01, Dept. of Computer Science and Engineering, Univ. of Washington, Jan. 1992.
[9]
EDLER, J., LIPKIS, J., AND SCHONBERG, E. Process management for highly parallel UNIX systems. Ultracomputer Note 136, Courant Inst. of Mathematical Sciences, New York Univ., Apr. 1988.
[10]
JORDAN, H. The Force. Tech. Rep. ECE-87-1-1, Dept. of Electrical and Computer Engineering, Univ. of Colorado, Jan. 1987.
[11]
JORDAN, H., BENTEN, M., ALAGHBAND, G., AND JACOB, R. The Force: A highly portable parallel programming language. In Proceedings of the 1989 International Conference on Parallel Processing (St. Charles, Ill., Aug. 1989), pp. II-112-II-117.
[12]
KALE, L. V. The Chare kernel parallel programming language and system. In Proceedings of the 1990 International Conference on Parallel Processing (Aug. 1990), pp. II-17-II-25.
[13]
KARP, A. H., AND BABB, R. G., II. A comparison of 12 parallel Fortran dialects. IEEE Softw. 5, 5 (Sept. 1988), 52-66.
[14]
MARSH, B. D., SCOTT, M. L., LEBLANC, T. J., AND MARKATOS, E. P. First-class user-level threads. In Proceedings of the 13th ACM Symposium on Operating Systems Principles (Pacific Grove, Calif., Oct. 1991), pp. 110-121.
[15]
MOHR, E., KRANZ, D. A., AND HALSTEAD, R. H., JR. Lazy task creation: A technique for increasing the granularity of parallel programs. IEEE Trans. Parallel Distrib. Syst. 2, 3 (July 1991), 264-280.
[16]
MOLLER-NIELSEN, P., AND STAUNSTRUP, J. Problem-heap: A paradigm for multiprocessor algorithms. Parallel Comput. 4, 1 (Feb. 1987), 63-74.
[17]
PANCAKE, C. M., AND BERGMARK, D. Do parallel languages respond to the needs of scientific programmers? IEEE Comput. 23, 12 (Dec. 1990), 13-24.
[18]
POLYCHRONOPOULOS, C. D., AND KUCK, D. J. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Trans. Comput. C-36, 12 (Dec. 1987), 1425-1439.
[19]
POLYCHRONOPOULOS, C. D., GIRKAR, M., HAGHIGHAT, M. R., LEE, C. L., LEUNG, B., AND SCHOUTEN, D. Parafrase-2: An environment for parallelizing, partitioning, synchronizing, and scheduling programs on multiprocessors. In Proceedings of the 1989 International Conference on Parallel Processing (St. Charles, Ill., Aug. 1989), pp. II-39-II-48.
[20]
POLYCHRONOPOULOS, C.D. Auto-scheduling: Control flow and data flow come together. Tech. Rep. CSRD-TR-1058, Center for Supercomputing Research and Development, Univ. of Illinois, Nov. 1990.
[21]
REISER, M., AND LAVENBERG, S. S. Mean value analysis of closed multichain queueing networks. J. ACM 27, 2 (Apr. 1980), 313-322.
[22]
TEVANIAN, A., RASHID, R., GOLUB, D., BLACK, D., COOPER, E., AND YOUNG, M. Mach threads and the UNIX kernel: The battle for control. In Proceedings of the 1987 USENIX Summer Conference (Phoenix, Ariz., 1987), pp. 185-197.
[23]
THACKER, C., STEWART, L., AND SATTERTHWAITE, E., JR. Firefly: A multiprocessor workstation. IEEE Trans. Comput. 37, 8 (Aug. 1988), 909-920.
[24]
THOMAS, R. H., AND CROWTHER, W. The uniform system: An approach to runtime support for large scale shared memory parallel processors. In Proceedings of the 1988 International Conference on Parallel Processing (St. Charles, Ill., Aug. 1988), pp. 245-254.
[25]
TOOMEY, L. J., PLACHY, E. C., SCARBOROUGH, R. G., SAHULKA, R. J., SHAW, J. F., AND SHANNON, A. W. IBM Parallel Fortran. IBM Syst. J. 27, 4 (Dec. 1988), 416-435.
[26]
TUCKER, A., AND GUPTA, A. Process control and scheduling issues for multiprogrammed shared-memory multiprocessors. In Proceedings of the 12th ACM Symposium on Operating Systems Principles (Litchfield Park, Ariz., Dec. 1989), pp. 159-166.
[27]
VANDEVOORDE, M., AND ROBERTS, E. WorkCrews: An abstraction for controlling parallelism. Int. J. Parallel Program. 17, 4 (Aug. 1988), 347-366.
[28]
WEISER, M., DEMERS, A., AND HAUSER, C. The portable common runtime approach to interoperability. In Proceedings of the 12th ACM Symposium on Operating Systems Principles (Litchfield Park, Ariz., Dec. 1989), pp. 114-122.
[29]
WOLFE, M. Optimizing Supercompilers for Supercomputers. Pitman, 1989.
[30]
ZAHORJAN, J., LAZOWSKA, E. D., AND EAGER, D. L. The effect of scheduling discipline on spin overhead in shared memory parallel processors. IEEE Trans. Parallel Distrib. Syst. 2, 2 (Apr. 1991), 180-198.


Reviews

Brett D. Fleisch

The Chores system provides runtime support for parallel computing. This research paper attempts to bridge a gap between the compiler community and the operating systems community. The former has focused on modifications to programming languages that express loop-based parallelism, while the latter has focused on threads: concurrent streams of execution, each with its own execution context and stack.

Eager and Zahorjan argue that the existing programming models for parallel applications, along with the tools developed to support them, have been only partially successful, citing the underutilization of parallel computing by scientists. Compiler approaches lack flexibility, while thread-based approaches have been too low-level and have required too much explicit management of the details of parallelism. The Chores system is proposed to bridge this gap. It attempts to unify the expressiveness and performance of the loop-based parallelism model for the common case of data parallelism with the flexibility and performance of user-level threads for functional parallelism.

The Chores model is similar to the work-heap approach, except that each work-heap item in Chores is partitionable and precedence can be specified. The model resembles the Chameleon system of Alverson and Notkin [1], but includes support for blocking synchronization, lightweight partitioning of partitionable chores, and dependency specifications among the atoms of a partitionable chore. The paper appears to be a reasonable addition to the work on these approaches, given the currency and importance of thread systems and data-parallel languages. It is written in a reasonably clear manner, and examples are given in C++.
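The review's description of the model (a work heap whose items are partitionable into atoms, with precedence between chores) can be sketched roughly as follows in C++. The `Chore` struct and `run` function are illustrative inventions for this sketch, not the paper's actual interface.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Illustrative sketch only (not the paper's interface): a chore is a
// partitionable unit of work made of n independent atoms, optionally
// followed by a successor chore that must not start until every atom
// of this chore has finished (a simple precedence constraint).
struct Chore {
    std::size_t n;                          // number of atoms
    std::function<void(std::size_t)> atom;  // body of one atom
    Chore* successor = nullptr;             // chore enabled on completion
};

// Workers pull atom indices from a shared counter; the join() acts as a
// barrier that enforces the precedence edge before the successor runs.
void run(Chore& c, unsigned p = 4) {
    std::atomic<std::size_t> next{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < p; ++t)
        workers.emplace_back([&] {
            for (;;) {
                std::size_t i = next.fetch_add(1);
                if (i >= c.n) break;
                c.atom(i);
            }
        });
    for (auto& w : workers) w.join();
    if (c.successor) run(*c.successor, p);
}
```

In this sketch a producer chore that fills an array can name a consumer chore as its successor, and the runtime guarantees the consumer's atoms see a fully written array, which is the kind of dependency specification the review attributes to the model.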


Published In

ACM Transactions on Computer Systems  Volume 11, Issue 1
Feb. 1993
106 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/151250

Publisher

Association for Computing Machinery

New York, NY, United States


Cited By

  • (2020)Analyzing the Performance Trade-Off in Implementing User-Level ThreadsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2020.297605731:8(1859-1877)Online publication date: 1-Aug-2020
  • (2019)GeckoProceedings of the 10th International Workshop on Programming Models and Applications for Multicores and Manycores10.1145/3303084.3309489(21-30)Online publication date: 17-Feb-2019
  • (2018)Lessons learned from analyzing dynamic promotion for user-level threadingProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.5555/3291656.3291687(1-12)Online publication date: 11-Nov-2018
  • (2018)Lessons learned from analyzing dynamic promotion for user-level threadingProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.1109/SC.2018.00026(1-12)Online publication date: 11-Nov-2018
  • (2008)A portable runtime interface for multi-level memory hierarchiesProceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming10.1145/1345206.1345229(143-152)Online publication date: 20-Feb-2008
  • (2007)MODELING LINEAR DYNAMICAL SYSTEMS BY CONTINUOUS-VALUED CELLULAR AUTOMATAInternational Journal of Modern Physics C10.1142/S012918310701058918:05(833-848)Online publication date: May-2007
  • (2006)SequoiaProceedings of the 2006 ACM/IEEE conference on Supercomputing10.1145/1188455.1188543(83-es)Online publication date: 11-Nov-2006
  • (2006)Sequoia: Programming the Memory HierarchyACM/IEEE SC 2006 Conference (SC'06)10.1109/SC.2006.55(4-4)Online publication date: Nov-2006
  • (2005)Dynamic partitioning in different distributed-memory environmentsJob Scheduling Strategies for Parallel Processing10.1007/BFb0022297(244-270)Online publication date: 15-Jun-2005
  • (2005)Using runtime measured workload characteristics in parallel processor schedulingJob Scheduling Strategies for Parallel Processing10.1007/BFb0022293(155-174)Online publication date: 15-Jun-2005
