DOI: 10.5555/1267638.1267653

Distributed filaments: efficient fine-grain parallelism on a cluster of workstations

Published: 14 November 1994

Abstract

A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as iterative grid computations, recursive fork/join programs, the bodies of parallel FOR loops, and the implicit parallelism in functional or dataflow languages. It is useful both to describe massively parallel computations and as a target for code generation by compilers. However, fine-grain parallelism has long been thought to be inefficient due to the overheads of process creation, context switching, and synchronization. This paper describes a software kernel, Distributed Filaments (DF), that implements fine-grain parallelism both portably and efficiently on a workstation cluster. DF runs on existing, off-the-shelf hardware and software. It has a simple interface, so it is easy to use. DF achieves efficiency by using stateless threads on each node, overlapping communication and computation, employing a new reliable datagram communication protocol, and automatically balancing the work generated by fork/join computations.
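
To make the idea of stateless, fine-grain threads more concrete, the following is a minimal single-node sketch in Python. It is not DF's interface or implementation: the names Filament, jacobi_point, run_pool, and jacobi_sweep are hypothetical, and the descriptor layout (a function plus its arguments) is an assumption made only for illustration. It shows one sweep of an iterative grid computation in which each grid-point update is its own tiny unit of work, run to completion on the caller's stack so that no per-thread stack or context switch is needed.

```python
from typing import Callable, List, Tuple

# Hypothetical descriptor for one fine-grain unit of work: a function plus
# its arguments. It carries no private stack and no saved registers, which
# is the sense in which it is "stateless" in this sketch.
Filament = Tuple[Callable[..., None], tuple]


def jacobi_point(old: List[List[float]], new: List[List[float]],
                 i: int, j: int) -> None:
    """Update a single interior grid point from its four neighbours."""
    new[i][j] = 0.25 * (old[i - 1][j] + old[i + 1][j]
                        + old[i][j - 1] + old[i][j + 1])


def run_pool(pool: List[Filament]) -> None:
    """Run each unit to completion, in order, on the caller's stack."""
    for fn, args in pool:
        fn(*args)


def jacobi_sweep(old: List[List[float]]) -> List[List[float]]:
    """Perform one sweep: create one unit per interior point, then run them."""
    n = len(old)
    new = [row[:] for row in old]  # boundary values are carried over
    pool: List[Filament] = [
        (jacobi_point, (old, new, i, j))
        for i in range(1, n - 1)
        for j in range(1, n - 1)
    ]
    run_pool(pool)
    return new


grid = [
    [1.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]
result = jacobi_sweep(grid)
```

Because every unit reads only the old grid and writes one cell of the new grid, the units within a sweep are independent and could run in any order. The sketch leaves out everything distributed (communication, the datagram protocol, and load balancing), which the abstract names but does not detail.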

Published In

OSDI '94: Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
November 1994
228 pages

Publisher

USENIX Association

United States