Parallel workstations, each a shared memory machine with 10 to 100 processors, promise cost-effective general-purpose multiprocessing. This thesis explores the coupling of such small- to medium-scale shared memory multiprocessors through software over a local area network to synthesize larger shared memory systems. Multiprocessors built in this fashion are called Distributed Scalable Shared Memory Multiprocessors (DSSMPs).
The challenge of building DSSMPs lies in seamlessly extending the hardware-supported shared memory of each parallel workstation to span a cluster of parallel workstations using software only. Such a shared memory system is called Multigrain Shared Memory because it naturally supports two grains of sharing: fine-grain cache-line sharing within each parallel workstation, and coarse-grain page sharing across parallel workstations. Applications that can leverage the efficient fine-grain shared memory support provided by each parallel workstation have the potential for high performance.
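The two grains can be illustrated with a toy model. The sketch below is purely illustrative and hypothetical (it is not the MGS implementation): within a node, a read that hits a locally resident page stands in for hardware-supported fine-grain sharing, while a cross-node access triggers a software page fetch under a simple single-writer invalidation protocol, standing in for coarse-grain page sharing.

```python
# Illustrative toy model of multigrain sharing. All class and method
# names here are hypothetical, invented for this sketch.

class Node:
    """One parallel workstation; pages held here are shared at fine
    grain by the node's processors (modeled as free local access)."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.pages = {}                  # page_id -> data

class MultigrainDSM:
    """Software layer coupling nodes at page (coarse) granularity."""
    def __init__(self, num_nodes):
        self.nodes = [Node(i) for i in range(num_nodes)]
        self.copies = {}                 # page_id -> set of holder node ids
        self.remote_fetches = 0          # count of coarse-grain page fetches

    def write(self, node_id, page_id, data):
        # Single-writer protocol: invalidate every other node's copy.
        for holder in self.copies.get(page_id, set()) - {node_id}:
            self.nodes[holder].pages.pop(page_id, None)
        self.copies[page_id] = {node_id}
        self.nodes[node_id].pages[page_id] = data

    def read(self, node_id, page_id):
        node = self.nodes[node_id]
        if page_id in node.pages:
            # Fine grain: page already resident on this node; hardware
            # cache-coherence handles sharing among its processors.
            return node.pages[page_id]
        # Coarse grain: fetch the whole page from a current holder.
        self.remote_fetches += 1
        holder = next(iter(self.copies[page_id]))
        data = self.nodes[holder].pages[page_id]
        node.pages[page_id] = data
        self.copies[page_id].add(node_id)
        return data
```

In this model, repeated reads on the writing node cost nothing extra, while the first read from another node pays one page fetch; a subsequent write invalidates the remote copy, forcing another fetch. This mirrors why applications with mostly intra-node sharing can approach all-hardware performance.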
This thesis makes three contributions in the context of Multigrain Shared Memory. First, it provides the design of a multigrain shared memory system, called MGS, and demonstrates its feasibility and correctness via an implementation on a 32-processor Alewife machine. Second, it undertakes an in-depth application study that quantifies the extent to which shared memory applications can leverage the efficient shared memory mechanisms provided by DSSMPs; the study begins with the performance of unmodified shared memory programs, and then investigates application transformations that improve performance. Finally, this thesis presents an approach called Synchronization Analysis for analyzing the performance of multigrain shared memory systems. The thesis develops a performance model based on Synchronization Analysis, and uses the model to study DSSMPs with up to 512 processors. The experiments and analysis demonstrate that scalable DSSMPs built from small-scale workstation nodes can achieve performance competitive with large-scale all-hardware shared memory systems. For instance, the model predicts that a 256-processor DSSMP built from 16-processor parallel workstation nodes matches the performance of a 128-processor all-hardware multiprocessor on a communication-intensive workload. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)