Design and Performance of the Software-Controlled COMA
Publisher:
  • University of Southern California
  • Computer Science Dept. 200 University Park Los Angeles, CA
  • United States
ISBN: 978-0-599-00013-1
Order Number: AAI9902847
Pages: 159
Abstract

Traditionally, cache coherence in multiprocessors has been maintained in hardware. However, the cost-effectiveness of hardware protocols for Distributed Shared Memory (DSM) systems is questionable. Virtual Shared Memory systems have highlighted the many advantages of software-implemented protocols, albeit at a performance price. The performance gap is narrowed by hybrid systems with software-implemented coherence protocols and hardware support for fine-grain access control.

This work contains the first proposal and evaluation of a hybrid COMA (Cache-Only Memory Architecture). The system is called SC-COMA, for Software-Controlled COMA, to emphasize that the protocol engine is emulated by software executed on the main processor. Unlike user-level protocols, the software handling coherence events in SC-COMA runs in sub-kernel mode, transparently and efficiently providing the same services to applications as a hardware counterpart. SC-COMA employs a novel coherence protocol, optimized for a hybrid implementation, which has been fully implemented. The support for fine-grain access control is embedded in the memory controller.
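The hybrid division of labor described above (a hardware access check per memory block, with coherence actions performed by software) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the SC-COMA protocol: the `Tag` states, the `Node` class, the 64-byte block size, and the handler behavior are all hypothetical and chosen for illustration.

```python
from enum import Enum


class Tag(Enum):
    """Hypothetical per-block access-control states."""
    INVALID = 0
    READ_ONLY = 1
    READ_WRITE = 2


BLOCK_SIZE = 64  # bytes per coherence block (assumed value)


class Node:
    def __init__(self):
        self.tags = {}   # block number -> Tag; a missing entry means INVALID
        self.faults = 0  # how many times the software handler was invoked

    def _handler(self, block, want_write):
        # Stand-in for a software protocol handler. A real handler would
        # exchange messages with other nodes before granting access.
        self.faults += 1
        self.tags[block] = Tag.READ_WRITE if want_write else Tag.READ_ONLY

    def access(self, addr, write=False):
        # Models the check a memory controller would perform on each access.
        block = addr // BLOCK_SIZE
        tag = self.tags.get(block, Tag.INVALID)
        allowed = tag is Tag.READ_WRITE or (tag is Tag.READ_ONLY and not write)
        if not allowed:
            self._handler(block, write)
        return self.tags[block]


node = Node()
node.access(0)              # first read of block 0: handler runs
node.access(8)              # same block, read permitted: no handler
node.access(8, write=True)  # write to a read-only block: handler runs
```

In this model, software is involved only when the tag check fails; accesses that already hold sufficient permission complete without invoking the handler.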

The evaluation methodology is based on execution-driven simulation of complete applications from the SPLASH-2 suite. Results show that SC-COMA is competitive and a viable solution for easily transforming networks of workstations into powerful multiprocessors. On systems with 32 processors, it achieves a slowdown of 11-56% with respect to an aggressive hardware counterpart, across a range of applications and memory overheads. Scalability is good, and faster processors favorably affect the performance. An investigation of the impact of memory organization on the performance of hybrid systems reveals that, in most of a wide range of cases, COMA outperforms the alternatives (CC-NUMA, Simple COMA, and RC-NUMA) because of its lower node miss ratio.

The performance of SC-COMA is further improved by three techniques: relaxed inclusion, mastership hints, and replacement hints. Even more significant improvements are obtained by adapting the SC-COMA approach to other hardware platforms: symmetric multiprocessor (SMP) nodes and processors with non-blocking stores.

Contributors
  • University of Southern California
