Improving cache performance with adaptive cache topologies and deferred coherence models
Publisher:
  • University of Florida
  • Gainesville, FL
  • United States
ISBN: 978-0-599-47853-4
Order Number: AAI9946009
Pages: 155
Abstract

Memory references exhibit locality and are therefore not uniformly distributed across the sets of a cache. This skew reduces the effectiveness of a cache because it results in the caching of a considerable number of less-recently used lines. This dissertation describes a technique that dynamically identifies these less-recently used lines so that their underutilized cache frames can be reclaimed, either to hold more-recently used lines or to further reduce the miss ratio through data prefetching. In the proposed design, the possible locations in which a line can reside are not predetermined. Instead, the cache is dynamically partitioned into groups. Because both the number of groups and the associativity of each group adapt to the dynamic reference pattern, this design is called the adaptive group-associative cache, and it makes more effective use of the cache frames. Performance evaluation shows that the group-associative cache achieves a hit ratio better than that of a 4-way set-associative cache; for some of the SPEC95 workloads, the hit ratio approaches that of a fully associative cache.
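The idea of reclaiming underutilized frames can be illustrated with a toy simulator: a direct-mapped cache in which a line displaced from a hot frame is saved into the globally least-recently-used frame, found through a small directory of out-of-position lines. The class and structure names below are illustrative assumptions for this sketch, not the dissertation's exact hardware design.

```python
class GroupAssociativeCache:
    """Toy sketch of reclaiming less-recently used frames; not the
    dissertation's exact design."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.frames = [None] * num_frames   # tag held by each frame
        self.out_of_position = {}           # tag -> frame holding it out of position
        self.lru = {}                       # frame -> None, oldest first (insertion order)

    def _touch(self, frame):
        # Move the frame to the most-recently-used end of the ordering.
        self.lru.pop(frame, None)
        self.lru[frame] = None

    def access(self, tag):
        """Return True on a hit, False on a miss."""
        home = tag % self.num_frames        # direct-mapped home frame
        if self.frames[home] == tag:        # hit in the home position
            self._touch(home)
            return True
        if tag in self.out_of_position:     # hit in a reclaimed frame
            self._touch(self.out_of_position[tag])
            return True
        # Miss: install the line in its home frame, but first try to save
        # the displaced line in an underutilized (globally LRU) frame.
        victim = self.frames[home]
        if victim is not None:
            self.out_of_position.pop(victim, None)
            cold = next(iter(self.lru))     # globally least-recently-used frame
            if cold != home:
                displaced = self.frames[cold]
                if displaced is not None:
                    self.out_of_position.pop(displaced, None)
                self.frames[cold] = victim
                self.out_of_position[victim] = cold
        self.frames[home] = tag
        self._touch(home)
        return False
```

In this sketch, two lines that map to the same home frame can coexist as long as some other frame in the cache is cold, which is how a direct-mapped organization borrows the conflict tolerance of a set-associative one.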

Private caches are critical for hiding memory access latency in high-performance multiprocessor systems. However, multiple processors may concurrently update distinct portions of the same cache line, causing unnecessary invalidations under traditional cache coherence protocols.

This dissertation research proposes a deferred cache coherence model that allows a cache line to be shared by multiple caches in an inconsistent state as long as the processors are guaranteed not to access any stale data. Writes to different portions of a cache line can be performed locally without invalidation, and an efficient mechanism reconciles the multiple inconsistent copies of a modified line to satisfy data dependences. This model minimizes cache coherence activity; simulation results show that it improves the performance of parallel applications by up to 30% over the conventional MESI and delayed coherence protocols.
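The reconciliation step can be sketched with per-byte dirty masks: each cache records which bytes of the line it wrote, and at a synchronization point the dirty bytes of every copy are merged into the line. For race-free programs the masks are disjoint, so the merge is well defined. The function name and mask representation here are assumptions for illustration, not the dissertation's protocol.

```python
def reconcile(base, copies):
    """Merge inconsistent copies of one cache line.

    base   -- the line's contents before the deferred writes (bytes)
    copies -- list of (data, mask) pairs, one per cache; mask[i] is True
              if that cache wrote byte i of the line
    Toy sketch of deferred-coherence reconciliation, not the exact design.
    """
    merged = bytearray(base)
    written = [False] * len(base)
    for data, mask in copies:
        for i, dirty in enumerate(mask):
            if dirty:
                # Race-free programs guarantee disjoint writers per byte.
                assert not written[i], "conflicting writers to the same byte"
                merged[i] = data[i]
                written[i] = True
    return bytes(merged)
```

For example, if one processor writes the first half of an 8-byte line and another writes the second half, both proceed locally without invalidation, and the merge recovers a single consistent line at the synchronization point.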

Contributors
  • University of Florida
