Memory references exhibit locality and are therefore not uniformly distributed across the sets of a cache. This skew reduces the effectiveness of a cache because a considerable number of less-recently used lines remain cached. This dissertation describes a technique that dynamically identifies these less-recently used lines and effectively utilizes the frames they occupy: the underutilized frames can hold more-recently used lines, and they can also be used for data prefetching to further reduce the miss ratio. In the proposed design, the locations where a line can reside are not predetermined; instead, the cache is dynamically partitioned into groups. Because both the number of groups and the associativity of each group adapt to the dynamic reference pattern, the design is called the adaptive group-associative cache. Performance evaluation shows that the group-associative cache achieves a hit ratio better than that of a 4-way set-associative cache; for some of the SPEC95 workloads, the hit ratio approaches that of a fully associative cache.
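The idea can be illustrated with a toy simulator. The sketch below is not the dissertation's exact design; it assumes a direct-mapped cache augmented with a small out-of-position directory (the class and field names are illustrative), so that a line displaced by a conflict can be parked in a less-recently used frame instead of being evicted, forming a dynamic group for that set.

```python
from collections import OrderedDict

class GroupAssociativeCache:
    """Toy model of a group-associative cache: a direct-mapped cache
    whose under-utilized (least-recently used) frames can be borrowed
    to hold lines displaced from other sets."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.frames = [None] * num_frames  # tag resident in each frame
        self.out_dir = {}                  # tag -> frame, for out-of-position lines
        self.lru = OrderedDict()           # frame order: least recently used first

    def _touch(self, frame):
        # Mark a frame as most recently used.
        self.lru.pop(frame, None)
        self.lru[frame] = None

    def access(self, tag):
        """Return True on a hit, False on a miss."""
        home = tag % self.num_frames
        if self.frames[home] == tag:        # hit in the home frame
            self._touch(home)
            return True
        if tag in self.out_dir:             # hit in a borrowed frame
            self._touch(self.out_dir[tag])
            return True
        # Miss: instead of discarding the home frame's resident line,
        # displace it into the least-recently used frame, sacrificing
        # whatever less-recently used line was held there.
        victim_tag = self.frames[home]
        if victim_tag is not None and self.lru:
            lru_frame = next(iter(self.lru))
            if lru_frame != home:
                evicted = self.frames[lru_frame]
                if evicted is not None:
                    self.out_dir.pop(evicted, None)
                self.frames[lru_frame] = victim_tag
                self.out_dir[victim_tag] = lru_frame
        self.out_dir.pop(tag, None)
        self.frames[home] = tag
        self._touch(home)
        return False
```

With four frames, tags 0 and 4 conflict in set 0; after the conflict miss on 4, a re-reference to 0 still hits in its borrowed frame, whereas a plain direct-mapped cache would miss.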
Private caches are a critical component for hiding memory access latency in high-performance multiprocessor systems. However, under traditional cache coherence protocols, multiple processors that concurrently update distinct portions of the same cache line cause unnecessary cache invalidations.
This dissertation proposes a deferred cache coherence model that allows a cache line to be shared in multiple caches in an inconsistent state as long as the processors are guaranteed not to access any stale data. Multiple write requests to different portions of a cache line can then be performed locally, without invalidation. An efficient mechanism reconciles the multiple inconsistent copies of the modified line to satisfy data dependences. This model minimizes cache coherence activity: simulation results show that it improves the performance of parallel applications by up to 30% over the conventional MESI and delayed coherence protocols.
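The reconciliation step can be sketched as a word-level merge. This is a simplified model, not the dissertation's mechanism: it assumes each writer tracks which words of the line it modified with a per-word dirty mask, and that no two writers touch the same word (the false-sharing case the deferred model targets).

```python
def reconcile(clean_copy, modified_copies):
    """Merge inconsistent copies of one cache line.

    clean_copy      -- the line's words before any of the local writes
    modified_copies -- list of (words, dirty_mask) pairs, one per cache;
                       dirty_mask[i] is truthy iff that cache wrote word i
    """
    merged = list(clean_copy)
    for words, dirty_mask in modified_copies:
        for i, dirty in enumerate(dirty_mask):
            if dirty:
                # Take each word from the cache that actually wrote it.
                merged[i] = words[i]
    return merged
```

For example, if one cache wrote word 0 and another wrote word 3 of a four-word line, the merged line carries both updates without either cache ever invalidating the other's copy.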