Cache Performance Average Memory Access Time
• Performance Enhancements
– Prefetch
– Memory Module Interleaving
– Load-Through
Cache Hit/Miss Rate and Miss Penalty
• Cache Hit:
– The access can be done in the cache.
– Hit Rate: The ratio of number of hits to all accesses.
• Hit rates over 0.9 are essential for high-performance PCs.
• Cache Miss:
– The access cannot be done in the cache.
– Miss Rate: The ratio of number of misses to all accesses.
– When a cache miss occurs, extra time is needed to bring blocks
from the slower main memory to the faster cache.
• During that time, the processor is stalled.
– Miss Penalty: the total access time seen by the processor
when a cache miss occurs.
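The two ratios defined above can be computed from simple access counters. The following is a minimal sketch; the function name and the counter values (950 hits, 50 misses) are assumptions made up for illustration, not figures from this lecture.

```python
def hit_and_miss_rate(hits, misses):
    """Return (hit rate, miss rate) as fractions of all accesses."""
    accesses = hits + misses
    if accesses == 0:
        raise ValueError("at least one access is required")
    return hits / accesses, misses / accesses


# Hypothetical counters, assumed for illustration only.
hit_rate, miss_rate = hit_and_miss_rate(hits=950, misses=50)
```

Every access is either a hit or a miss, so the two rates always sum to 1.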
Average memory access time
AMAT = (L1 Cache hit time * L1 Cache hit ratio) + (L2 Cache hit time * L2 Cache hit ratio) +
(Main Memory access time * Main Memory hit ratio)
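The weighted sum above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the level times, and the ratios in the example call are assumptions, not values from this lecture. Each ratio is taken to be the fraction of all accesses served at that level, so the ratios must sum to 1.

```python
def amat(levels):
    """Average memory access time as a weighted sum.

    levels: list of (access_time, fraction_of_accesses) pairs, one per
    level (for example L1 cache, L2 cache, main memory).
    """
    total_fraction = sum(fraction for _, fraction in levels)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("fractions of accesses must sum to 1")
    return sum(time * fraction for time, fraction in levels)


# Hypothetical values, assumed for illustration; times are in cycles.
example_levels = [
    (1, 0.90),    # L1 cache
    (10, 0.08),   # L2 cache
    (100, 0.02),  # main memory
]
example_amat = amat(example_levels)  # 0.9 + 0.8 + 2.0 cycles
```

Even with only 2% of accesses reaching main memory in this assumed setup, that term contributes more than half of the average.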
An Example of Miss Penalty
• Miss Penalty: the total access time seen by the processor
when a cache miss occurs.
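• Consider a system with only one level of cache with the
following parameters:
– Word access time to the cache: t
– Word access time to the main memory: 10t
– M: Miss Penalty
[Figure: CPU, Cache, and Main Memory, with access times t and 10t.]
• Speed-up from using the cache:
t_without / t_with = 1300 / 274 ≈ 4.74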
CSCI2510 Lec08: Cache Performance 13
Class Exercise 8.2
• Consider the same system with one level of cache.
– Word access time to the cache: 1 cycle
– Word access time to the main memory: 10 cycles
– Miss Penalty: 1 + 10 + 7 × 1 + 1 = 19 cycles
• What is the performance difference between this
cache and an ideal cache?
– Ideal Cache: All the accesses can be done in cache.
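One way to approach this comparison is sketched below. The cache access time and the miss penalty come from the exercise. The workload (100 instruction fetches and 30 data accesses) and the hit rates (0.95 and 0.90) are assumptions introduced here for illustration, chosen because they reproduce the 274-cycle figure in the earlier speed-up ratio. The function and variable names are hypothetical.

```python
CACHE_TIME = 1                     # cycles per cache access (from the exercise)
MISS_PENALTY = 1 + 10 + 7 * 1 + 1  # 19 cycles (from the exercise)


def total_cycles(accesses, hit_rate,
                 hit_time=CACHE_TIME, penalty=MISS_PENALTY):
    """Cycles for `accesses` memory accesses at the given hit rate."""
    return accesses * (hit_rate * hit_time + (1 - hit_rate) * penalty)


# Assumed workload and hit rates (not given in the exercise text).
instruction_fetches, instruction_hit_rate = 100, 0.95
data_accesses, data_hit_rate = 30, 0.90

real_cycles = (total_cycles(instruction_fetches, instruction_hit_rate)
               + total_cycles(data_accesses, data_hit_rate))

# Ideal cache: every access hits, so each costs only the cache time.
ideal_cycles = (instruction_fetches + data_accesses) * CACHE_TIME

slowdown = real_cycles / ideal_cycles
```

Under these assumptions the real cache needs about 274 cycles against 130 for the ideal cache, so it is roughly 2.1 times slower.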
[Figure: Processor, Cache, and Main Memory, shown first with block size B and then with a larger block size ("Larger B"); labeled "prefetch".]
Outline
• Performance Evaluation
– Cache Hit/Miss Rate and Miss Penalty
– Average Memory Access Time
• Performance Enhancements
– Prefetch
– Memory Module Interleaving
– Load-Through
[Figure: Two ways of distributing words across memory modules, each module with an address buffer register (ABR) and a data buffer register (DBR). (a) Consecutive words in the same module (modules 0 … i … n-1). (b) Consecutive words in successive modules (modules 0 … i … 2^k - 1).]
Example of Memory Module Interleaving
• Consider a cache read miss, and we need to load a
block of 8 words from main memory to the cache.
• Assume consecutive words are in successive modules for
better interleaving (i.e., Scheme (b)).
• For every memory module:
– Address Buffer Register & Data Buffer Register
– Module Operations:
• Send an address to ABR: 1 cycle
• Read the first word from module into DBR: 6 cycles
• Read a subsequent word from module into DBR: 4 cycles
• Read the data from DBR: 1 cycle
Assume a module read can proceed in parallel with an access to the ABR or DBR,
but only one of the ABR and DBR of a module can be accessed at a time.
Without Interleaving (Single Module)
• Total cycles to read a single word from the module:
– 1 cycle to send the address
– 6 cycles to read the first word
– 1 cycle to read the data from DBR
1 + 6 + 1 = 8 cycles
• Total cycles to read an 8-word block from the module:
1 + 6 + 4 × 7 + 1 = 36 cycles
[Timing diagram: cycles 1 to 36. The 1st word takes 1 + 6 + 1 cycles; each of the 2nd to 8th words takes 1 + 4 + 1 cycles. Sending an address proceeds in parallel with reading a word from the module, and reading data from DBR proceeds in parallel with reading a word.]
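The single-module counts above follow one pattern: one address cycle, 6 cycles for the first word, 4 cycles for each later word, and one final DBR read, with the remaining address sends and DBR reads overlapped with module reads. A minimal Python sketch of that arithmetic follows; the constant and function names are assumptions chosen here.

```python
ADDRESS_CYCLES = 1     # send an address to ABR
FIRST_WORD_CYCLES = 6  # read the first word from module into DBR
NEXT_WORD_CYCLES = 4   # read each subsequent word from module into DBR
DBR_READ_CYCLES = 1    # read the data from DBR


def single_module_read_cycles(words):
    """Cycles to read `words` consecutive words from one module.

    Only the first address send and the last DBR read are counted in
    full; the others overlap with module reads, as assumed above.
    """
    if words < 1:
        raise ValueError("words must be at least 1")
    return (ADDRESS_CYCLES + FIRST_WORD_CYCLES
            + NEXT_WORD_CYCLES * (words - 1) + DBR_READ_CYCLES)


single_word = single_module_read_cycles(1)  # 1 + 6 + 1
block_of_8 = single_module_read_cycles(8)   # 1 + 6 + 4 * 7 + 1
```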
With Interleaving
• Total cycles to read a
[Figure: Processor, Cache, and Main Memory.]
Load-through: forward the requested word to the processor
as soon as it is read from the main memory.
Summary
• Performance Evaluation
– Cache Hit/Miss Rate and Miss Penalty
– Average Memory Access Time
• Performance Enhancements
– Prefetch
– Memory Module Interleaving
– Load-Through