3 - Cache Memory
• Access time (latency): For random-access memory, this is the time it takes to
perform a read or write operation, that is, the time from the instant that an address
is presented to the memory to the instant that data have been stored or made
available for use. For non-random-access memory, access time is the time it takes
to position the read–write mechanism at the desired location.
• Memory cycle time: This concept is primarily applied to random-access memory
and consists of the access time plus any additional time required before a second
access can commence. This additional time may be required for transients to die
out on signal lines or to regenerate data if they are read destructively. Note that
memory cycle time is concerned with the system bus, not the processor.
• Transfer rate: This is the rate at which data can be transferred into or out of a
memory unit. For random-access memory, it is equal to 1/(cycle time).
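• As a rough illustration of the 1/(cycle time) relationship, the short C sketch below computes a transfer rate from a hypothetical 10 ns cycle time and a 32-bit word; both figures are made up for the example.

#include <stdio.h>

int main(void) {
    /* Hypothetical values, chosen only to illustrate the formula. */
    double cycle_time_s = 10e-9;   /* 10 ns memory cycle time      */
    double word_bits    = 32.0;    /* bits moved per memory cycle  */

    double words_per_second = 1.0 / cycle_time_s;   /* 1/(cycle time) */
    double bits_per_second  = words_per_second * word_bits;

    printf("Transfer rate: %.0f words/s (%.0f Mbit/s)\n",
           words_per_second, bits_per_second / 1e6);
    return 0;
}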
Computer Memory System Overview
• The dilemma facing the designer is clear. The designer would like to
use memory technologies that provide for large-capacity memory,
both because the capacity is needed and because the cost per bit is
low. However, to meet performance requirements, the designer needs
to use expensive, relatively lower-capacity memories with short
access times.
• The way out of this dilemma is not to rely on a single memory
component or technology, but to employ a memory hierarchy.
The Memory Hierarchy
• Figure 4.3b depicts the use of multiple levels of cache. The L2 cache is
slower and typically larger than the L1 cache, and the L3 cache is
slower and typically larger than the L2 cache.
Cache Memory Principles
• The length of a line, not including tag and control bits, is the line size.
• Because there are more blocks than lines, an individual line cannot be
uniquely and permanently dedicated to a particular block. Thus, each
line includes a tag that identifies which particular block is currently
being stored. The tag is usually a portion of the main memory address,
as described later.
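• As a minimal sketch of what one cache line holds, the C structure below pairs a tag and a valid bit with the data block; the names and the 64-byte line size are illustrative, not taken from any particular processor.

#include <stdint.h>
#include <stdbool.h>

#define LINE_SIZE 64                 /* bytes of data per line (illustrative) */

/* One cache line: the tag records which main-memory block is currently
   stored; the valid bit says whether the line holds any block at all.   */
struct cache_line {
    bool     valid;
    uint32_t tag;                    /* portion of the main memory address */
    uint8_t  data[LINE_SIZE];        /* the block itself; LINE_SIZE is the line size */
};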
Mapping Function
• Because there are fewer cache lines than main memory blocks, an
algorithm is needed for mapping main memory blocks into cache
lines. Further, a means is needed for determining which main memory
block currently occupies a cache line.
• Three techniques can be used: direct, associative, and set-associative.
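• As a sketch of how the three techniques place a block, the helper functions below give the usual rules, assuming block number j, m cache lines, and v sets: direct mapping allows only line j mod m, set-associative mapping allows any line of set j mod v, and fully associative mapping allows any line.

/* Standard placement rules for block number j (fully associative mapping
   has no formula: the block may go in any line, so every tag is searched). */
unsigned direct_mapped_line(unsigned j, unsigned m)  { return j % m; }  /* line i = j mod m */
unsigned set_associative_set(unsigned j, unsigned v) { return j % v; }  /* set   = j mod v  */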
Direct Mapping
• Each main memory address can be viewed as s + w bits: an s-bit identifier of a main memory block and a w-bit identifier of a word within that block.
• The cache logic interprets these s bits as a tag of s-r bits (most
significant portion) and a line field of r bits. This latter field identifies
one of the m = 2^r lines of the cache. To summarize,
• Address length = (s+w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^s; number of lines in cache m = 2^r
• Size of cache = 2^(r+w) words or bytes
• Size of tag = (s-r) bits
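• As a sketch of how these fields can be extracted from an address, the C fragment below assumes illustrative widths of w = 2, r = 14, and s = 22 (a 24-bit address, 2^14 lines, 4-byte blocks, an 8-bit tag); the address value itself is arbitrary.

#include <stdint.h>
#include <stdio.h>

/* Illustrative field widths: w = 2 word bits, r = 14 line bits, s = 22,
   so the tag is s - r = 8 bits and the full address is s + w = 24 bits. */
#define W_BITS 2
#define R_BITS 14

int main(void) {
    uint32_t addr = 0x16339Cu;                                /* arbitrary 24-bit address */

    uint32_t word = addr & ((1u << W_BITS) - 1);              /* low w bits     */
    uint32_t line = (addr >> W_BITS) & ((1u << R_BITS) - 1);  /* next r bits    */
    uint32_t tag  = addr >> (W_BITS + R_BITS);                /* top s - r bits */

    printf("tag=%u line=%u word=%u\n", tag, line, word);
    return 0;
}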
Replacement Algorithms
• Once the cache has been filled, when a new block is brought into
the cache, one of the existing blocks must be replaced.
• For direct mapping, there is only one possible line for any particular
block, and no choice is possible. For the associative and set-
associative techniques, a replacement algorithm is needed.
• A number of algorithms have been tried. We mention three of the
most common: LRU, LFU, and FIFO.
Replacement Algorithms
• Least recently used (LRU): Replace that block in the set that has
been in the cache longest with no reference to it.
• Because of its simplicity of implementation, LRU is the most popular
replacement algorithm.
• It is probably the most effective of these algorithms.
• First-in-first-out (FIFO): Replace that block in the set that has been
in the cache longest.
• Least frequently used (LFU): Replace that block in the set that has
experienced the fewest references.
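• As a rough sketch, the C fragment below selects the LRU victim within one set of a set-associative cache; the four-way set size, the field names, and the per-line counter are illustrative (real hardware typically uses cheaper approximations).

#include <stdint.h>
#include <stdbool.h>

#define WAYS 4                       /* lines per set (illustrative) */

struct way {
    bool     valid;
    uint32_t tag;
    uint64_t last_used;              /* counter value at the most recent reference */
};

/* Choose the line to replace under LRU: an invalid (empty) line if one
   exists, otherwise the line whose last reference is the oldest.        */
static int lru_victim(const struct way set[WAYS]) {
    int victim = 0;
    for (int i = 0; i < WAYS; i++) {
        if (!set[i].valid)
            return i;                               /* empty way: no replacement needed */
        if (set[i].last_used < set[victim].last_used)
            victim = i;                             /* older reference -> better victim  */
    }
    return victim;
}

• With the same structure, FIFO would compare the time each line was filled rather than last referenced, and LFU would keep and compare a per-line reference count instead.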
Write Policy