
Cache Memory

Computer Organization (2022/2023)


Eng. Hossam Mady
Teaching Assistant and Researcher at Aswan Faculty of Engineering
Computer Memory System Overview

• The term location refers to whether memory is internal or external to the computer. Internal memory is often equated with main memory; cache is another form of internal memory.
• External memory consists of peripheral storage devices, such as disk
and tape, that are accessible to the processor via I/O controllers.
• An obvious characteristic of memory is its capacity. For internal
memory, this is typically expressed in terms of bytes or words. External
memory capacity is typically expressed in terms of bytes.
Computer Memory System Overview
• A related concept is the unit of transfer. For internal memory, the
unit of transfer is equal to the number of electrical lines into and out of
the memory module.
• Another distinction among memory types is the method of accessing units of data. These include the following: sequential access, direct access, random access, and associative access.
• From a user’s point of view, the two most important characteristics of
memory are capacity and performance. Three performance
parameters are used: Access time, memory cycle time, and transfer
rate.
Method of Accessing
• Sequential access: Memory is organized into units of data, called records.
Access must be made in a specific linear sequence. Access time is variable.
• Direct access: Individual blocks or records have a unique address based on
physical location. Access is accomplished by direct access to reach a general
vicinity plus sequential searching, counting, or waiting to reach the final
location. Access time is variable.
• Random access: Each addressable location in memory has a unique, physically wired-in addressing mechanism. The time to access a given location is independent of the sequence of prior accesses; access time is constant.
• Associative: A word is retrieved based on a portion of its contents rather
than its address.
Performance Parameters

• Access time (latency): For random-access memory, this is the time it takes to
perform a read or write operation, that is, the time from the instant that an address
is presented to the memory to the instant that data have been stored or made
available for use. For non-random-access memory, access time is the time it takes
to position the read–write mechanism at the desired location.
• Memory cycle time: This concept is primarily applied to random-access memory
and consists of the access time plus any additional time required before a second
access can commence. This additional time may be required for transients to die
out on signal lines or to regenerate data if they are read destructively. Note that
memory cycle time is concerned with the system bus, not the processor.
• Transfer rate: This is the rate at which data can be transferred into or out of a
memory unit. For random-access memory, it is equal to 1/(cycle time).
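
For non-random-access memory, a standard textbook relation (not spelled out on the slide) gives the average time T_N to read or write N bits:

T_N = T_A + N / R

where T_A is the average access time to the desired location and R is the transfer rate in bits per second.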
Computer Memory System Overview

• A variety of physical types of memory have been employed. The most common forms are:
• Semiconductor memory
• Magnetic surface memory: used for disk and tape
• Optical
• Magneto-optical
Computer Memory System Overview
• Several physical characteristics of data storage are important. In a
volatile memory, information decays naturally or is lost when electrical
power is switched off. In a nonvolatile memory, information once
recorded remains without deterioration until deliberately changed; no
electrical power is needed to retain information.
• Magnetic-surface memories are nonvolatile. Semiconductor memory
may be either volatile or nonvolatile.
• Nonerasable memory cannot be altered, except by destroying the
storage unit.
The Memory Hierarchy

• The design constraints on a computer’s memory can be summed up by three questions: How much? How fast? How expensive?
• As might be expected, there is a trade-off among the three key
characteristics of memory: namely, capacity, access time, and cost.
• Faster access time, greater cost per bit
• Greater capacity, smaller cost per bit
• Greater capacity, slower access time
The Memory Hierarchy

• The dilemma facing the designer is clear. The designer would like to
use memory technologies that provide for large-capacity memory,
both because the capacity is needed and because the cost per bit is
low. However, to meet performance requirements, the designer needs
to use expensive, relatively lower-capacity memories with short
access times.
• The way out of this dilemma is not to rely on a single memory
component or technology, but to employ a memory hierarchy.
The Memory Hierarchy

• Thus, smaller, more expensive, faster memories are supplemented by larger, cheaper, slower memories.
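
A short worked example of why the hierarchy pays off (a standard two-level analysis; the numbers below are assumptions for illustration, not from the slides). If a fraction H of accesses are satisfied by the faster level with access time T_1, and the remainder must also access the slower level with access time T_2, the average access time is:

T_s = H × T_1 + (1 − H) × (T_1 + T_2)

For instance, with T_1 = 0.01 µs, T_2 = 0.1 µs, and H = 0.95: T_s = 0.95(0.01) + 0.05(0.11) = 0.015 µs, close to the speed of the fast memory alone.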
Cache Memory Principles

• The concept is illustrated in Figure 4.3a. There is a relatively large and slow main memory together with a smaller, faster cache memory. The cache contains a copy of portions of main memory.
• When the processor attempts to read a word of memory, a check is
made to determine if the word is in the cache. If so, the word is
delivered to the processor. If not, a block of main memory, consisting
of some fixed number of words, is read into the cache and then the
word is delivered to the processor.
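
A minimal Python sketch of this read-check behavior (illustrative only: the cache is modeled as an unbounded dict, deferring capacity and mapping questions, and names such as read_word and main_memory are assumptions, not from the slides):

```python
def read_word(address, cache, main_memory, block_size):
    """Return the word at `address`, loading its block into the cache on a miss."""
    block_number = address // block_size      # which main-memory block holds the word
    if block_number not in cache:             # miss: read the whole fixed-size block
        start = block_number * block_size
        cache[block_number] = main_memory[start:start + block_size]
    offset = address % block_size             # position of the word within its block
    return cache[block_number][offset]        # deliver the word to the processor
```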
Cache Memory Principles

• Figure 4.3b depicts the use of multiple levels of cache. The L2 cache is
slower and typically larger than the L1 cache, and the L3 cache is
slower and typically larger than the L2 cache.
Cache Memory Principles

[Figure-only slides: cache/main-memory structure.]

• The previous figure depicts the structure of a cache/main-memory system. Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address.
• For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each. That is, there are M = 2^n / K blocks in main memory. The cache consists of m blocks, called lines. Each line contains K words, plus a tag of a few bits.
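
A quick numeric illustration (values assumed, not from the slide): with n = 16 and K = 4, main memory holds 2^16 = 65,536 words grouped into M = 2^16 / 2^2 = 2^14 = 16,384 blocks.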
Cache Memory Principles

• The length of a line, not including tag and control bits, is the line size.
• Because there are more blocks than lines, an individual line cannot be
uniquely and permanently dedicated to a particular block. Thus, each
line includes a tag that identifies which particular block is currently
being stored. The tag is usually a portion of the main memory address,
as described later.
Mapping Function

• Because there are fewer cache lines than main memory blocks, an
algorithm is needed for mapping main memory blocks into cache
lines. Further, a means is needed for determining which main memory
block currently occupies a cache line.
• Three techniques can be used: direct, associative, and set-associative.
Direct Mapping

• Direct mapping: The simplest technique, which maps each block of main memory into only one possible cache line. The mapping is expressed as:

i = j modulo m
where
i = cache line number,
j = main memory block number,
m = number of lines in the cache.
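
A quick sketch of the placement rule in Python (the cache size m = 4 is assumed for illustration):

```python
m = 4                            # assume a cache of m = 4 lines
for j in range(12):              # main-memory block numbers 0..11
    i = j % m                    # i = j modulo m
    print(f"block {j:2d} -> line {i}")
# blocks 0, 4, 8 all contend for line 0; blocks 1, 5, 9 for line 1; and so on
```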
Direct Mapping

• For purposes of cache access, each main memory address can be viewed as consisting of three fields.
• The least significant w bits identify a unique word or byte within a block of main memory; in most contemporary machines, the address is at the byte level.
• The remaining s bits specify one of the 2^s blocks of main memory.

Tag (s - r bits) | Line number (r bits) | Word (w bits)
Direct Mapping
• The cache logic interprets these s bits as a tag of s - r bits (most significant portion) and a line field of r bits. This latter field identifies one of the m = 2^r lines of the cache. To summarize,
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) bytes or words
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^s; number of lines in cache m = 2^r
• Size of cache = 2^(r+w) words or bytes
• Size of tag = (s - r) bits
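
A sketch of how the cache logic slices an address into these fields (the helper name split_direct is ours, and the example widths, a 24-bit address with s = 22, r = 14, w = 2, are assumed for illustration):

```python
def split_direct(address, s, r, w):
    """Split an (s + w)-bit address into (tag, line, word) fields."""
    word = address & ((1 << w) - 1)                    # least significant w bits
    line = (address >> w) & ((1 << r) - 1)             # next r bits: cache line number
    tag = (address >> (w + r)) & ((1 << (s - r)) - 1)  # most significant s - r bits
    return tag, line, word

tag, line, word = split_direct(0x16339C, s=22, r=14, w=2)
```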
Direct Mapping

• When a block is actually read into its assigned line, it is necessary to tag the data to distinguish it from other blocks that can fit into that line. The most significant s - r bits (tag) serve this purpose.
• The direct mapping technique is simple and inexpensive to
implement. Its main disadvantage is that there is a fixed cache
location for any given block. Thus, if a program happens to reference
words repeatedly from two different blocks that map into the same
line, then the blocks will be continually swapped in the cache, and the
hit ratio will be low (a phenomenon known as thrashing).
Associative Mapping

• Associative mapping overcomes the disadvantage of direct mapping by permitting each main memory block to be loaded into any line of the cache.
• In this case, the cache control logic interprets a memory address
simply as a Tag and a Word field. The Tag field uniquely identifies a
block of main memory. To determine whether a block is in the cache,
the cache control logic must simultaneously examine every line’s tag
for a match.
Tag (s bits) | Word (w bits)
Associative Mapping
• Note that no field in the address corresponds to the line number, so
that the number of lines in the cache is not determined by the address
format. To summarize:
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) bytes or words
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^s
• Number of lines in cache = undetermined
• Size of tag = s bits
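
A sketch of that tag examination, serialized in Python for clarity (real hardware compares every line's tag simultaneously; the names here are assumptions):

```python
def associative_lookup(lines, tag):
    """lines: a list of (stored_tag, block) pairs, one per cache line."""
    for stored_tag, block in lines:    # hardware performs these compares in parallel;
        if stored_tag == tag:          # this loop merely serializes them
            return block               # hit
    return None                        # miss
```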
Associative Mapping

• With associative mapping, there is flexibility as to which block to replace when a new block is read into the cache. Replacement algorithms are discussed later.
• The principal disadvantage of associative mapping is the complex
circuitry required to examine the tags of all cache lines in parallel.
Set-Associative Mapping

• Set-associative mapping is a compromise that exhibits the strengths of both the direct and associative approaches while reducing their disadvantages.
• In this case, the cache consists of a number of sets, each of which consists of a number of lines. The relationships are:

m = v × k
i = j modulo v
Set-Associative Mapping
m = v × k
i = j modulo v
where
i = cache set number,
j = main memory block number,
m = number of lines in the cache,
v = number of sets,
k = number of lines in each set.
• This is referred to as k-way set-associative mapping.
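
Combining the two relations in a short Python sketch (a hypothetical 2-way cache with v = 4 sets; for simplicity the block number itself serves as the tag, and a full set evicts its oldest line, one of the policies discussed below):

```python
v, k = 4, 2                          # 4 sets of 2 lines each: m = v * k = 8 lines
cache = [[] for _ in range(v)]       # each set holds up to k (tag, block) entries

def access(j, block):
    """Access main-memory block j; returns True on hit, False on miss."""
    i = j % v                        # i = j modulo v: the only set block j may occupy
    if any(tag == j for tag, _ in cache[i]):
        return True                  # hit within the set's k lines
    if len(cache[i]) == k:           # set full: evict the oldest entry (FIFO)
        cache[i].pop(0)
    cache[i].append((j, block))
    return False
```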
Set-Associative Mapping

• With set-associative mapping, block Bj can be mapped into any of the lines of set j.
• The following figure illustrates this mapping for the first v blocks of
main memory. As with associative mapping, each word maps into
multiple cache lines. For set-associative mapping, each word maps
into all the cache lines in a specific set, so that main memory block
B0 maps into set 0, and so on.
Set-Associative Mapping

[Figure-only slide: mapping of the first v blocks of main memory into sets.]
Replacement Algorithms

• Once the cache has been filled, when a new block is brought into
the cache, one of the existing blocks must be replaced.
• For direct mapping, there is only one possible line for any particular
block, and no choice is possible. For the associative and set-
associative techniques, a replacement algorithm is needed.
• A number of algorithms have been tried. We mention three of the
most common: LRU, LFU, and FIFO.
Replacement Algorithms

• Least recently used (LRU): Replace that block in the set that has been in the cache longest with no reference to it.
• Because of its simplicity of implementation, LRU is the most popular replacement algorithm, and it is probably the most effective.
• First-in-first-out (FIFO): Replace that block in the set that has been
in the cache longest.
• Least frequently used (LFU): Replace that block in the set that has
experienced the fewest references.
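
A minimal sketch of LRU for a single k-way set, using Python's OrderedDict to track recency (an illustration of the policy, not of a hardware implementation):

```python
from collections import OrderedDict

class LRUSet:
    """One k-way cache set with least-recently-used replacement."""
    def __init__(self, k):
        self.k = k
        self.lines = OrderedDict()          # tag -> block, least recent first

    def access(self, tag, block):
        if tag in self.lines:
            self.lines.move_to_end(tag)     # touched: now most recently used
            return True                     # hit
        if len(self.lines) == self.k:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[tag] = block
        return False                        # miss
```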
Write Policy

• When a block that is resident in the cache is to be replaced, there are two cases to consider. If the old block in the cache has not been altered, then it may be overwritten with a new block without first writing out the old block.
• If at least one write operation has been performed on a word in that
line of the cache, then main memory must be updated by writing
the line of cache out to the block of memory before bringing in the
new block.
Write Policy
• The simplest technique is called write through. Using this
technique, all write operations are made to main memory as well as
to the cache.
• The main disadvantage of this technique is that it generates
substantial memory traffic and may create a bottleneck.
• An alternative technique, known as write back, minimizes memory
writes. With write back, updates are made only in the cache. When
an update occurs, a dirty bit, or use bit, associated with the line is
set. Then, when a block is replaced, it is written back to main
memory if and only if the dirty bit is set.
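
A sketch of the write-back bookkeeping (names are assumptions; under write through, write_word would also update main memory on every write instead of setting a dirty bit):

```python
class Line:
    """One cache line carrying the dirty bit used by write back."""
    def __init__(self, block_number, data):
        self.block_number = block_number   # which main-memory block is cached
        self.data = data
        self.dirty = False                 # set once the line is altered

def write_word(line, offset, value):
    line.data[offset] = value              # write back: update the cache only
    line.dirty = True

def replace(line, main_memory, block_size):
    """Before a line is overwritten, flush it if and only if it is dirty."""
    if line.dirty:
        start = line.block_number * block_size
        main_memory[start:start + block_size] = line.data
```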
Unified Versus Split Caches
• More recently, it has become common to split the cache into two:
one dedicated to instructions and one dedicated to data.
• Advantages of unified cache:
• Higher hit rate
• Balances load of instruction and data fetches automatically
• Only one cache needs to be designed and implemented
• Advantages of split cache:
• Eliminates cache contention between instruction fetch/decode unit and execution unit.
• Important in pipelining.
