Memory Cache: Computer Architecture and Organization
Performance
Access time: the time it takes the memory to complete a read or write operation.
Memory cycle time: the access time plus any additional time required before the next access can begin.
Transfer rate: the rate at which data can be transferred into or out of the memory.
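For a random-access memory, the transfer rate can be taken as the reciprocal of the cycle time. The short sketch below illustrates this with an assumed cycle time of 200 ns (the value is made up for the example):

```python
# Illustrative only: for random-access memory, transfer rate is often
# approximated as 1 / cycle_time (words transferred per second).
cycle_time_ns = 200                          # assumed example value
cycle_time_s = cycle_time_ns * 1e-9
transfer_rate_words_per_s = 1 / cycle_time_s
print(f"{transfer_rate_words_per_s:,.0f} words/second")  # 5,000,000
```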
Physical Type
Physical Characteristics
Volatile: memory that requires power to retain data; RAM is an example.
Non-volatile: memory that retains its data even when the device is powered off; examples include ROM, HDDs, and SSDs.
4.2 CACHE MEMORY PRINCIPLES
Cache memory is designed to combine the fast access time of expensive, high-speed memory with the large size of less expensive, lower-speed memory. The concept is shown below: a relatively large and slow main memory is paired with a smaller, faster cache memory.
The figure below depicts the use of multiple levels of cache. The L2 cache is slower and typically larger than the L1 cache, and the L3 cache is slower and typically larger than the L2 cache.
The cache contains a copy of portions of main memory. When the processor attempts to
read a word of memory, a check is made to determine if the word is in the cache. If so, the
word is delivered to the processor. If not, a block of main memory, consisting of some fixed
number of words, is read into the cache and then the word is delivered to the processor.
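The read behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not a hardware description; the dictionary-based cache, the toy main memory, and the block size of 4 words are all assumptions made for the example:

```python
# Minimal sketch of the cache read operation described above.
# 'cache' maps a block number to the list of words in that block.

BLOCK_SIZE = 4  # K words per block (assumed value)

def read_word(address, cache, main_memory):
    block_number = address // BLOCK_SIZE
    offset = address % BLOCK_SIZE
    if block_number in cache:                # cache hit: deliver the word
        return cache[block_number][offset]
    # Cache miss: read the whole block from main memory into the cache,
    # then deliver the requested word to the processor.
    start = block_number * BLOCK_SIZE
    cache[block_number] = main_memory[start:start + BLOCK_SIZE]
    return cache[block_number][offset]

main_memory = list(range(64))   # toy main memory of 64 words
cache = {}
print(read_word(10, cache, main_memory))   # miss: block fetched, returns 10
print(read_word(11, cache, main_memory))   # hit: same block, returns 11
```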
The figure below depicts the structure of a cache/main-memory system. Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address. For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each. The cache consists of m blocks, called lines; each line contains K words.
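As a quick numerical check of these relationships (the values n = 16, K = 4, and m = 128 are assumed for illustration, not taken from the text):

```python
# Sketch of the cache/main-memory structure parameters described above.
n = 16                      # bits per address (assumed)
K = 4                       # words per block (assumed)
m = 128                     # number of cache lines (assumed)

words_in_main_memory = 2 ** n                        # 65,536 addressable words
blocks_in_main_memory = words_in_main_memory // K    # 16,384 blocks
print(words_in_main_memory, blocks_in_main_memory)
# The cache holds only m = 128 of these 16,384 blocks at any one time,
# which is why a mapping function and a replacement policy are needed.
```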
CACHE ADDRESS
Most non-embedded processors, and many embedded processors, support virtual memory. Basically, virtual memory is a facility that allows programs to address memory from a logical point of view, regardless of the amount of main memory physically available. When virtual memory is used, the address field of the machine instruction contains the virtual address. To read from and write to memory, a hardware memory management unit (MMU) translates each virtual address into a physical address in main memory.
When virtual addresses are used, the system designer may choose to place the cache between the processor and the MMU or between the MMU and main memory. A logical cache, also known as a virtual cache, stores data using virtual addresses; the processor accesses the cache directly, without going through the MMU. A physical cache stores data using main memory physical addresses.
CACHE SIZE
The cache designer wants a cache small enough that the average cost per bit is close to that of main memory alone, and large enough that the average access time is close to that of the cache alone. There are also several motivations for minimizing cache size: the larger the cache, the larger the number of gates involved in addressing it. As a result, large caches tend to be slightly slower than small ones, even when built with the same IC technology and placed in the same position on the chip and circuit board. The available chip and board area also limits cache size. Because cache performance is very sensitive to the nature of the workload, it is not possible to arrive at a single "optimum" cache size.
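One way to make the price trade-off concrete is the usual average cost-per-bit calculation for a two-level memory; the costs and sizes below are assumed purely for illustration:

```python
# Average cost per bit of a combined cache + main memory system:
#   Cs = (C1*S1 + C2*S2) / (S1 + S2)
# where C1, C2 are cost per bit and S1, S2 are sizes of cache and main memory.

C1, S1 = 0.01, 32 * 1024           # cache: assumed cost/bit and size (bits)
C2, S2 = 0.0001, 8 * 1024 * 1024   # main memory: assumed cost/bit and size (bits)

Cs = (C1 * S1 + C2 * S2) / (S1 + S2)
print(f"average cost per bit = {Cs:.6f}")
# Because S2 >> S1, the average cost per bit stays close to that of main memory.
```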
MAPPING FUNCTION
Because there are fewer cache lines than main memory blocks, an algorithm is needed for mapping main memory blocks into cache lines; three techniques (direct, associative, and set-associative mapping) can be used. When the cache is full, a replacement policy decides which line to evict; two such policies are described below:
FIFO/LIFO: In FIFO, the blocks that entered the cache first are removed first, regardless of how often or how many times they were accessed before. LIFO behaves the other way around, removing the most recently added block from the cache.
Random: This technique does not use any usage information in determining which block to replace; whenever a replacement is needed, the block to evict is chosen at random.
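Two of the policies above, FIFO and random replacement, can be sketched as follows. This is a simplified software model (block numbers standing in for cache lines, with an assumed capacity of three lines), not a description of any particular hardware:

```python
import random
from collections import deque

# Illustrative FIFO and random replacement for a cache of fixed capacity.

def access_fifo(cache, order, block, capacity):
    """FIFO: on a miss with a full cache, evict the oldest resident block."""
    if block in cache:
        return "hit"
    if len(cache) >= capacity:
        cache.discard(order.popleft())          # evict first-in block
    cache.add(block)
    order.append(block)
    return "miss"

def access_random(cache, block, capacity):
    """Random: on a miss with a full cache, evict a block chosen at random."""
    if block in cache:
        return "hit"
    if len(cache) >= capacity:
        cache.discard(random.choice(list(cache)))
    cache.add(block)
    return "miss"

cache, order = set(), deque()
for b in [1, 2, 3, 1, 4, 1]:
    print(b, access_fifo(cache, order, b, capacity=3))
```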
LINE SIZE
When a block of data is retrieved and placed in the cache, not only the desired word but also some number of adjacent words are retrieved. At first, larger blocks increase the hit ratio because of the principle of locality; however, as the block becomes even bigger, the probability of using the newly fetched information becomes less than the probability of reusing the information that has to be replaced. Two specific effects come into play:
1. Larger blocks reduce the number of blocks that fit into a cache.
2. As a block becomes larger, each additional word is farther from the requested word and therefore less likely to be needed in the near future.
The relationship between block size and hit ratio is complex, depending on the locality characteristics of a particular program, and no definitive optimum value has been found.
NUMBER OF CACHES
MULTILEVEL CACHES
A multilevel cache is one technique for improving cache performance by reducing the miss penalty. The term miss penalty refers to the extra time required to bring data into the cache from main memory whenever there is a miss in the cache.
Typically, most contemporary designs include both an on-chip (internal, level 1 (L1)) cache and an external cache (level 2 (L2)). The reason is that if there is no L2 cache and the processor makes an access request for a memory location not in the L1 cache, the processor must access DRAM or ROM memory across the bus. On the other hand, if an L2 SRAM (static RAM) cache is used, the missing information can frequently be retrieved quickly.
Two features of contemporary cache design for multilevel caches are noteworthy.
1. Many designs use a separate data path, rather than the system bus, for transfers between the L2 cache and the processor, to reduce the burden on the system bus.
2. With the continued shrinkage of processor components, a number of processors now incorporate the L2 cache on the processor chip, improving performance.
Originally, the L3 cache was accessible over the external bus. More recently, most microprocessors have incorporated an on-chip L3 cache.
[Figure: total hit ratio (L1 and L2) for 8-Kbyte and 16-Kbyte L1 caches]
UNIFIED VERSUS SPLIT CACHES
More recently, it has become common to split the cache into two: one dedicated to
instructions and one dedicated to data. These two caches both exist at the same
level, typically as two L1 caches. There are two potential advantages of a unified
cache:
1. For a given cache size, a unified cache has a higher hit rate than split caches
because it balances the load between instruction and data fetches automatically.
2. Only one cache needs to be designed and implemented.
The key advantage of the split cache design is that it eliminates contention for the
cache between the instruction fetch/decode unit and the execution unit.
4.4 PENTIUM 4 CACHE ORGANIZATION
The Pentium 4 is a seventh generation microprocessor created by Intel Corporation and
released in November 2000 following the Intel Pentium III processor.
Level 1 cache: a split cache, 8 KB in size and four-way set associative, meaning that each set consists of four lines in the cache. The line size is 64 bytes.
Level 2 cache: a unified cache, 256 KB in size and eight-way set associative, meaning that each set consists of eight lines in the cache. The line size is 128 bytes.
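Given those parameters, the number of sets in each cache follows directly from size = number of sets x associativity x line size; the short check below assumes that relationship:

```python
# Number of sets = cache size / (associativity * line size)
l1_sets = (8 * 1024) // (4 * 64)       # L1: 8 KB, 4-way, 64-byte lines
l2_sets = (256 * 1024) // (8 * 128)    # L2: 256 KB, 8-way, 128-byte lines
print(l1_sets, l2_sets)                # 32 sets and 256 sets
```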
Fetch/decode unit: fetches program instructions in order from the L2 cache, decodes them into a series of micro-operations, and stores the results in the L1 instruction cache.
Out-of-order execution logic: schedules execution of the micro-operations subject to data dependencies and resource availability; thus micro-operations may be scheduled for execution in a different order than they were fetched from the instruction stream.
Execution units: these units execute micro-operations, fetching required data from the L1 data cache and temporarily storing results in registers.
Memory subsystem: this unit includes the L2 and L3 caches and the system bus, which is used to access main memory when the L1 and L2 caches have a cache miss, and to access the system I/O resources.
APPENDIX 4A: PERFORMANCE CHARACTERISTICS OF TWO-LEVEL MEMORIES
This two-level architecture exploits a property known as locality to provide improved performance compared with a single level of memory. The main memory cache mechanism is part of the computer architecture, implemented in hardware and usually invisible to the operating system. There are two other examples of a two-level memory approach that also exploit locality and that are, at least in part, implemented in the operating system: virtual memory and the disk cache.
The basis for the superior performance of two-level memory is a principle known as locality of reference. This principle states that memory references tend to cluster. Over a long period of time, the clusters in use change, but over a short period of time, the processor works mainly within a fixed set of clusters of memory references.
A distinction is made between two forms of locality:
Spatial locality refers to the tendency of execution to involve a number of clustered memory locations, for example when the processor accesses instructions sequentially; it can be exploited by using larger cache blocks and by incorporating prefetching mechanisms into the cache control logic.
Temporal locality refers to the tendency of the processor to access memory locations that have been used recently; it can be exploited by keeping recently used values in the cache and by using a cache hierarchy.
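As a purely illustrative example of both forms of locality: the sequential sweep over the array below exhibits spatial locality, while the repeated reuse of the running total exhibits temporal locality:

```python
# Spatial locality: elements of 'data' are accessed sequentially, so
# neighbouring memory locations are referenced close together in time.
# Temporal locality: 'total' (and the loop variable) are reused on every
# iteration, so the same locations are referenced again and again.
data = list(range(1000))
total = 0
for value in data:
    total += value
print(total)   # 499500
```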
The locality property can be exploited in the formation of a two-level memory. The upper-level memory (M1) is smaller, faster, and more expensive (per bit) than the lower-level memory (M2). M1 is used as temporary storage for part of the contents of the larger M2. When a memory reference is made, an attempt is first made to access the item in M1.
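The performance benefit of this arrangement can be quantified with the usual two-level average access time expression Ts = H * T1 + (1 - H) * (T1 + T2); the hit ratio and access times below are assumed values chosen only for illustration:

```python
# Average access time of a two-level memory:
#   Ts = H * T1 + (1 - H) * (T1 + T2)
# H  = fraction of accesses found in M1 (hit ratio)
# T1 = access time of M1 (cache), T2 = access time of M2 (main memory)

H, T1, T2 = 0.95, 0.01e-6, 0.1e-6    # assumed: 95% hits, 10 ns, 100 ns
Ts = H * T1 + (1 - H) * (T1 + T2)
print(f"average access time = {Ts * 1e9:.1f} ns")   # 15.0 ns
# With a high hit ratio, the average access time stays close to T1.
```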