COA - Unit 5
RAM - Random Access Memory: As the name suggests, RAM or random access memory is a form
of semiconductor memory technology that is used for reading and writing data in any order – in other
words, as it is required by the processor. It is used for applications such as computer or processor
memory, where variables and other data are stored and are required on a random basis. Data is written
to and read from this type of memory many times.
ROM - Read Only Memory: A ROM is a form of semiconductor memory technology used where the
data is written once and then not changed. In view of this, it is used where data needs to be stored
permanently, even when the power is removed – many memory technologies lose their data once the power
is removed. As a result, this type of semiconductor memory technology is widely used for storing
programs and data that must survive when a computer or processor is powered down. For example, the
BIOS of a computer is best stored in ROM. As the name implies, data cannot be easily written to
ROM. Depending on the technology used in the ROM, writing the data into the ROM initially may require
special hardware. Although it is often possible to change the data, this again requires special hardware to
erase the existing data ready for new data to be written in.
The different memory types or memory technologies are detailed below:
DRAM: Dynamic RAM is a form of random access memory. DRAM uses a capacitor to store each bit
of data, and the level of charge on each capacitor determines whether that bit is a logical 1 or 0. However,
these capacitors do not hold their charge indefinitely, and therefore the data needs to be refreshed
periodically. As a result of this dynamic refreshing, it gains its name of dynamic RAM. DRAM
is the form of semiconductor memory that is often used in equipment including personal computers and
workstations, where it forms the main RAM for the computer.
EEPROM: This is an Electrically Erasable Programmable Read Only Memory. Data can be written to it
and it can be erased using an electrical voltage. This is typically applied to an erase pin on the chip. Like
other types of PROM, EEPROM retains the contents of the memory even when the power is turned off.
Also like other types of ROM, EEPROM is not as fast as RAM.
EPROM: This is an Erasable Programmable Read Only Memory. This form of semiconductor memory
can be programmed and then erased at a later time. This is normally achieved by exposing the silicon to
ultraviolet light. To enable this to happen, there is a circular window in the package of the EPROM to
allow the light to reach the silicon of the chip. When the EPROM is in use, this window is normally
covered by a label, especially when the data may need to be preserved for an extended period. The EPROM
stores its data as a charge on a capacitor. There is a charge storage capacitor for each cell, and this can be
read repeatedly as required. However, it is found that after many years the charge may leak away and the
data may be lost. Nevertheless, this type of semiconductor memory used to be widely used in applications
where a form of ROM was required but where the data needed to be changed periodically, as in a
development environment, or where quantities were low.
F-RAM: Ferroelectric RAM is a random-access memory technology that has many similarities to the
standard DRAM technology. The major difference is that it incorporates a ferroelectric layer instead of
the more usual dielectric layer and this provides its non-volatile capability. As it offers a non-volatile
capability, F-RAM is a direct competitor to Flash.
P-RAM/PCM: This type of semiconductor memory is known as Phase change Random Access
Memory, P-RAM, or just Phase Change Memory, PCM. It is based around a phenomenon where a form of
chalcogenide glass changes its state or phase between an amorphous state (high resistance) and
a polycrystalline state (low resistance). It is possible to detect the state of an individual cell and hence use this for
data storage. Currently this type of memory has not been widely commercialized, but it is expected to be
a competitor for flash memory.
PROM: This stands for Programmable Read Only Memory. It is a semiconductor memory which
can only have data written to it once – the data written to it is permanent. These memories are bought in
a blank format and they are programmed using a special PROM programmer. Typically a PROM
will consist of an array of fusible links, some of which are "blown" during the programming process
to provide the required data pattern.
SDRAM: Synchronous DRAM. This form of semiconductor memory can run at faster speeds than
conventional DRAM. It is synchronised to the clock of the processor and is capable of keeping two
sets of memory addresses open simultaneously. By transferring data alternately from one set of
addresses and then the other, SDRAM cuts down on the delays associated with non-synchronous RAM,
which must close one address bank before opening the next.
SRAM: Static Random Access Memory. This form of semiconductor memory gains its name from
the fact that, unlike DRAM, the data does not need to be refreshed dynamically. It is able to support
faster read and write times than DRAM (typically 10 ns against 60 ns for DRAM), and in addition its
cycle time is much shorter because it does not need to pause between accesses. However, it consumes
more power, is less dense, and is more expensive than DRAM. As a result of this, it is normally used for
caches, while DRAM is used as the main semiconductor memory technology.
MEMORY ORGANIZATION
Memory Interleaving:
Pipeline and vector processors often require simultaneous access to memory from two or more
sources. An instruction pipeline may require the fetching of an instruction and an operand at the same
time from two different segments.
Similarly, an arithmetic pipeline usually requires two or more operands to enter the pipeline at the
same time. Instead of using two memory buses for simultaneous access, the memory can be partitioned
into a number of modules connected to common memory address and data buses. A memory module is
a memory array together with its own address and data registers. Figure 9-13 shows a memory unit with
four modules. Each memory array has its own address register AR and data register DR.
The address registers receive information from a common address bus and the data registers
communicate with a bidirectional data bus. The two least significant bits of the address can be used to
distinguish between the four modules. The modular system permits one module to initiate a memory
access while other modules are in the process of reading or writing a word, and each module can honor a
memory request independent of the state of the other modules.
The advantage of a modular memory is that it allows the use of a technique called interleaving. In
an interleaved memory, different sets of addresses are assigned to different memory modules. For
example, in a two-module memory system, the even addresses may be in one module and the odd
addresses in the other.
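As a rough illustration (the module count is taken from the four-module example above; the addresses are hypothetical), the Python sketch below shows low-order interleaving: the two least significant address bits select the module, so consecutive addresses fall in different modules and their accesses can be overlapped.

```python
# Low-order interleaving across 4 modules: the two least significant
# bits of an address select the module; the remaining bits select the
# word within that module. (Illustrative sketch, not a specific machine.)

NUM_MODULES = 4  # a power of two, so the module number is just the low bits

def interleaved_location(address: int) -> tuple[int, int]:
    module = address % NUM_MODULES   # low 2 bits -> module number
    offset = address // NUM_MODULES  # remaining bits -> word within module
    return module, offset

# Consecutive addresses land in different modules, so a pipeline can
# overlap accesses to them.
for addr in range(8):
    m, off = interleaved_location(addr)
    print(f"address {addr}: module {m}, word {off}")
```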
Why do we use Memory Interleaving?
Memory interleaving is used to improve the performance of computer systems by increasing memory
bandwidth. It involves dividing memory into multiple banks and accessing them in a round-robin fashion,
which allows for simultaneous access to multiple memory locations and reduces the time required to access
data.
Advantages of Interleaved Memory
• Reduced memory access latency: Interleaving allows the processor to access the memory in a more
efficient manner, thereby reducing the memory access latency. This results in faster data access and
improved system responsiveness.
• Increased memory capacity: Interleaving enables the use of more memory modules, thereby increasing
the total memory capacity of the system. This is particularly useful in systems that require a large
amount of memory, such as servers and high-performance workstations.
• Improved reliability: Interleaving provides a level of redundancy in the memory system. In case of a
failure of one memory module, the system can continue to function by using the remaining modules.
• Better error correction: Interleaving makes it easier to detect and correct errors in the memory system.
By spreading the data across multiple memory modules, errors can be isolated and corrected more
easily.
Memory Hierarchy
1. Registers
Registers are small, high-speed memory units located in the CPU. They are used to store the most
frequently used data and instructions. Registers have the fastest access time and the smallest storage
capacity, typically ranging from 16 to 64 bits.
2. Cache Memory
Cache memory is a small, fast memory unit located close to the CPU. It stores frequently used data and
instructions that have been recently accessed from the main memory. Cache memory is designed to
minimize the time it takes to access data by providing the CPU with quick access to frequently used data.
3. Main Memory
Main memory, also known as RAM (Random Access Memory), is the primary memory of a computer
system. It has a larger storage capacity than cache memory, but it is slower. Main memory is used to store
data and instructions that are currently in use by the CPU.
Types of Main Memory
• Static RAM: Static RAM stores the binary information in flip-flops, and the information remains
valid as long as power is supplied. It has a faster access time and is used in implementing cache
memory.
• Dynamic RAM: It stores the binary information as a charge on the capacitor. It requires
refreshing circuitry to maintain the charge on the capacitors after a few milliseconds. It contains
more memory cells per unit area as compared to SRAM.
4. Secondary Storage
Secondary storage, such as hard disk drives (HDD) and solid-state drives (SSD), is a non-volatile memory
unit that has a larger storage capacity than main memory. It is used to store data and instructions that are
not currently in use by the CPU. Secondary storage has the slowest access time and is typically the least
expensive type of memory in the memory hierarchy.
5. Magnetic Disk
Magnetic disks are simply circular plates fabricated from metal or plastic and coated with a magnetizable
material. Magnetic disks rotate at high speed inside the computer and are frequently used.
6. Magnetic Tape
Magnetic tape is simply a magnetic recording device in which a plastic film is coated with a magnetic
material. It is generally used for the backup of data. In the case of magnetic tape, access is sequential, so
the access time is slower, and some amount of time is required to reach a given position on the strip.
Capacity:
It is the total volume of information the memory can store. As we move from top to bottom in
the hierarchy, the capacity increases.
Access Time:
It is the time interval between the read/write request and the availability of the data. As we move
from top to bottom in the hierarchy, the access time increases.
Performance:
Earlier, when computer systems were designed without a memory hierarchy, the speed gap
between the CPU registers and main memory grew due to the large difference in access time. This
resulted in lower system performance, and thus enhancement was required. This enhancement took the
form of the memory hierarchy design, which increases the performance of the system.
One of the most significant ways to increase system performance is minimizing how far down
the memory hierarchy one has to go to manipulate data.
Cost per bit:
As we move from bottom to top in the Hierarchy, the cost per bit increases i.e. Internal Memory
is costlier than External Memory.
Cache Hits
The processor does not need to know explicitly about the existence of the cache. It simply issues
Read and Write requests using addresses that refer to locations in the memory. The cache control
circuitry determines whether the requested word currently exists in the cache.
If it does, the Read or Write operation is performed on the appropriate cache location. In this case, a read
or write hit is said to have occurred.
Cache Misses
A Read operation for a word that is not in the cache constitutes a Read miss. It causes the block
of words containing the requested word to be copied from the main memory into the cache.
HIT RATIO:
Performance of cache is measured by the number of cache hits to the number of searches. This parameter
of measuring performance is known as the Hit Ratio.
Hit ratio = (Number of cache hits) / (Number of searches)
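For example, a quick computation with hypothetical counts:

```python
# Hit ratio = hits / total searches (the counts below are made up
# purely for illustration).
hits = 950
searches = 1000
hit_ratio = hits / searches
print(f"hit ratio = {hit_ratio:.2f}")       # 0.95
print(f"miss ratio = {1 - hit_ratio:.2f}")  # 0.05
```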
Types of Cache Memory
L1 or Level 1 Cache:
It is the first level of cache memory and is present inside the processor, in a small amount inside
every core of the processor separately. The size of this memory ranges from 2 KB to 64 KB.
L2 or Level 2 Cache:
It is the second level of cache memory, which may be present inside or outside the CPU. If not present
inside the core, it can be shared between two cores, depending upon the architecture, and is connected to
the processor by a high-speed bus. The size of this memory ranges from 256 KB to 512 KB.
L3 or Level 3 Cache:
It is the third level of cache memory, present outside the CPU and shared by all the cores of the CPU.
Some high-end processors may have this cache. It is used to increase the performance of the L2 and L1
caches. The size of this memory ranges from 1 MB to 8 MB.
Cache Mapping:
• Cache mapping defines how a block from the main memory is mapped to the cache memory in case
of a cache miss.
OR
• Cache mapping is a technique by which the contents of main memory are brought into the cache
memory.
There are three different types of mapping used for the purpose of cache memory, which are as
follows:
• Direct mapping
• Associative mapping
• Set-associative mapping
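Since direct mapping is only mentioned in passing here, a minimal sketch of it may help; it assumes a 128-block cache (the size used with Figure 8.18 later in this section), where each main memory block maps to exactly one cache block, given by the block number modulo the number of cache blocks.

```python
# Direct mapping sketch: a main-memory block can go into only one
# cache block, namely (block number) mod (number of cache blocks).
# The 128-block cache size is assumed from Figure 8.18.

CACHE_BLOCKS = 128

def direct_mapped_slot(block_number: int) -> tuple[int, int]:
    slot = block_number % CACHE_BLOCKS  # the one cache block it may occupy
    tag = block_number // CACHE_BLOCKS  # high-order bits stored as the tag
    return slot, tag

# Blocks 0, 128, 256, ... all contend for cache block 0.
for block in (0, 128, 256, 5):
    slot, tag = direct_mapped_slot(block)
    print(f"memory block {block} -> cache block {slot}, tag {tag}")
```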
Associative Mapping
In the associative mapping method, a main memory block can be placed into any cache
block position. In this case, 12 tag bits are required to identify a memory block when it is resident in the
cache. The tag bits of an address received from the processor are compared to the tag bits of each block
of the cache to see if the desired block is present. This is called the associative-mapping technique.
It gives complete freedom in choosing the cache location in which to place the memory block,
resulting in a more efficient use of the space in the cache. When a new block is brought into the cache,
it replaces (ejects) an existing block only if the cache is full. In this case, we need an algorithm to select
the block to be replaced.
To avoid a long delay, the tags must be searched in parallel. A search of this kind is called an
Associative search.
Set-Associative Mapping
Another approach is to use a combination of the direct- and associative-mapping techniques.
The blocks of the cache are grouped into sets, and the mapping allows a block of the main memory to
reside in any block of a specific set. Hence, the contention problem of the direct method is eased by
having a few choices for block placement.
At the same time, the hardware cost is reduced by decreasing the size of the associative search.
An example of this set-associative-mapping technique is shown in Figure 8.18 for a cache with two
blocks per set. In this case, memory blocks 0, 64, 128, . . . , 4032 map into cache set 0, and they can
occupy either of the two block positions within this set.
Having 64 sets means that the 6-bit set field of the address determines which set of the cache
might contain the desired block. The tag field of the address must then be associatively compared to
the tags of the two blocks of the set to check if the desired block is present. This two-way associative
search is simple to implement.
The number of blocks per set is a parameter that can be selected to suit the requirements
of a particular computer. For the main memory and cache sizes in Figure 8.18, four blocks per set can be
accommodated by a 5-bit set field, eight blocks per set by a 4-bit set field, and so on. The extreme
condition of 128 blocks per set requires no set bits and corresponds to the fully-associative technique,
with 12 tag bits. The other extreme, one block per set, is direct mapping.
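A small sketch of this field computation, using the sizes quoted above for Figure 8.18 (4096 main-memory blocks, a 128-block cache, two blocks per set, hence 64 sets):

```python
# Set-associative mapping sketch for the Figure 8.18 sizes:
# 128 cache blocks grouped into 64 sets of 2 blocks each.

CACHE_BLOCKS = 128
BLOCKS_PER_SET = 2
NUM_SETS = CACHE_BLOCKS // BLOCKS_PER_SET  # 64 sets -> 6-bit set field

def set_assoc_fields(block_number: int) -> tuple[int, int]:
    set_index = block_number % NUM_SETS  # which set the block maps into
    tag = block_number // NUM_SETS       # compared associatively within the set
    return set_index, tag

# Memory blocks 0, 64, 128, ..., 4032 all map into set 0.
for block in (0, 64, 128, 4032):
    s, t = set_assoc_fields(block)
    print(f"memory block {block} -> set {s}, tag {t}")
```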
Replacement Algorithms
In a direct-mapped cache, the position of each block is predetermined by its address; hence, the
replacement strategy is trivial. In associative and set-associative caches there exists some flexibility.
When a new block is to be brought into the cache and all the positions that it may occupy are full, the
cache controller must decide which of the old blocks to overwrite.
This is an important issue, because the decision can be a strong determining factor in system
performance. In general, the objective is to keep blocks in the cache that are likely to be referenced in
the near future. But it is not easy to determine which blocks are about to be referenced.
The property of locality of reference in programs gives a clue to a reasonable strategy. Because
program execution usually stays in localized areas for reasonable periods of time, there is a high
probability that the blocks that have been referenced recently will be referenced again soon. Therefore,
when a block is to be overwritten, it is sensible to overwrite the one that has gone the longest time
without being referenced. This block is called the least recently used (LRU) block, and the technique is
called the LRU replacement algorithm.
The LRU algorithm has been used extensively. Although it performs well for many access
patterns, it can lead to poor performance in some cases.
Our cache storage is finite, especially in caching environments where high-performance, expensive
storage is used. In short, we have no choice but to evict some objects and keep others.
Cache replacement algorithms do just that. They decide which objects can stay and which objects should be
evicted.
After reviewing some of the most important algorithms, we go through some of the challenges that we
might encounter.
LRU
The least recently used (LRU) algorithm is one of the most famous cache replacement algorithms and for
good reason!
As the name suggests, LRU keeps the most recently used objects at the top and evicts objects that haven't
been used in a while once the list reaches its maximum capacity.
So it's simply an ordered list where objects are moved to the top every time they're accessed, pushing other
objects down.
LRU is simple and provides a good cache-hit rate for lots of use cases.
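A minimal Python sketch of the idea, using an ordered dictionary as the recency list (illustrative only, not a production cache):

```python
from collections import OrderedDict

class LRUCache:
    """Entries are kept in recency order: front = least recently used."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # miss
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")           # "a" becomes most recently used
cache.put("c", 3)        # evicts "b", the least recently used
print(list(cache.data))  # ['a', 'c']
```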
LFU
The least frequently used (LFU) algorithm works similarly to LRU, except it keeps track of how many times
an object was accessed instead of how recently it was accessed.
Each object has a counter that counts how many times it was accessed. When the list reaches the maximum
capacity, objects with the lowest counters are evicted.
LFU has a famous problem. Imagine an object was repeatedly accessed for a short period only. Its counter
increases by orders of magnitude compared to others, so it becomes very hard to evict this object even if it's
not accessed again for a long time.
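A comparable sketch for LFU, with a per-key counter deciding the eviction victim (names here are illustrative):

```python
class LFUCache:
    """Evicts the key with the lowest access counter when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = {}    # key -> value
        self.counts = {}  # key -> access count

    def get(self, key):
        if key not in self.data:
            return None
        self.counts[key] += 1  # every access bumps the counter
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)  # lowest counter
            del self.data[victim], self.counts[victim]
        self.data[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
```

Note how the problem described above shows up here: a key that was hot for a short burst keeps a large counter and is rarely the minimum, so it lingers even when no longer accessed.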
FIFO
FIFO (first-in-first-out) is also used as a cache replacement algorithm and behaves exactly as you would
expect. Objects are added to the queue and are evicted in the same order. Even though it provides a
simple and low-cost method to manage the cache, even the most used objects are eventually evicted
when they're old enough.
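A minimal FIFO sketch, where a queue records insertion order and accesses do not affect eviction (illustrative only):

```python
from collections import deque

class FIFOCache:
    """Evicts the oldest inserted key, regardless of how often it is used."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = {}
        self.order = deque()  # keys in insertion order

    def put(self, key, value):
        if key not in self.data:
            if len(self.data) >= self.capacity:
                oldest = self.order.popleft()  # first in, first out
                del self.data[oldest]
            self.order.append(key)
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)  # a hit does not change eviction order
```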
Random Replacement
This algorithm randomly selects an object to evict when the cache reaches maximum capacity. It has the
benefit of not keeping any reference or history of objects while being very simple to implement at the
same time.
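And a sketch of random replacement, which keeps no history at all and simply picks a uniform random victim when full:

```python
import random

class RandomCache:
    """Evicts a uniformly random key when full; no history is kept."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = {}

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = random.choice(list(self.data))  # arbitrary victim
            del self.data[victim]
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)
```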
Write Policies
A cache’s write policy is the behavior of the cache while performing a write operation. The write
policy plays a central part in the variety of different characteristics exposed by the cache.
The write operation proceeds in two ways:
• Write- through protocol
• Write-back protocol
Write-through protocol:
Here the cache location and the main memory locations are updated simultaneously.
Write-back protocol:
• This technique updates only the cache location and marks it as modified
with an associated flag bit called the dirty/modified bit.
• The word in the main memory will be updated later, when the block containing
this marked word is to be removed from the cache to make room for a new block.
• To overcome the delay on a read miss, the load-through/early-restart technique is used.
The data is updated only in the cache and written to the memory at a later time. Data is updated in the
memory only when the cache line is ready to be replaced (cache line replacement is done using Bélády’s
optimal algorithm, Least Recently Used (LRU), FIFO, LIFO, and others, depending on the application).
Write-back is also known as write-deferred.
• Dirty Bit: Each block in the cache needs a bit to indicate whether the data present in the cache
has been modified (dirty) or not modified (clean). If it is clean, there is no need to write it into
the memory. This is designed to reduce write operations to memory. If the cache fails, the
system fails, or a power outage occurs, the modified data will be lost, because it is nearly
impossible to restore data from the cache once lost.
• If a write occurs to a location that is not present in the cache (a write miss), there are two
options: write allocation and write around.
Write-Through Protocol vs. Write-Back Protocol: in write-through, every write updates the cache
location and the main memory location simultaneously, keeping both consistent at the cost of more
memory traffic; in write-back, a write updates only the cache location and sets the dirty bit, and main
memory is updated later when the modified block is evicted.
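A small sketch contrasting the two policies on a single cached word; all names here are illustrative assumptions, with a plain dict standing in for main memory:

```python
class CacheLine:
    """One cached word plus the dirty bit used by write-back."""

    def __init__(self, value=0):
        self.value = value
        self.dirty = False  # meaningful only under write-back

def write_through(line: CacheLine, memory: dict, addr: int, value: int):
    line.value = value    # update the cache location...
    memory[addr] = value  # ...and main memory at the same time

def write_back(line: CacheLine, addr: int, value: int):
    line.value = value    # update only the cache location
    line.dirty = True     # mark the block as modified

def evict(line: CacheLine, memory: dict, addr: int):
    if line.dirty:                 # clean blocks need no memory write
        memory[addr] = line.value  # deferred update on replacement
        line.dirty = False
```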