
Computer Organization and Architecture
Unit 3 : Memory Organization

Prepared by:
Prof. Khushbu Chauhan
Computer Engg. Dept.
MPSTME, NMIMS
Outlines
• Introduction: Internal Memory- Memory characteristics and memory hierarchy.
• Cache Memory- Elements of cache design
• Address mapping and translation - Direct mapping
• Address mapping and translation - Associative mapping
• Address mapping and translation - Set associative mapping
• Performance characteristics of two level memory
• Semiconductor main memory- Types of RAM, DRAM and SRAM
• Chip logic
• Memory module organization
• High speed memories- Associative memory
• High speed memories- Interleaved memory.
Introduction : Computer Memory
• Computer memory is an internal or external system that stores data and instructions on a device. It consists of several cells, called memory cells, that each have a unique identification number. The central processing unit (CPU), which reads and executes instructions, selects specific cells to read or write data depending on the task the user is asking the computer to do.
Characteristics of Memory Systems
Location
• The computer memory is placed in three different locations:
a. CPU :- It is in the form of CPU registers and its internal cache memory (16K bytes in the case of the Pentium)
b. Internal :- It is in the main memory of the system, which the CPU can access directly
c. External :- It is in the form of secondary storage devices such as magnetic disks, tapes, etc. The CPU accesses this memory with the help of I/O controllers
Capacity
• It is expressed using two terms: word size and number of words.
a. Word size :- It is expressed in bytes (8 bits). The common word sizes are 8, 16 and 32 bits.
b. Number of words :- This term specifies the number of words available in the particular memory device.
• e.g. If memory capacity is 4K X 8, then its word size is 8 bits and the number of words is 4K = 4096.
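As a quick sketch of this arithmetic (a hypothetical Python snippet, values from the example above):

```python
# Sketch: total capacity from "number of words x word size" (e.g. 4K x 8).
num_words = 4 * 1024    # 4K words
word_size = 8           # bits per word

capacity_bits = num_words * word_size
print(capacity_bits, "bits")          # 32768 bits
print(capacity_bits // 8, "bytes")    # 4096 bytes = 4 KB
```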
Unit of Transfer
• It is the maximum number of bits that can be read from or written into the memory at a time.
• Internal
– Usually governed by data bus width
• External
– Usually a block which is much larger than a word
• Addressable unit
– Smallest location which can be uniquely addressed
– Word internally
– Cluster on Microsoft disks
Access Methods
• Sequential
– Start at the beginning and read through in order
– Access time depends on location of data and previous location
– e.g. tape
• Direct
– Individual blocks have unique address
– Access is by jumping to vicinity plus sequential search
– Access time depends on location and previous location
– e.g. disk
• Random
– Individual addresses identify locations exactly
– Access time is independent of location or previous access
– e.g. RAM
• Associative
– Data is located by a comparison with contents of a portion of the
store
– Access time is independent of location or previous access
– e.g. cache
Performance
• The performance of the memory system is determined using three parameters:
I. Access time: In the case of random-access memory, it is the time taken by the memory to complete a read/write operation from the instant that an address is sent to the memory. For non-random-access memory, access time is the time it takes to position the read-write mechanism at the desired location.
II. Memory cycle time: This term is used only with random-access memory, and it is defined as access time plus any additional time required before a second access can commence.
III. Transfer rate: It is defined as the rate at which data can be transferred into or out of a memory unit.
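A small illustration of how these parameters relate (a sketch with hypothetical values; for random-access memory the transfer rate is roughly 1/(cycle time)):

```python
# Sketch (hypothetical values): performance parameters of a random-access memory.
access_time = 60e-9    # s: from address presented to read/write completed
recovery    = 40e-9    # s: extra time before a second access can commence

cycle_time = access_time + recovery    # memory cycle time
transfer_rate = 1 / cycle_time         # words per second (random-access case)

print(f"cycle time    = {cycle_time * 1e9:.0f} ns")      # 100 ns
print(f"transfer rate = {transfer_rate / 1e6:.0f} M words/s")  # 10 M words/s
```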
Physical Types
• Semiconductor
– RAM
• Magnetic
– Disk & Tape
• Optical
– CD & DVD
• Others
– Bubble
– Hologram
Physical Characteristics
• The two most common physical types used today are semiconductor memory and magnetic surface memory.
• Physical characteristics :-
a. Volatile/Nonvolatile: If a memory can hold data even when power is turned off, it is called nonvolatile memory; otherwise it is called volatile memory.
b. Erasable/Nonerasable: Memories in which data, once programmed, cannot be erased are called nonerasable memories. On the other hand, if the data in the memory is erasable, then the memory is called erasable memory.
Physical Characteristics
• Decay
• Volatility
• Erasable
• Power consumption
Organisation
• Physical arrangement of bits into words
• Not always obvious
• e.g. interleaved
Memory Hierarchy
• Registers
– In CPU
• Internal or Main memory
– May include one or more levels of
cache
– “RAM”
• External memory
– Backing store
Memory Hierarchy
• Registers
• L1 Cache (In CPU Chip)
• L2 Cache (between processor and main memory)
• Main memory
• Disk cache
• Magnetic Disk
• Optical
• Magnetic Tape
• Ideally, computer memory should be fast, large and inexpensive.
• Unfortunately, it is impossible to meet all three of these requirements simultaneously. Increased speed and size are achieved at increased cost.
• A very fast memory system can be achieved if SRAM chips are used.
• These chips are expensive, and hence it is impracticable to build a large main memory using SRAM chips.
• The only alternative is to use DRAM chips for large main memories.
• The processor fetches the code and data from the main memory to execute the program.
• The DRAMs which form the main memory are slower devices, so it is necessary to insert wait states in memory read/write cycles.
• This reduces the speed of execution.
• The solution to this problem exploits the fact that most computer programs work with only small sections of code and data at a particular time.
• In the memory system, a small section of SRAM is added along with main memory, referred to as cache memory.
• The program (code) and data in use at a particular time are usually accessed from the cache memory.
• This is accomplished by loading the active part of code and data from main memory into cache memory.
• The cache controller looks after the swapping between main memory and cache memory with the help of the DMA controller.
• The cache memory just discussed is called secondary cache.
• Recent processors have built-in cache memory called primary cache.
• DRAMs along with cache allow main memories in the range of tens of megabytes to be implemented at reasonable cost and size with better speed performance.
• But the size of memory is still small compared to the demands of large programs with voluminous data.
• A solution is provided by using secondary storage, mainly magnetic disks and magnetic tapes, to implement large memory spaces.
Cache Memory
• Cache memory is designed to combine the memory access time of expensive, high-speed memory with the large memory size of less expensive, lower-speed memory.
• The cache contains a copy of portions of main memory. When the processor attempts to read a word of memory, a check is made to determine if the word is in the cache. If so, the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read into the cache and then the word is delivered to the processor. Because of the phenomenon of locality of reference, when a block of data is fetched into the cache to satisfy a single memory reference, it is likely that there will be future references to that same memory location or to other words in the block.
Cache and Main Memory
Cache/Main Memory Structure
Cache operation – overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from main memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which block of main memory is
in each cache slot
Cache Read Operation
• The processor generates the read address (RA) of a word to be read. If the word is contained in the cache, it is delivered to the processor. Otherwise, the block containing that word is loaded into the cache, and the word is delivered to the processor.
• The figure shows these last two operations occurring in parallel and reflects the typical organization.
Typical Cache Organization
• This is typical of contemporary cache
organizations. In this organization, the
cache connects to the processor via
data, control, and address lines. The
data and address lines also attach to
data and address buffers, which attach
to a system bus from which main
memory is reached. When a cache hit
occurs, the data and address buffers are
disabled and communication is only
between processor and cache, with no
system bus traffic
Elements of cache design
Cache Addressing
• Where does the cache sit?
– Between processor and virtual memory management unit (logical address: generated by the CPU during program execution)
– Between MMU and main memory (physical address: actual address in the main memory where data is stored)
• Logical cache (virtual cache) stores data using virtual addresses
– Processor accesses cache directly, without going through the MMU
– Cache access is faster, before MMU address translation
– Virtual addresses use the same address space for different applications
• Must flush cache on each context switch
• Physical cache stores data using main memory physical addresses
Logical and Physical Caches
• A logical cache, also known as
a virtual cache, stores data
using virtual addresses. The
processor accesses the cache
directly, without going through
the MMU. A physical cache
stores data using main memory
physical addresses.
Cache Size

• The larger the cache, the larger the number of gates involved in
addressing the cache. The result is that large caches tend to be slightly
slower than small ones—even when built with the same integrated
circuit technology and put in the same place on chip and circuit board.
• The available chip and board area also limits cache size. Because the
performance of the cache is very sensitive to the nature of the
workload, it is impossible to arrive at a single “optimum” cache size.
Cache Sizes of Some Processors
Mapping Function
• Because there are fewer cache lines than main memory blocks, an algorithm is needed for mapping main memory blocks into cache lines. Further, a means is needed for determining which main memory block currently occupies a cache line. The choice of the mapping function dictates how the cache is organized. Three techniques can be used: direct, associative, and set-associative.
• Cache of 64 kByte
• Cache block of 4 bytes
– i.e. cache is 16k (2^14) lines of 4 bytes
• 16 MBytes main memory
• 24 bit address
– (2^24 = 16M)
Direct Mapping
• Each block of main memory maps to only one cache line
– i.e. if a block is in cache, it must be in one specific place
• Address is in two parts
• Least significant w bits identify a unique word
• Most significant s bits specify one memory block
• The MSBs are split into a cache line field r and a tag of s-r (most significant)
Direct Mapping
• Step-01
– Every multiplexer reads the line number from the generated physical address using its select lines in parallel.
– To read a line number of L bits, every multiplexer must have L select lines.
• Step-02
– Once the line number has been read, each multiplexer selects the corresponding line in the cache memory through its parallel input lines.
– The number of input lines in each multiplexer = the number of lines present in the cache memory.
• Step-03
– Each multiplexer outputs the tag bit it has selected to the comparator via its output line.
– The number of output lines in each multiplexer = 1.
Direct Mapping Address Structure
• 24 bit address
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
– 8 bit tag (= 22 - 14)
– 14 bit slot or line
• No two blocks in the same line have the same Tag field
• Check contents of cache by finding the line and checking the Tag
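A minimal sketch of this 8/14/2 split for the 64-KByte example cache (field widths from above; the function name is illustrative):

```python
# Sketch: split a 24-bit address into tag / line / word for the example
# cache (4-byte blocks -> 2 word bits, 16k lines -> 14 line bits, 8 tag bits).
WORD_BITS, LINE_BITS = 2, 14

def split_direct(addr: int):
    word = addr & ((1 << WORD_BITS) - 1)
    line = (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag  = addr >> (WORD_BITS + LINE_BITS)
    return tag, line, word

tag, line, word = split_direct(0x16339C)
print(f"tag={tag:02X} line={line:04X} word={word}")   # tag=16 line=0CE7 word=0
```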
Direct Mapping from Main Memory to Cache
Direct Mapping Cache Line Table
Direct Mapping Cache Organization
Direct Mapping Example
Figure shows our example system using direct mapping. In the example, m = 16K = 2^14 and i = j modulo 2^14. The mapping becomes:

Cache Line    Starting Memory Address of Block
0             000000, 010000, …, FF0000
1             000004, 010004, …, FF0004
…             …
2^14 - 1      00FFFC, 01FFFC, …, FFFFFC

Note that no two blocks that map into the same line number have the same tag number. Thus, blocks with starting addresses 000000, 010000, …, FF0000 have tag numbers 00, 01, …, FF, respectively.
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block
– If a program accesses 2 blocks that map to the same line repeatedly,
cache misses are very high
Associative Mapping
• A main memory block can load into any line of cache
• Memory address is interpreted as tag and word
• Tag uniquely identifies block of memory
• Every line's tag is examined for a match
• Cache searching gets expensive

Fully Associative Cache Organization
Associative Mapping Address Structure
• 22 bit tag stored with each 32 bit block of data
• Compare tag field with tag entry in cache to check for hit
• Least significant 2 bits of address identify which byte is required from the 32 bit data block
• e.g.
– Address    Tag       Data       Cache line
– FFFFFC     3FFFFF    24682468   3FFF
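The decode for the fully associative case is simpler still, since there is no line field; a sketch (illustrative names), reusing the slide's 16339C example:

```python
# Sketch: in a fully associative cache, a 24-bit address is just a
# 22-bit tag plus a 2-bit byte number within the 4-byte block.
def split_associative(addr: int):
    word = addr & 0b11    # byte within block
    tag  = addr >> 2      # leftmost (most significant) 22 bits
    return tag, word

tag, word = split_associative(0x16339C)
print(f"tag={tag:06X} word={word}")   # tag=058CE7 word=0
```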
Associative Mapping Example
Figure shows an example using associative mapping. A main memory address consists of a 22-bit tag and a 2-bit byte number. The 22-bit tag must be stored with the 32-bit block of data for each line in the cache. Note that it is the leftmost (most significant) 22 bits of the address that form the tag. Thus, the 24-bit hexadecimal address 16339C has the 22-bit tag 058CE7. This is easily seen in binary notation:

address 16339C = 0001 0110 0011 0011 1001 1100
tag 058CE7 = 00 0101 1000 1100 1110 0111 (leftmost 22 bits)
Associative Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Number of lines in cache = undetermined
• Size of tag = s bits
Set Associative Mapping
• Cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given set
– e.g. Block B can be in any line of set i
• e.g. 2 lines per set
– 2 way associative mapping
– A given block can be in one of 2 lines in only one set
Mapping From Main Memory to Cache: v Associative
Mapping From Main Memory to Cache: k-way Associative
k-Way Set-Associative Cache Organization
Set Associative Mapping Address Structure
• Use set field to determine cache set to look in
• Compare tag field to see if we have a hit
• e.g.
– Address     Tag   Data       Set number
– 1FF 7FFC    1FF   12345678   1FFF
– 001 7FFC    001   11223344   1FFF
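A sketch of the three-field decode, assuming the 9-bit tag / 13-bit set / 2-bit word split of this example (written out in full, the two addresses above are FFFFFC and 00FFFC):

```python
# Sketch: two-way set-associative example -- 9-bit tag, 13-bit set
# number, 2-bit word field in a 24-bit address (8K sets).
WORD_BITS, SET_BITS = 2, 13

def split_set_assoc(addr: int):
    word   = addr & ((1 << WORD_BITS) - 1)
    set_no = (addr >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag    = addr >> (WORD_BITS + SET_BITS)
    return tag, set_no, word

for addr in (0xFFFFFC, 0x00FFFC):        # both map to set 1FFF
    tag, set_no, word = split_set_assoc(addr)
    print(f"addr={addr:06X} tag={tag:03X} set={set_no:04X} word={word}")
```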
Two Way Set Associative Mapping Example
Set Associative Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^s
• Number of lines in set = k
• Number of sets = v = 2^d
• Number of lines in cache = kv = k * 2^d
• Size of tag = (s - d) bits
Replacement Algorithms

• Once the cache has been filled, when a new block is brought into the cache,
one of the existing blocks must be replaced. For direct mapping, there is
only one possible line for any particular block, and no choice is possible. For
the associative and set- associative techniques, a replacement algorithm is
needed.
• To achieve high speed, such an algorithm must be implemented in
hardware. A number of algorithms have been tried. We mention four of the
most common. Probably the most effective is:
• Least recently used (LRU): Replace that block in the set that has been in the cache longest with no
reference to it. For two- way set associative, this is easily implemented. Each line includes a USE
bit. When a line is referenced, its USE bit is set to 1 and the USE bit of the other line in that set is
set to 0.
• When a block is to be read into the set, the line whose USE bit is 0 is used. Because we are
assuming that more recently used memory locations are more likely to be referenced, LRU should
give the best hit ratio. LRU is also relatively easy to implement for a fully associative cache. The
cache mechanism maintains a separate list of indexes to all the lines in the cache. When a line is
referenced, it moves to the front of the list. For replacement, the line at the back of the list is
used. Because of its simplicity of implementation, LRU is the most popular replacement algorithm.
• First-in-first-out (FIFO): Replace the block in the set that has been in the cache longest. FIFO is easily implemented as a round-robin or circular buffer technique. Still another possibility is least frequently used (LFU): Replace the block in the set that has experienced the fewest references. LFU could be implemented by associating a counter with each line. A technique not based on usage (i.e., not LRU, LFU, FIFO, or some variant) is to pick a line at random from among the candidate lines. Simulation studies have shown that random replacement provides only slightly inferior performance to an algorithm based on usage.
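A minimal sketch of the two-way LRU scheme with a USE bit, as described above (the class and names are illustrative, not from the source):

```python
# Sketch: LRU for one two-way set, using the USE-bit idea described above.
class TwoWaySet:
    def __init__(self):
        self.tags = [None, None]   # tags held by line 0 and line 1
        self.mru = 0               # line whose USE bit is 1 (most recently used)

    def access(self, tag):
        if tag in self.tags:              # cache hit
            self.mru = self.tags.index(tag)
            return True
        victim = 1 - self.mru             # the line whose USE bit is 0
        self.tags[victim] = tag           # replace the least recently used line
        self.mru = victim
        return False                      # cache miss

s = TwoWaySet()
print([s.access(t) for t in ("A", "B", "A", "C", "B")])
# [False, False, True, False, False] -- "C" evicts "B" (the LRU line)
```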
Write Policy
• When a block that is resident in the cache is to be replaced, there are two cases to consider. If the old block in the cache has not been altered, then it may be overwritten with a new block without first writing out the old block.
• If at least one write operation has been performed on a word in that line of the cache, then main memory must be updated by writing the line of cache out to the block of memory before bringing in the new block. A variety of write policies, with performance and economic trade-offs, is possible. There are two problems to contend with. First, more than one device may have access to main memory. Second, where multiple processors each have their own cache, a word altered in one cache may invalidate a word in another cache.
Write through
• Using this technique, all write operations are made to main memory as well as to the cache, ensuring that main memory is always valid. Any other processor–cache module can monitor traffic to main memory to maintain consistency within its own cache. The main disadvantage of this technique is that it generates substantial memory traffic and may create a bottleneck.
• All writes go to main memory as well as cache
• Multiple CPUs can monitor main memory traffic to keep local (to CPU) cache up to date
• Lots of traffic
• Slows down writes
Write back
• An alternative technique, known as write back, minimizes memory writes. With write back, updates are made only in the cache. When an update occurs, a dirty bit, or use bit, associated with the line is set. Then, when a block is replaced, it is written back to main memory if and only if the dirty bit is set.
• The problem with write back is that portions of main memory are invalid, and hence accesses by I/O modules can be allowed only through the cache. This makes for complex circuitry and a potential bottleneck. Experience has shown that the percentage of memory references that are writes is on the order of 15%.
• However, for HPC applications, this number may approach 33% (vector-vector multiplication) and can go as high as 50% (matrix transposition).
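A small sketch contrasting the two policies for a single cached block (Python; all names are illustrative):

```python
# Sketch: write-through vs. write-back for one cached word.
main_memory = {0x100: 7}
cache = {0x100: 7}
dirty = {0x100: False}

def write(addr, value, policy="write-back"):
    cache[addr] = value
    if policy == "write-through":
        main_memory[addr] = value    # memory always valid, but more traffic
    else:
        dirty[addr] = True           # defer the memory update

def evict(addr):
    if dirty.get(addr):              # write back only if the dirty bit is set
        main_memory[addr] = cache[addr]
        dirty[addr] = False
    del cache[addr]

write(0x100, 42)           # write-back: main_memory still holds 7 here
evict(0x100)               # now the line is written back
print(main_memory[0x100])  # 42
```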
Line Size
• Retrieve not only the desired word but a number of adjacent words as well
• Increased block size will increase hit ratio at first
– the principle of locality
• Hit ratio will decrease as block becomes even bigger
– Probability of using newly fetched information becomes less than probability of reusing replaced information
• Larger blocks
– Reduce number of blocks that fit in cache
– Data overwritten shortly after being fetched
– Each additional word is less local, so less likely to be needed
• No definitive optimum value has been found
• 8 to 64 bytes seems reasonable
• For HPC systems, 64- and 128-byte lines are most common
Multilevel Caches
• High logic density enables caches on chip
– Faster than bus access
– Frees bus for other transfers
• Common to use both on and off chip cache
– L1 on chip, L2 off chip in static RAM
– L2 access much faster than DRAM or ROM
– L2 often uses separate data path
– L2 may now be on chip
– Resulting in L3 cache
• Bus access, or now on chip…
Virtual Memory
• In most modern computers, the physical main memory is not as large as the address space spanned by an address issued by the processor.
• Here, the virtual memory technique is used to extend the apparent size of the physical memory.
• It uses secondary storage, such as disks, to extend the apparent size of the physical memory.
• When a program does not completely fit into the main memory, it is divided into segments.
• The segments which are currently being executed are kept in the main memory and the remaining segments are stored in secondary storage devices, such as a magnetic disk.
• If an executing program needs a segment which is not currently in the main memory, the required segment is copied from the secondary storage device.
• When a new segment of a program is to be copied into main memory, it must replace another segment already in the memory.
• In modern computers, the operating system moves programs and data automatically between the main memory and secondary storage.
• Techniques that automatically swap program and data blocks between main memory and secondary storage devices are called virtual memory management.
• The addresses that the processor issues to access either instructions or data are called virtual or logical addresses.
• These addresses are translated into physical addresses by a combination of hardware and software components.
• If a virtual address refers to a part of the program or data space that is currently in the main memory, then the contents of the appropriate location in the main memory are accessed immediately.
• On the other hand, if the referenced address is not in the main memory, its contents must be brought into a suitable location in the main memory before they can be used.
• Virtual memory is a storage allocation scheme in which secondary memory can be addressed as though it were part of the main memory.
• The addresses a program may use to refer to memory are distinguished from the addresses the memory system uses to identify physical storage sites, and program-generated addresses are translated automatically to the corresponding machine addresses.
• The size of virtual storage is limited by the addressing scheme of the computer system and the amount of secondary memory available, not by the actual number of main storage locations.
• It is a technique that is implemented using both hardware and software. It maps memory addresses used by a program, called virtual addresses, into physical addresses in computer memory.
• All memory references within a process are logical addresses that are dynamically translated into physical addresses at run time. This means that a process can be swapped in and out of the main memory such that it occupies different places in the main memory at different times during the course of execution.
• A process may be broken into a number of pieces, and these pieces need not be contiguously located in the main memory during execution. The combination of dynamic run-time address translation and use of a page or segment table permits this.
• If these characteristics are present, then it is not necessary that all the pages or segments be present in the main memory during execution; the required pages are loaded into memory whenever required.
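As a minimal sketch of the run-time translation just described (page size and table contents are hypothetical):

```python
# Sketch: translating a virtual address through a page table (4 KB pages).
PAGE_SIZE = 4096
page_table = {0: 5, 1: 2, 2: None}   # virtual page -> physical frame (None = on disk)

def translate(vaddr: int) -> int:
    vpage, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table.get(vpage)
    if frame is None:
        raise RuntimeError("page fault: bring the page in from secondary storage")
    return frame * PAGE_SIZE + offset

print(hex(translate(0x1234)))   # virtual page 1 -> frame 2 -> 0x2234
```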
Semiconductor Main Memory
• The basic element of a semiconductor memory is the memory
cell.
• Main memory consists of DRAMs supported with SRAM cache.
• These are semiconductor memories.
• The semiconductor memories are classified as shown in the figure.
Semiconductor Memory Types
DRAM
• RAM technology is divided into two technologies: Dynamic and Static.

• A Dynamic RAM (DRAM) is made with cells that store data as charge on
capacitors. The presence or absence of charge in a capacitor is interpreted
as a binary 1 or 0. Because capacitors have a natural tendency to discharge,
dynamic RAMs require periodic charge refreshing to maintain data storage.
• The term dynamic refers to this tendency of the stored charge to leak away,
even with power continuously applied
DRAM
• Figure is a typical DRAM structure for an
individual cell that stores one bit. The address
line is activated when the bit value from this
cell is to be read or written. The transistor acts
as a switch that is closed (allowing current to
flow) if a voltage is applied to the address line
and open (no current flows) if no voltage is
present on the address line.
DRAM
• For the write operation, a voltage signal is applied to the bit line; a high voltage represents 1, and a low
voltage represents 0. A signal is then applied to the address line, allowing a charge to be transferred to the
capacitor.

• For the read operation, when the address line is selected, the transistor turns on and the charge stored on
the capacitor is fed out onto a bit line and to a sense amplifier. The sense amplifier compares the capacitor
voltage to a reference value and determines if the cell contains a logic 1 or a logic 0. The readout from the
cell discharges the capacitor, which must be restored to complete the operation.

• Although the DRAM cell is used to store a single bit (0 or 1), it is essentially an analog device. The
capacitor can store any charge value within a range; a threshold value determines whether the charge is
interpreted as 1 or 0.
SRAM
• Static RAM (SRAM) is a digital device that uses the same logic elements used in the processor. In a SRAM, binary values are stored using traditional flip-flop logic-gate configurations. A static RAM will hold its data as long as power is supplied to it.
• The figure shows a typical SRAM structure for an individual cell. Four transistors (T1, T2, T3, T4) are cross-connected in an arrangement that produces a stable logic state. In logic state 1, point C1 is high and point C2 is low; in this state, T1 and T4 are off and T2 and T3 are on. In logic state 0, point C1 is low and point C2 is high; in this state, T1 and T4 are on and T2 and T3 are off. Both states are stable as long as the direct current (dc) voltage is applied. Unlike the DRAM, no refresh is needed to retain data.
SRAM versus DRAM
• Both static and dynamic RAMs are volatile; that is, power must be continuously supplied to the memory to preserve the bit values. A dynamic memory cell is simpler and smaller than a static memory cell. Thus, a DRAM is more dense (smaller cells = more cells per unit area) and less expensive than a corresponding SRAM.
• On the other hand, a DRAM requires the supporting refresh circuitry. For larger memories, the fixed cost of the refresh circuitry is more than compensated for by the smaller variable cost of DRAM cells. Thus, DRAMs tend to be favored for large memory requirements.
• A final point is that SRAMs are somewhat faster than DRAMs. Because of these relative characteristics, SRAM is used for cache memory (both on and off chip), and DRAM is used for main memory.
ROM
• ROM (Read Only Memory) is a read-only memory; we cannot write data into it.
• It is non-volatile memory, i.e. it can hold data even if power is turned off. Generally, ROM is used to store the binary codes for the sequence of instructions you want the computer to carry out, and data such as look-up tables.
• This is because this type of information does not change. It is important to note that although we give the name RAM to static and dynamic read/write memory devices, that does not mean that ROMs are not random-access devices.
• In fact, most ROMs are accessed randomly with unique addresses. There are four types of ROM: Masked ROM, PROM, EPROM and EEPROM.
Chip Logic
• For semiconductor memories, one of the key design issues is the number of bits of data that may be read/written at a time. At one extreme is an organization in which the physical arrangement of cells in the array is the same as the logical arrangement of words in memory. The array is organized into W words of B bits each.
• For example, a 16-Mbit chip could be organized as 1M 16-bit words. At the other extreme is the so-called 1-bit-per-chip organization, in which data are read/written one bit at a time. We will illustrate memory chip organization with a DRAM; ROM organization is similar, though simpler.
Typical 16-Mbit DRAM (4M * 4)
• Figure shows a typical organization of a 16-Mbit DRAM. In this case, 4 bits are read or written at a time. Logically, the memory array is organized as four square arrays of 2048 by 2048 elements. Various physical arrangements are possible. In any case, the elements of the array are connected by both horizontal (row) and vertical (column) lines. Each horizontal line connects to the Select terminal of each cell in its row; each vertical line connects to the Data-In/Sense terminal of each cell in its column.
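The 2048 x 2048 array implies an 11-bit row address and an 11-bit column address, typically multiplexed over the same address pins (RAS, then CAS). A sketch of the split (illustrative):

```python
# Sketch: 4M addresses per 2048 x 2048 array -> 11 row bits + 11 column
# bits, usually multiplexed over the same 11 address pins.
ROW_BITS = COL_BITS = 11

def row_col(addr: int):
    row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)
    col = addr & ((1 << COL_BITS) - 1)
    return row, col

print(row_col(0x123456))   # 22-bit address -> (582, 1110)
```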


Memory module organization
• If a RAM chip contains only one bit per word, then clearly we will need at
least a number of chips equal to the number of bits per word.
• This organization works as long as the size of memory equals the number of
bits per chip. In the case in which larger memory is required, an array of
chips is needed.
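A hedged arithmetic sketch of the chip count (not a specific design): a memory of W words x B bits built from chips of w words x b bits needs (W/w) x (B/b) chips.

```python
# Sketch: how many chips to build a memory of `words` x `bits`
# from chips organized as `chip_words` x `chip_bits`?
def chips_needed(words, bits, chip_words, chip_bits):
    return (words // chip_words) * (bits // chip_bits)

print(chips_needed(256 * 1024, 8, 256 * 1024, 1))    # 8: one chip per bit of the word
print(chips_needed(1024 * 1024, 8, 256 * 1024, 1))   # 32: a 4 x 8 array of chips
```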
High speed memories - Associative memory
• Many data-processing applications require searching for items in a table stored in memory. They use object names or numbers to identify the location of the named or numbered object within a memory space.
• For example, an account number may be searched in a file to determine the holder's name and account status. To search for an object, the number of accesses to memory depends on the location of the object and the efficiency of the search algorithm.
• The time required to find an object stored in memory can be reduced considerably if objects are selected based on their contents, not on their locations.
• A memory unit accessed by content is called an associative memory or content addressable memory (CAM). This type of memory is accessed simultaneously and in parallel on the basis of data content rather than by specific address or location.

Block diagram - Associative memory
• The figure shows the block diagram of an associative memory. It consists of a memory array with match logic for m n-bit words and associated registers. The argument register (A) and key register (K) each have n bits per word. Each word in memory is compared in parallel with the contents of the argument register.
• The key register provides a mask for choosing a particular field or bits in the argument word. The words that match the word stored in the argument register set the corresponding bits in the match register.
• Therefore, reading can be accomplished by a sequential access to memory for those words whose corresponding bits in the match register have been set.
• Only those bits in the argument register having 1's in their corresponding positions of the key register are compared.
• For example, if the argument register A and the key register K have the bit configuration shown below, only the three rightmost bits of A are compared with memory words, because K has 1's in these positions.


• Most of the time the CPU accesses consecutive memory locations. In such situations the addresses will be to different modules.
• Since these modules can be accessed in parallel, the average access time of fetching a word from the main memory can be reduced.
• The low-order k bits of the memory address are generally used to select a module, and the high-order m bits are used to access a particular location within the selected module.
• In this way, consecutive addresses are located in successive modules. Thus, any component of the system that generates requests for access to consecutive memory locations can keep several modules busy at once, speeding up the memory system as a whole.
High speed memories - Interleaved memory
• Main memory is composed of a collection of DRAM memory chips. A number of chips can be grouped together to form a memory bank. It is possible to organize the memory banks in a way known as interleaved memory. Each bank is independently able to service a memory read or write request, so that a system with K banks can service K requests simultaneously, increasing memory read or write rates by a factor of K.
• If consecutive words of memory are stored in different banks, then the transfer of a block of memory is speeded up.
256-KByte Memory Organization
• A 256-KByte memory organization can be built using memory chips, address lines, and data lines. The memory chips can be organized into memory banks, and the address lines specify which addresses to access.
• Memory chips
• Address lines
• Data lines
Why do we use Memory Interleaving?
• Whenever the processor requests data from the main memory, a block (chunk) of data is transferred to the cache and then to the processor.
• So whenever a cache miss occurs, the data has to be fetched from the main memory. But main memory is relatively slower than the cache, so to improve the access time of the main memory, interleaving is used.
• We can access all four modules at the same time, thus achieving parallelism. It is a technique for compensating for the relatively slow speed of DRAM (Dynamic RAM).
• In this technique, the main memory is divided into memory banks which can be accessed individually without any dependency on the others.
• For example: if we have 4 memory banks (4-way interleaved memory), each containing 256 bytes, then the block-oriented scheme (no interleaving) will assign virtual addresses 0 to 255 to the first bank and 256 to 511 to the second bank. But in interleaved memory, virtual address 0 will be in the first bank, 1 in the second bank, 2 in the third bank, 3 in the fourth, and then 4 in the first bank again.
• Hence, the CPU can access alternate sections immediately without waiting for memory to be cached. There are multiple memory banks which take turns supplying data.
• Memory interleaving is a technique for increasing memory speed. It is a process that makes the system more efficient, fast and reliable.
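A sketch of the low-order interleaving just described, with 4 banks (k = 2 select bits; addresses hypothetical):

```python
# Sketch: 4-way low-order interleaving. The low-order k bits select the
# bank; the high-order bits select the location within that bank.
K_BITS = 2                    # 2^2 = 4 banks

def module_map(addr: int):
    bank = addr & ((1 << K_BITS) - 1)
    offset = addr >> K_BITS
    return bank, offset

for a in range(6):
    print(a, "->", module_map(a))
# Addresses 0..3 land in banks 0..3, and address 4 wraps back to bank 0:
# consecutive addresses sit in successive banks, so they can be
# accessed in parallel.
```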


Discussion…
Thank You
