
Computer organization and

architecture
By :
Andargie M.
Unit- one

Introduction to Computer
Organization and Architecture
Computer Organization and Architecture

• Computer Architecture
• Those attributes of the system that are visible to a programmer – the attributes that have a direct impact on the logical execution of a program:
• • Instruction sets
• • Data representation – number of bits used to represent data
• • Input/Output mechanisms
• • Memory addressing techniques
• Computer Organization
• The operational units and their interconnections that realize the architectural specifications – the hardware attributes that are transparent to the programmer:
• • Control signals
• • Interfaces between the computer and peripherals
• • Memory technology
Structure and Function

• Structure – the way in which components relate to each other.
• Function – the operation of individual components as part of the structure.
• The four basic functions of a computer are:
• Processing data
• Storing data
• Moving data between the computer and the outside world
• Controlling the operations above
Structure
• Structural components
• Central processing unit (CPU) – controls the operation of the computer and
performs its data processing functions.
• Main memory – stores data.
• Input/Output (I/O) – moves data between the computer and its external
environment.
• System interconnections – some mechanism that provides for communication
among CPU, main memory, and I/O.
• The CPUs major structural components are as
follows:
• • Control unit – controls the operation of the CPU
and hence the computer.
• • Arithmetic and logic unit (ALU) – performs the
computer’s data processing functions.
• • Registers – provides storage internal to the CPU.
• • CPU interconnections – some mechanism that
provides for communication among the control unit,
ALU, and registers.
A View of Computer Function and
Interconnection
• An instruction cycle consists of an instruction fetch,
followed by zero or more operand fetches, followed
by zero or more operand stores, followed by an
interrupt check (if interrupts are enabled)
• The major computer system components (processor, main memory, I/O modules) need to be interconnected in order to exchange data and control signals. The most popular means of interconnection is the use of a shared system bus consisting of multiple lines.
Computer Components

• The von Neumann architecture is based on three key concepts:
• Data and instructions are stored in a single read-
write memory.
• The contents of this memory are addressable by
location, without regard to the type of data
contained there.
• Execution occurs in a sequential fashion (unless
explicitly modified) from one instruction to the
next.
Computer Components
• Main components
Computer Components
• CPU – combination of the instruction interpreter (control unit) and the general-purpose arithmetic and logic functions (ALU)
• Input/output module – used to enter data and
instructions and to report results
• Main memory module – used to store both
data and instructions
Computer Components
• The CPU exchanges data with memory. The CPU typically makes use of two internal
registers: a memory address register (MAR), which specifies the address in memory
for the next read or write, and a memory buffer register (MBR), which contains the
data to be written into memory or receives the data read from memory.

• Similarly, an I/O address register (I/OAR) specifies a particular I/O device. An I/O
buffer register (I/OBR) is used for the exchange of data between an I/O module and
the CPU.
• A memory module consists of a set of locations, defined by sequentially numbered
addresses. Each location contains a binary number that can be interpreted as
either an instruction or data.

• An I/O module transfers data from external devices to the CPU and memory, and vice versa. It contains internal buffers for temporarily holding these data until they can be sent on.
Computer Function
• In the simplest form, instruction processing consists of two steps: the processor reads
(fetches) instructions from memory one at a time and executes each instruction.
• The processing required for a single instruction is called an instruction cycle. An instruction
cycle is shown below:

• Program execution halts only if the machine is turned off, some sort of unrecoverable error
occurs, or a program instruction that halts the computer is encountered.
Instruction Fetch and Execute

• The processor fetches an instruction from memory – program counter (PC) register holds the
address of the instruction to be fetched next
• The processor increments the PC after each instruction fetch so that it will fetch the next
instruction in the sequence – unless told otherwise
• The fetched instruction is loaded into the instruction register (IR) in the processor – the
instruction contains bits that specify the action the processor will take.
• The processor interprets the instruction and performs the required action. In general, these actions fall into four categories:

• Processor-memory
• – Data transferred to or from the processor to memory
• Processor-I/O
– Data transferred to or from a peripheral device by transferring between the processor and an
I/O module
• Data processing
– The processor performs some arithmetic or logic operation on data
• Control
– An instruction may specify that the sequence of execution be altered.
The figure below illustrates a partial program execution, showing
the relevant portions of memory and processor registers.
It adds the contents at addresses 940 and 941 and stores the
result at address 941.
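The two-step fetch-execute cycle can be traced with a small Python sketch of a program like this one. The symbolic opcodes, the sample values 3 and 2, and the dictionary used for memory are illustrative assumptions, not the machine encoding used in the figure:

```python
# A hypothetical three-instruction program that adds M[940] and M[941]
# and stores the result at 941, as in the example above.
memory = {
    300: ("LOAD", 940),    # AC <- M[940]
    301: ("ADD", 941),     # AC <- AC + M[941]
    302: ("STORE", 941),   # M[941] <- AC
    940: 3,
    941: 2,
}

pc, ac = 300, 0
while pc in (300, 301, 302):
    ir = memory[pc]        # fetch: instruction into the instruction register
    pc += 1                # increment the program counter
    op, addr = ir          # decode the fetched instruction
    if op == "LOAD":       # execute: one of three actions
        ac = memory[addr]
    elif op == "ADD":
        ac += memory[addr]
    elif op == "STORE":
        memory[addr] = ac

print(memory[941])   # 3 + 2 = 5
```

Each loop iteration is one instruction cycle: a fetch, a PC increment, and an execute step, exactly the sequence the text describes.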
Instruction Fetch and Execute
• More detail of instruction cycle
Interconnection Structures

• The collection of paths connecting the various modules is called the interconnection structure
• Memory
o Consists of N words of equal length
o Each word assigned a unique numerical address (0, 1, …,
N-1)
o A word of data can be read or written
o Operation specified by control signals
o Location specified by address signals
• I/O Module
o Similar to memory from the computer's viewpoint
o Consists of M external device ports (0, 1, …, M-1)
o External data paths for input and output
o Sends interrupt signal to the processor
• Processor
o Reads in instructions and data
o Writes out data after processing
o Uses control signals to control overall
operation of the system
o Receives interrupt signals
Interconnection Structures

• The interconnection structure must support the following types of transfers:
• Memory to processor: processor reads an instruction or a unit of
data from memory.
• Processor to memory: processor writes a unit of data to memory.
• I/O to processor: processor reads data from an I/O device via an
I/O module.
• Processor to I/O: processor sends data to the I/O device via an
I/O module.
• I/O to or from memory: an I/O module is allowed to exchange
data directly with memory, without going through the processor,
using direct memory access (DMA).
Bus Interconnection

• A bus is a communication pathway connecting two or more devices.
• Multiple devices can be connected to the same bus
at the same time.
• Typically, a bus consists of multiple communication
pathways, or lines.
• Each line is capable of transmitting signals
representing binary 1 or binary 0.
• A bus that connects major computer components
(processor, memory, I/O) is called a system bus.
Bus Structure
• Typically, a bus consists of 50 to hundreds of separate lines.
• On any bus the lines are grouped into three main function
groups: data, address, and control.
• There may also be power distribution lines for attached
modules.
Bus Structure
• Data lines
• o Path for moving data and instructions between modules.
• o Collectively called the data bus.
• o Consists of 8, 16, 32, 64, etc. bits – a key factor in overall system performance.
• Address lines
• o Identify the source or destination of the data on the data bus.
• o e.g., the CPU needs to read an instruction or data from a given memory location.
• o Bus width determines the maximum possible memory capacity for the system.
• o e.g., the 8080 has 16-bit addresses, giving access to 64K addresses.
• Control lines
• o Used to control the access to and the use of the data and address lines.
• o Transmit command and timing information between modules.
Multiple-Bus Hierarchies

• If a great number of devices are connected to the bus, performance will suffer.
• Propagation delays
o Long data paths mean that coordination of the bus
use can adversely affect performance.
• If aggregate data transfer approaches capacity
o Increasing data rate or making the bus wider may
help.
• Most computer systems use multiple buses, generally
laid out in a hierarchy.
Bus architecture
Bus architecture
Bus width

• The width of the data bus has an impact on system performance: the wider the data bus, the greater the number of bits transferred at one time.
• The width of the address bus has an impact on
system capacity: The wider the address bus,
the greater the range of locations that can be
referenced.
Data Transfer Type

• Multiplexed
o Single shared bus for both addresses and data
o Bus first used to specify an address
o Bus then used to transfer data
• Dedicated
o Separate buses for both address and data
o The address is specified on the address bus and
remains while data transferred
o The data is transferred on the data bus
PCI (Peripheral Component Interconnect)

• It is a 32-bit bus which extends the processor's own local bus, and can be expanded to 64 bits when the need arises.
• The PCI bus system is able to support about 10 devices, because PCI devices do not electrically load down the CPU bus.
• The PCI bus system can transfer data at a rate of roughly 133 MB per second at 33 MHz.
• The PCI bus is a high-performance connection between the motherboard components and expansion boards of a system.
• There is a bridge chip between the processor and the PCI bus, which connects the PCI bus to the processor's local bus. This allows PCI peripherals to be connected directly to the PCI bus.

• Once a host bridge is included in the system, the processor can access all available PCI peripherals. This makes the PCI bus standard processor independent. When a new processor is to be used, only the bridge chip needs to be replaced; the rest of the system remains unchanged.

• The PCI bus employs a 124-pin, Micro Channel-style connector (188 pins for 64-bit systems).
• The PCI specification defines two types of connection: 5 V systems and 3.3 V low-power systems.
• The PCI design has the ability to support future generations of peripherals.
PCI (Peripheral Component Interconnect)
USB – Universal Serial Bus

• It is a high-speed serial bus.
• Its data transfer rate is higher than that of a serial port.
• It allows several devices to be interfaced to a single port.
– It supports interfaces for a wide range of peripherals such as monitors, keyboards, mice, modems, speakers, microphones, scanners, printers, etc.
• It provides power lines along with data lines.
• A USB cable contains 4 wires:
– 2 wires are used to supply electrical power to peripherals.
– 2 wires are used to send data and commands.
Unit-II

INFORMATION REPRESENTATION
Computer information representation
• Inside the computer, there are integrated circuits with
thousands of transistors.
• These transistors are made to operate in a two-state (on/off) fashion. By this design, all the input and output voltages are either HIGH or LOW.
• Low voltage represents binary 0 and high voltage represents
binary 1.
• In Computer the data is represented in binary format (1s and
0s).
• Even though we use characters, decimals, punctuation marks,
symbols and graph, internally these things are represented in
binary format.
Introduction To Number System

• The decimal number system is important because it is universally used to represent quantities outside a digital system. The decimal system uses the digits 0 to 9 and has base 10.
• Binary numbers are based on the concept of ON or OFF. Binary number
system has a base of 2. Its two digits are denoted by 0 & 1 and are
called bits.
• Octal Number system uses exactly eight symbols 0,1,2,3,4,5,6, and 7.
i.e., it has a base of 8. Each octal digit has a unique 3 bit binary
representation.
• The hexadecimal number system has base 16. Its digits span 0 to 9 and then A to F, where A, B, C, D, E and F represent 10, 11, 12, 13, 14 and 15. Hexadecimal numbers are more convenient for people to recognize and interpret than long strings of binary numbers.
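These relationships are easy to check in Python, whose built-in conversions cover all four bases (a quick sketch, nothing machine-specific):

```python
# Convert decimal 13 into the other three number systems.
n = 13
print(bin(n), oct(n), hex(n))   # 0b1101 0o15 0xd

# And back: int() parses a string in any given base.
assert int("1101", 2) == 13
assert int("15", 8) == 13
assert int("D", 16) == 13

# Each octal digit maps to a unique 3-bit group, each hex digit to 4 bits.
assert int("111", 2) == int("7", 8)
assert int("1111", 2) == int("F", 16)
```

The last two assertions are the reason octal and hexadecimal serve as compact shorthand for binary.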
Number system

Decimal   Binary Code   Octal   Hexadecimal
0         0000          0       0
1         0001          1       1
2         0010          2       2
3         0011          3       3
4         0100          4       4
5         0101          5       5
6         0110          6       6
7         0111          7       7
8         1000          10      8
9         1001          11      9
10        1010          12      A
11        1011          13      B
12        1100          14      C
13        1101          15      D
14        1110          16      E
15        1111          17      F
Floating Point Representation
• A floating point binary number is represented like a decimal one except that it uses base 2 for the exponent.
• Example: the binary number +1001.11 is represented with an 8-bit fraction and a 6-bit exponent as follows:
Fraction Exponent
01001110 000100
• The fraction has a 0 in the leftmost position to denote positive.
• The exponent is the equivalent of binary +4. The floating point number is therefore
• m x 2^e = +(.1001110)2 x 2^+4
• A floating point number is said to be normalized if the most significant digit of the mantissa is nonzero.
• Normalized numbers provide the maximum possible precision for the floating point number.
Floating Point Representation

• In this method, a real number is expressed as a combination of a mantissa and an exponent.
• The mantissa is kept less than 1 and greater than or equal to 0.1, and the exponent is the power of 10.
• For example, the decimal number +4567.89 is represented in floating point with a fraction and an exponent as follows:
Fraction Exponent
+0.456789 +04  scientific notation is +0.456789 x 10^+4
 m x r^e
• While storing numbers, the leading digit in the mantissa is always made non-zero by appropriately shifting it and adjusting the value of the exponent, i.e., .004567  0.4567 x 10^-2  0.4567E-2
• This shifting of the mantissa to the left until its most significant digit is non-zero is called Normalization.
• Only the mantissa m and the exponent e are physically represented in the register (including their signs). A floating point binary number is represented in a similar manner except that it uses base 2 for the exponent.
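The normalization step described above can be sketched for decimal mantissa/exponent pairs. The function name and the tuple representation are illustrative assumptions, not part of any real register format:

```python
def normalize(mantissa, exponent):
    """Shift the mantissa until 0.1 <= |mantissa| < 1, adjusting the exponent.

    The pair represents the value mantissa * 10**exponent (base r = 10).
    """
    if mantissa == 0:
        return 0.0, 0
    while abs(mantissa) >= 1:     # too large: shift right, raise the exponent
        mantissa /= 10
        exponent += 1
    while abs(mantissa) < 0.1:    # leading zeros: shift left, lower the exponent
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent

print(normalize(0.004567, 0))    # roughly (0.4567, -2), i.e. 0.4567E-2
print(normalize(4567.89, 0))     # roughly (0.456789, 4)
```

Both results match the worked examples: .004567 becomes 0.4567 x 10^-2 and +4567.89 becomes +0.456789 x 10^+4 (up to floating-point rounding).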
Fixed Point Representation

• To represent negative integers, we need a notation for negative values.
• Because of hardware limitations, computers must represent everything with 1's and 0's, including the sign of a number.
• As a consequence, it is customary to represent the sign with a bit placed
in the leftmost position of the number.
• There are two ways of specifying the position of the binary point in a
register by giving it a fixed position or by employing a floating point
representation.
• The fixed point method assumes that the binary point is always fixed in
one position. The two positions most widely used are
-A binary point in the extreme left of the register to make the stored
number a fraction, and
-A binary point in the extreme right of the register to make the stored
number an integer.
Fixed Point Representation
• Representation of number as an integer:
• When an integer binary number is positive, the sign is represented
by 0 and the magnitude by a positive binary number.
• When the number is negative, the sign is represented by 1 but the rest of the number may be represented in one of three possible ways:
-Signed magnitude representation
-Signed 1’s complement representation
-Signed 2’s complement representation
• The negative number is represented in either the 1’s or 2’s
complement of its positive value.
• It's customary to use 0 for the + sign and 1 for the – sign. Therefore -001, -010 and -011 are coded as 1001, 1010 and 1011.
Fixed Point Representation
• Complement
• Sign-magnitude numbers are easy to understand, but they require too much hardware for addition and subtraction.
• This has led to the widespread use of complements for binary arithmetic.
• For instance, if A = 0111  the 1's complement is Ā = 1000
• The 2's complement is defined as the new word obtained by adding 1 to the 1's complement. As an equation:
A' = Ā + 1 where A' = 2's complement
Ā = 1's complement
Here is an example. If A = 0111
the 2's complement is A' = 1001
• Viewed as a binary count, the 2's complement is the next reading after the 1's complement.
• Another example: if A = 0000 1000
then Ā = 1111 0111
1  (adding 1)
and A' = 1111 1000
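For a fixed word size of n bits, both complements are one mask away in Python. This is a sketch; Python integers are unbounded, so the 4- and 8-bit widths must be imposed explicitly:

```python
def ones_complement(a, bits):
    """Flip every bit within an n-bit word."""
    return ~a & ((1 << bits) - 1)

def twos_complement(a, bits):
    """1's complement plus one, kept within n bits."""
    return (ones_complement(a, bits) + 1) & ((1 << bits) - 1)

# A = 0111 -> 1's complement 1000, 2's complement 1001
print(format(ones_complement(0b0111, 4), "04b"))      # 1000
print(format(twos_complement(0b0111, 4), "04b"))      # 1001

# A = 0000 1000 -> 2's complement 1111 1000
print(format(twos_complement(0b00001000, 8), "08b"))  # 11111000
```

The outputs reproduce both worked examples above.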
Fixed Point Representation
• Example: Consider the signed number 14 stored in an 8 bit register.
• +14 is represented by a sign bit of 0 in the leftmost position
followed by the binary equivalent of 14  00001110.
• Note that each of the eight bits of the register must have a value
and, therefore, 0’s must be inserted in the most significant
positions following the sign bit.
• Although there is only one way to represent +14, there are three
different ways to represent -14 with eight bits.
Sign bit
In signed magnitude representation 1 0001110
In signed 1’s complement representation 1 1110001
In signed 2’s complement representation 1 1110010
Arithmetic Addition:

• The addition of two numbers in the signed-magnitude system follows the


rules of ordinary arithmetic.
• If the signs are same, we add the two magnitudes and give the sum the
common sign.
• If the signs are different, we subtract the smaller magnitude from the larger and give the result the sign of the larger magnitude.
• In the signed 2's complement system, we instead add the two numbers, including their sign bits, and discard any carry out of the sign bit position.
• Numerical examples for addition are shown below. Note that negative
numbers must initially be in 2’s complement and that if the sum obtained
after the addition is negative, it is in 2’s complement form.
+6 00000110 -6 11111010
+13 00001101 +13 00001101
+19 00010011 +7 00000111
Arithmetic subtraction:
• Subtraction of two signed binary numbers when negative numbers are in
2’s complement form is very simple.
• The complement of a negative number in complement form produces the
equivalent positive number.
• Ex: (-6) – (-13) = +7
• In binary with 8 bits this is written as 11111010 – 11110011.
• The subtraction is changed to addition by taking the 2's complement of (-13) to give (+13).
• i.e., 11110011  2's complement is  0000 1101
• In binary this is 11111010 + 00001101 = 100000111.
• Removing the end carry, we obtain the correct answer 00000111 (+7)
• It is worth noting that binary numbers in the signed 2’s complement
system are added and subtracted by the same basic addition and
subtraction rules as unsigned numbers.
• Therefore, computers need only one common hardware circuit to handle
both types of arithmetic.
Overflow:

• When two numbers of n digits each are added and the sum occupies n+1 digits, we say
that an overflow occurred.
• An overflow is a problem in digital computers because the width of registers is finite.
• A result that contains n+1 bits cannot be accommodated in a register with a standard
length of n bits.
• For this reason, many computers detect the occurrence of an overflow, and when it
occurs, a corresponding flip-flop is set which can then be checked by the user.
• An overflow may occur if the two numbers added are both positive or both negative.
• Example: Two signed binary numbers, +70 and +80, are stored in two 8-bit registers.
carries : 0 1 carries: 1 0
+70 0 1000110 -70 1 0111010
+80 0 1010000 -80 1 0110000
+150 1 0010110 -150 0 1101010
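The worked examples above (signed addition, subtraction by 2's complement, and overflow) can all be replayed with one 8-bit helper. The function is an assumption for illustration, but the carry rule it checks is the one stated in the text:

```python
def add8(x, y):
    """Add two 8-bit 2's complement words, discarding the end carry.

    Overflow occurs when both operands share a sign that the sum does not,
    i.e. when the carries into and out of the sign bit differ.
    """
    s = (x + y) & 0xFF                          # keep 8 bits, drop end carry
    overflow = (x >> 7) == (y >> 7) != (s >> 7)
    return s, overflow

NEG6  = 0b11111010    # -6  in 2's complement
NEG13 = 0b11110011    # -13 in 2's complement

assert add8(0b00000110, 0b00001101) == (0b00010011, False)   # +6 + 13 = +19
assert add8(NEG6, 0b00001101) == (0b00000111, False)         # -6 + 13 = +7

# Subtraction: (-6) - (-13) = (-6) + (+13), via the 2's complement of -13
plus13 = (~NEG13 + 1) & 0xFF
assert add8(NEG6, plus13) == (0b00000111, False)             # +7

# +70 + 80 = 150 does not fit in 8 signed bits
print(add8(70, 80))   # (150, True): bits 10010110, read as -106, with overflow
```

The overflow flag is exactly the flip-flop condition described above: both carry patterns from the +70/+80 and -70/-80 columns trip it.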
Representation Of Numbers, Operand Of Code And Address

• Gray code: This is an unweighted code, meaning that no specific weights are assigned to the bit positions. This code is not suitable for arithmetic operations, but it is very useful for input-output devices, analog-to-digital conversion, etc., and for machines which use shaft encoders as sensors.
• Note: The leftmost gray code digit is the same as the leftmost binary digit. Then, moving from left to right, add each adjacent pair of binary digits to get the next gray code digit, neglecting any carries.
• Ex: 1010 = ?
• (1) (2) (3) (4)
1 0 1 0
• Add (1) + (2)  1 + 0 = 1, Add (2) + (3)  0 + 1 = 1, and Add (3) + (4)  1 + 0 = 1, so the result is 1111  gray code
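Adding adjacent bit pairs while ignoring carries is exactly XOR, so binary-to-Gray conversion is a one-liner (a sketch with illustrative function names):

```python
def binary_to_gray(b):
    """Each Gray bit is the XOR of adjacent binary bits: g = b ^ (b >> 1)."""
    return b ^ (b >> 1)

def gray_to_binary(g):
    """Invert the conversion by cascading the XOR in from the left."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

print(format(binary_to_gray(0b1010), "04b"))   # 1111, as in the example
assert gray_to_binary(0b1111) == 0b1010        # and back again
```

The same function reproduces the whole comparison table below, e.g. binary 0011 maps to Gray 0010.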
Comparison of decimal, binary and gray codes (4-bit Gray code)

Decimal Binary Code Gray Code


0 0000 0000
1 0001 0001
2 0010 0011
3 0011 0010
4 0100 0110
5 0101 0111
6 0110 0101
7 0111 0100
8 1000 1100
9 1001 1101
BCD(Binary –Coded decimal)

• It is based on the idea of converting each decimal digit into its equivalent binary number.
• Each decimal digit is represented by a binary code of 4 bits.
• Devices such as electronic calculators, digital voltmeters, frequency counters, electronic counters, digital clocks, etc. work with BCD numbers.
• BCD codes have also been used in early computers.
• Modern computers do not use BCD numbers as they have to process names and other non-numeric data.
BCD(Binary –Coded decimal)
Decimal BCD
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
10 0001 0000
11 0001 0001
12 0001 0010
….. ……
77 0111 0111
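Since BCD encodes each decimal digit separately, the table above can be generated digit by digit (the function name is an illustrative assumption):

```python
def to_bcd(n):
    """Encode a non-negative integer as BCD: one 4-bit group per decimal digit."""
    return " ".join(format(int(digit), "04b") for digit in str(n))

print(to_bcd(9))    # 1001
print(to_bcd(12))   # 0001 0010
print(to_bcd(77))   # 0111 0111
```

Note that 12 in BCD (0001 0010) differs from 12 in plain binary (1100); that gap is the subject of the hexadecimal-versus-BCD comparison that follows.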
Hexadecimal versus BCD

• The hexadecimal system utilizes the full capacity of four binary bits, whereas BCD codes do not.
• The BCD codes do not utilize the binary codes from 1010 to
1111.
• In the hexadecimal system an 8 bit word can represent up to
FF, that is 11111111 (255 decimal) whereas in BCD only up to
10011001 (99 decimal).
• Hence the hexadecimal is a compact form of representation,
and it occupies less memory space, thereby reducing the
hardware cost.
• The arithmetic operations are also simpler in hexadecimal
system.
Alpha numeric Code: ASCII (American Standard Code for Information Interchange)

• It is the standardized alphanumeric code most widely used by computer manufacturers as their computer's internal code.
• It is a 7-bit alphanumeric code which has 2^7 = 128 different codes, enough to encode 95 printable characters (52 lowercase and uppercase letters, 10 numerals, and 33 punctuation and other symbols as marked on keyboards, printers, video displays, etc.) plus control characters.
• ASCII codes can be classified into 2 types:
• ASCII-7: 7-bit code, which represents 2^7 = 128 different characters. In 7 bits, the first 3 bits represent the zone and the next 4 bits represent the digit.
• ASCII-8: 8-bit code, which represents 2^8 = 256 different characters. In 8 bits, the first 4 bits represent the zone and the next 4 bits represent the digit.
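Python's ord and chr expose these 7-bit codes directly, and the zone/digit split is just the top 3 and bottom 4 bits (a quick sketch):

```python
# 7-bit ASCII: 2**7 = 128 codes.
assert ord("A") == 0b1000001   # zone 100, digit 0001
assert ord("a") == 0b1100001   # lowercase differs from uppercase in one zone bit
assert ord("0") == 0b0110000   # decimal digits live in zone 011

def zone_digit(ch):
    """Split a 7-bit ASCII code into its 3-bit zone and 4-bit digit fields."""
    code = ord(ch)
    return code >> 4, code & 0xF

print(zone_digit("A"))   # (4, 1): zone 100, digit 0001
```

This bit split is what makes the ASCII-7 table below readable as a grid of zone columns against digit rows.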
ASCII-7 code for characters

b3b2b1b0 (LSBs) rows against b6b5b4 (MSBs) columns:

b3b2b1b0   010     011   100   101   110   111
0000       SPACE   0     @     P     `     p
0001       !       1     A     Q     a     q
0010       "       2     B     R     b     r
0011       #       3     C     S     c     s
0100       $       4     D     T     d     t
0101       %       5     E     U     e     u
0110       &       6     F     V     f     v
0111       '       7     G     W     g     w
1000       (       8     H     X     h     x
1001       )       9     I     Y     i     y
1010       *       :     J     Z     j     z
1011       +       ;     K     [     k     {
1100       ,       <     L     \     l     |
1101       -       =     M     ]     m     }
1110       .       >     N     ^     n     ~
1111       /       ?     O     _     o     DEL
The 8-bit ASCII-8 codes…

Character   Zone   Digit   Hexadecimal Equivalent
0           0101   0000    50
1           0101   0001    51
2           0101   0010    52
…           …      …       …
A           1010   0001    A1
B           1010   0010    A2
C           1010   0011    A3
…           …      …       …
Instruction Formats And Types

• A computer will usually have a variety of instruction code formats. It is the function
of the control unit within the CPU to interpret each instruction code and provide the
necessary control functions needed to process the instruction.
• The bits of the instruction are divided into groups called fields. The most common
fields found in instruction formats are:
 An operation code field that specifies the operation to be performed
 An address field that designates a memory address or a processor register
 A mode field that specifies the way the operand or the Effective Address is
determined.
• Operations specified by computer instructions are executed on some data stored in
memory or processor registers.
• Operands residing in memory are specified by their memory address.
• Operands residing in processor registers are specified with a register address.
• A register address is a binary number of k bits that defines one of 2^k registers in the CPU. Thus a CPU with 16 processor registers R0 through R15 will have a register address field of 4 bits.
Instruction Formats And Types
• Most computers fall into one of 3 types of CPU
organizations:
Single accumulator organization
General register organization
Stack organization
Single accumulator organization

• All operations are performed with an implied accumulator register.
• The instruction format in this type of computer uses one
address field.
• For example, the instruction that specifies an arithmetic
addition is defined by an assembly language instruction as
ADD X
• Where X is the address of the operand. The ADD instruction in this case results in the operation AC  AC + M[X].
• AC is the accumulator register and M[X] symbolizes the
memory word located at address X
General register organization
• The instruction format in this type of computer needs three register address fields.
ADD R1, R2, R3 to denote the operation R1  R2 + R3.
• The number of address fields in the instruction can be reduced from 3 to 2 if
the destination register is the same as one of the source registers.
ADD R1, R2 denote the operation R1  R1 + R2.
• Only register addresses for R1 and R2 need be specified in this instruction.
• Computers with multiple processor registers use the move instruction with a
mnemonic MOV to symbolize a transfer instruction.
• MOV R1 , R2 denotes the transfer R1  R2
• General register type computers employ two or three address fields.
• ADD R1, X
• Would specify the operation R1  R1 + M[X]. It has two address fields, one
for register R1 and the other for the memory address X.
Stack organization
• Computers with stack organization have PUSH and POP instructions which require an address field.
PUSH X will push the word at address X to the top of the stack.
• The stack pointer is updated automatically.
• Operation-type instructions do not need an address field in stack-organized computers because the operation is performed on the two items that are on top of the stack,
i.e., ADD
• In a stack computer, such an instruction consists of an operation code only, with no address field. The operation has the effect of popping the two top numbers from the stack, adding them, and pushing the sum onto the stack.
• There is no need to specify operands with an address field since all operands are implied to be on the stack.
Instruction Formats And Types
• Some computers combine features from more than
one organizational structure
• For example, the Intel 8080 microprocessor has 7 CPU registers, one of which is an accumulator register.
• The processor has some of the characteristics of a general register type and some of the characteristics of an accumulator type. Moreover, the Intel 8080 processor has a stack pointer and instructions to push and pop from a memory stack.
Instruction Formats And Types
• To illustrate the influence of the number of addresses on
computer programs, we will evaluate the arithmetic statement
• X = (A + B) * (C + D)
• We will use the symbols ADD, SUB, MUL and DIV for the 4
arithmetic operations.
• MOV for the transfer type operation
• LOAD and STORE for transfers to and from memory and AC
register.
• We will assume that the operands are in memory addresses
A,B,C, and D, and the result must be stored in memory at
address X
Instruction Types
• Three Address Instructions:
• ADD R1, A, B ie., R1  M[A] + M[B]
• ADD R2, C, D i.e., R2  M[C] + M[D]
• MUL X, R1, R2 i.e., M[X]  R1 * R2
• It is assumed that the computer has two processor registers, R1, and R2. The symbol
M[A] denotes the operand at memory address symbolized by A.

• Two Address Instructions:
• MOV R1, A ie., R1  M[A]
• ADD R1, B i.e., R1  R1 + M[B]
• MOV R2, C ie., R2  M[C]
• ADD R2, D i.e., R2  R2 + M[D]
• MUL R1, R2 i.e., R1  R1 * R2
• MOV X, R1 i.e., M[X]  R1
Instruction Types
• One Address Instructions:
• LOAD A ie., AC  M[A]
• ADD B i.e., AC  AC + M[B]
• STORE T i.e., M[T]  AC
• LOAD C ie., AC  M[C]
• ADD D i.e., AC  AC + M[D]
• MUL T i.e., AC  AC * M[T]
• STORE X i.e., M[X]  AC
• All operations are done between the AC register and a memory operand. T is the address of a temporary
memory location required for storing the intermediate result.

• Zero Address Instructions:

• A stack-organized computer does not use an address field for the instructions ADD and MUL. The PUSH and POP instructions, however, need an address field to specify the operand that communicates with the stack.

• PUSH A i.e., TOS  A
• PUSH B i.e., TOS  B
• ADD i.e., TOS  ( A + B )
• PUSH C i.e., TOS  C
• PUSH D i.e., TOS  D
• ADD i.e., TOS  ( C + D )
• MUL i.e., TOS  ( C + D ) * ( A + B )
• POP X i.e., M[X]  TOS
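The zero-address sequence above can be traced with a small stack-machine sketch; the memory dictionary and the sample operand values are assumptions for illustration:

```python
# Evaluate X = (A + B) * (C + D) on a stack machine; sample values assumed.
memory = {"A": 2, "B": 3, "C": 4, "D": 5}
stack = []

def push(addr): stack.append(memory[addr])   # PUSH addr: TOS <- M[addr]
def pop(addr):  memory[addr] = stack.pop()   # POP addr:  M[addr] <- TOS
def add():      stack.append(stack.pop() + stack.pop())   # zero-address ADD
def mul():      stack.append(stack.pop() * stack.pop())   # zero-address MUL

push("A"); push("B"); add()   # TOS = A + B
push("C"); push("D"); add()   # TOS = C + D
mul()                         # TOS = (C + D) * (A + B)
pop("X")                      # M[X] <- TOS

print(memory["X"])   # (2 + 3) * (4 + 5) = 45
```

Note that ADD and MUL carry no address at all: their operands are implied to be the top two stack items, exactly as the text describes.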
Addressing Modes

• An operand reference in an instruction either contains the actual


value of the operand or a reference to the address of the operand.
• Once we have determined the number of addresses contained in an
instruction, the addressing mode must be determined
• The manner in which each address field specifies memory location is
called addressing mode which can be:
 Immediate
 Direct
 Indirect
 Register
 Register indirect
 Displacement
 Stack
Addressing Modes
• The Following Notation
A = contents of an address field in the
instruction
R = contents of an address field in the
instruction that refers to a register
EA = actual (effective) address of the location
containing the referenced operand
(X) = contents of memory location X or register X
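Using this notation, the effective-address rules of the modes that follow can be compared side by side. The sample memory and register contents below are assumptions chosen so each mode yields a different result:

```python
# Sample state: memory indexed by address, regs by register number.
memory = [0] * 16
memory[5] = 9        # direct:   EA = A = 5   -> operand memory[5]
memory[9] = 42       # indirect: EA = (A) = 9 -> operand memory[9]
regs = {1: 9}        # register 1 holds the value 9

A, R = 5, 1          # address field A and register field R of an instruction

operand_immediate = A                    # immediate:         operand = A
operand_direct    = memory[A]            # direct:            EA = A
operand_indirect  = memory[memory[A]]    # indirect:          EA = (A)
operand_register  = regs[R]              # register:          operand = (R)
operand_reg_ind   = memory[regs[R]]      # register indirect: EA = (R)
ea_displacement   = A + regs[R]          # displacement:      EA = A + (R)

print(operand_immediate, operand_direct, operand_indirect)   # 5 9 42
print(operand_register, operand_reg_ind, ea_displacement)    # 9 42 14
```

Counting the memory subscripts in each line also gives the memory-reference cost each mode's advantage/disadvantage bullets refer to: zero for immediate and register, one for direct and register indirect, two for indirect.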
Immediate Mode


Algorithm: operand = A

– The operand is contained within the instruction itself
– Data is a constant at run time
• Advantage: No additional memory references are required after the fetch of the instruction itself
• Disadvantage: Size of the operand (and thus its range of values) is limited
Direct mode


Algorithm: EA = A

– The address field of the instruction contains the effective address of the operand
– No calculations are required
– Address is a constant at run time but the data itself can be changed during program execution
• Advantage: Only one additional memory access is required to fetch the operand
• Disadvantage: Address range is limited by the width of the field that contains the address reference
Indirect addressing

Algorithm: EA = (A)
– The address field in the instruction specifies a memory location which contains the address of the data
– Two memory accesses are required
» The first to fetch the effective address
» The second to fetch the operand itself
• Advantage: Range of effective addresses is equal to 2^n, where n is the width of the memory data word, i.e., a large address space.
• Disadvantage: Instruction execution requires two memory references to fetch the operand
Register-based addressing modes

Algorithm: EA = R
– Register addressing: like direct, but the address field specifies a register location
– No memory reference
– Faster access to data, smaller address fields in the instruction word
Register-Indirect addressing modes

Algorithm: EA = ( R )
– Register indirect: like indirect, but address field
specifies a register that contains the effective address
– Advantage: large address space
– Disadvantage: extra memory reference
Displacement or address relative addressing

• Algorithm: EA = A + (R)
– Two address fields in the instruction are used
» One is an explicit address reference
» The other is a register reference
Displacement or address relative addressing
• – Relative addressing:
• » A is added to the program counter contents to cause a
branch operation in fetching the next instruction
• – Base-register addressing:
• » A is a displacement added to the contents of the
referenced “base register” to form the EA
• » Used by programmers and O/S to identify the start of
user areas, segments, etc. and provide accesses within
them
• -Indexing:
• » Indexing is used within programs for accessing data
structures
STACK

• Algorithm: EA = top of stack


• A stack is a linear array of locations. Items are appended to the top of the stack so that,
at any given time, the block is partially filled.
• Associated with the stack is a pointer whose value is the address of the top of the stack.
• The top two elements of the stack may be in processor registers, in which case the stack
pointer references the third element of the stack. The stack pointer is maintained in a
register.
• The stack mode of addressing is a form of implied addressing. The machine instructions
need not include a memory reference but implicitly operate on the top of the stack.
• Advantage : No memory reference
• Disadvantage : Limited applicability
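The effective-address rules above can be collected into a small simulation. This is an illustrative sketch, not any real instruction set; the memory contents, register values, and function names below are invented for the example.

```python
# Toy illustration of effective-address (EA) calculation for the
# addressing modes described above. Values are made up for the demo.
memory = {96: 9, 100: 7, 200: 100, 300: 55}   # address -> contents
regs = {"R1": 300, "R2": 4}

def immediate(operand):
    return operand                      # operand is in the instruction itself

def direct(addr):
    return memory[addr]                 # EA = A: one memory access

def indirect(addr):
    return memory[memory[addr]]         # EA = (A): two memory accesses

def register_indirect(reg):
    return memory[regs[reg]]            # EA = (R): one memory access

def displacement(addr, reg):
    return memory[addr + regs[reg]]     # EA = A + (R)

print(immediate(42))            # 42
print(direct(100))              # 7
print(indirect(200))            # memory[100] -> 7
print(register_indirect("R1"))  # memory[300] -> 55
print(displacement(96, "R2"))   # memory[96+4] -> 7
```

Note how indirect mode costs two dictionary lookups where direct mode costs one, mirroring the memory-reference counts discussed above.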
Timing cycle

[Figure: master clock generator driving the sequence counter, with increment, clear (CLR), and clock inputs]
Timing cycle
• The timing for all registers in the basic computer is
controlled by a master clock generator.
• The clock pulses are applied to all flip-flops and
registers in the system.
• The clock pulses do not change the state of a register
unless the register is enabled by a control signal.
• The control signals are generated in the control unit
and provide control inputs for the multiplexers in the
common bus, control inputs in processor registers
and micro-operations for the accumulator
Timing cycle
• There are two major types of control
organization:
• Hardwired organization : the control logic is
implemented with gates, flip-flops, decoders
and other digital circuits.
• Micro programmed organization: the control
information is stored in a control memory. The
control memory is programmed to initiate the
required sequence of micro-operations.
Timing cycle
• The block diagram of the control unit consists of two decoders, a
sequence counter and a number of control logic gates.
• An instruction read from memory is placed in the instruction register (IR).
• IR is divided into three parts: the I bit (bit 15), the operation code (bits 12
through 14), and the address part (bits 0 through 11).
• The operation code in bits 12 through 14 are decoded with a 3 x 8
decoder.
• The eight outputs of the decoder are designated by the symbols D0
through D7.
• Bit 15 of the instruction is transferred to a flip-flop designated by the
symbol I.
• Bits 0 through 11 are applied to the control logic gates.
• The 4 bit sequence counter can count in binary from 0 through 15.
Timing cycle
• The outputs of the counter are decoded into 16 timing signals T0
through T15.
• The sequence counter SC can be incremented or cleared
synchronously.
• Most of the time, the counter is incremented to provide the
sequence of timing signals out of the 4 x 16 decoder.
• Once in a while, the counter is cleared to 0, causing the next active
timing signal to be T0.
• As an example consider the case where SC is incremented to provide
timing signals T0,T1,T2,T3 and T4 in sequence.
• At time T4, SC is cleared to 0 if decoder output D3 is active. This is
expressed symbolically by the statement
• D3T4 : SC ← 0
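The increment-or-clear behavior of SC can be sketched in a few lines. This is a behavioral model of the counter, not the gate-level circuit; `clock_pulse` is a hypothetical helper name for the example.

```python
# Minimal model of the 4-bit sequence counter (SC) described above:
# each clock pulse normally increments SC, and the statement
# D3T4 : SC <- 0 clears it when decoder output D3 is active at T4.
SC = 0

def clock_pulse(D3_active):
    global SC
    T = SC                      # current timing signal T0..T15
    if D3_active and T == 4:    # D3T4 : SC <- 0
        SC = 0
    else:
        SC = (SC + 1) % 16      # wraps after T15
    return T

signals = [clock_pulse(D3_active=True) for _ in range(6)]
print(signals)   # [0, 1, 2, 3, 4, 0] -> T0..T4, then cleared back to T0
```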
Unit-three

DIGITAL LOGIC
Logic gates
• In 1854 George Boole invented symbolic logic.
• This is known today as Boolean algebra.
• Each variable in Boolean algebra has either of two values. - True or False.
• The purpose of this two-state algebra was to solve logic problems.
• Since computers think in terms of binary (high or low), it is appropriate to say
logic is the core of computers.
• To implement this logic circuits are used.
• In the logic circuits Gates are used.
• A Gate is a small circuit with one or more input signals but only one
output signal.
• Gates are digital circuits (two-state).
• There are several gates such as OR, AND, Inverter, etc. Each has a symbol to
represent it.
AND Gates

• Y=A.B
OR Gates

Y=A+B
NOT or Inverter

• Y = A′ (the output is the complement of the input A)
NAND Gate

• Y = (A·B)′
NOR Gate

• Y = (A+B)′


Exclusive OR Gate

• Y = A XOR B
• Y = A ⊕ B (⊕ is the Ex-OR sign)
Exclusive NOR Gate or XNOR
• Y = (A ⊕ B)′ ; the output is 1 when both inputs are equal
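The gates above can be modeled directly with Python's bitwise operators; this is a behavioral truth-table sketch, not a circuit description.

```python
# Truth-table model of the basic gates, using ints 0/1 for logic levels.
AND  = lambda a, b: a & b
OR   = lambda a, b: a | b
NOT  = lambda a: 1 - a
NAND = lambda a, b: NOT(AND(a, b))   # complement of AND
NOR  = lambda a, b: NOT(OR(a, b))    # complement of OR
XOR  = lambda a, b: a ^ b
XNOR = lambda a, b: NOT(a ^ b)       # 1 when inputs are equal

# Print the combined truth table for all two-input gates.
print("A B  AND OR NAND NOR XOR XNOR")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "", AND(a, b), OR(a, b), NAND(a, b),
              NOR(a, b), XOR(a, b), XNOR(a, b))
```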
Boolean Algebra

• Boolean Algebra is a mathematical system for
formulating logical statements with symbols so that
problems can be solved in a manner similar to ordinary
algebra.
• Boolean algebra is the mathematics of digital
systems
• A basic knowledge of Boolean algebra is required
for the study and analysis of logic circuits.
• It is a convenient and systematic way of expressing
and analyzing the operations of logic circuits.
Boolean Algebra
Karnaugh Map Method

• The Karnaugh map method is a graphical technique for
simplifying Boolean functions. It is a two-dimensional form of
a Truth Table. It provides a simpler method for
minimizing logic expressions. The map method is ideally
suited for four or fewer variables.
• A Karnaugh map for n variables is made up of 2^n squares.
Each square designates a product term of a Boolean
expression. For product terms which are present in the
expression, 1s are written in the corresponding squares;
0s are written in those squares which correspond to
product terms not present in the expression.
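A grouping read off a Karnaugh map can always be verified exhaustively, since n variables give only 2^n input rows. The function used below (A′B + AB, which two adjacent squares reduce to the single literal B) is an invented example, not one from the text.

```python
from itertools import product

# On a Karnaugh map the adjacent minterms A'B and AB combine into
# the single literal B. Verify the simplification over all inputs.
def original(a, b):
    return int((not a) and b or (a and b))   # A'B + AB

def simplified(a, b):
    return b                                 # B

equivalent = all(original(a, b) == simplified(a, b)
                 for a, b in product((0, 1), repeat=2))
print(equivalent)   # True
```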
LIST OF LOGIC GATE ICS

• Different Integrated Circuit (IC)
technologies are used to implement the basic
logic gates.
They are CMOS (Complementary Metal – Oxide
Semiconductor), TTL (Transistor – Transistor
Logic) and ECL (Emitter – Coupled Logic).
Flip flops
• The flip flop is a bistable device.
• It exists in one of two states and, in the
absence of input, remains in that state.
• Thus, the flip flop can function as a 1 bit
memory.
• The flip flop has two outputs, which are
always the complements of each other;
these are generally labelled Q and Q′
The S-R Flip flop
• The circuit has two inputs, S (Set) and R (Reset), and two outputs Q and Q′, and
consists of two NOR gates connected in a feedback arrangement.
• First, Let us show that the circuit is bistable.
• Assume that both S and R are 0 and that Q is 0. The inputs to the lower NOR gate
are Q = 0 and S = 0.
• Thus, the output Q′ = 1 means that the inputs to the upper NOR gate are Q′ = 1
and R = 0, which has the output Q = 0.
• Thus, the state of the circuit is internally consistent and remains stable as long as
S = R = 0.
• A similar line of reasoning shows that the state Q = 1, Q′ = 0 is also stable for R =
S = 0.
• Thus, this circuit can function as a 1-bit memory. Suppose that S changes to the
value 1.
• Now the inputs to the lower NOR gate are S = 1, Q = 0. After some time delay t,
the output of the lower NOR gate will be Q′ = 0.
The S-R Flip flop
The S-R Flip flop
• So, at this point in time, the inputs to the upper NOR gate
become R = 0, Q′ = 0.
• After another gate delay of t, the output Q becomes 1. This is
again a stable state.
• The inputs to the lower gate are now S = 1, Q = 1, which
maintain the output Q′ = 0. As long as S = 1 and R = 0, the outputs
will remain Q = 1, Q′ = 0.
• Furthermore, if S returns to 0, the outputs will remain
unchanged.
• Observe that the inputs S = 1, R = 1 are not allowed, because
these would produce an inconsistent output (both Q and Q′
equal 0).
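The feedback reasoning above can be checked with a small behavioral model of the two cross-coupled NOR gates, iterated until the outputs settle (each iteration stands in for one gate delay). The function names are invented for this sketch.

```python
# Behavioral model of the NOR-based S-R latch described above.
def nor(a, b):
    return int(not (a or b))

def sr_latch(S, R, Q=0, Qn=1):
    # Iterate a few gate delays so the feedback loop settles.
    for _ in range(4):
        # Upper gate sees (R, Q'); lower gate sees (S, Q).
        Q, Qn = nor(R, Qn), nor(S, Q)
    return Q, Qn

print(sr_latch(S=1, R=0))              # set:   (1, 0)
print(sr_latch(S=0, R=1))              # reset: (0, 1)
print(sr_latch(S=0, R=0, Q=1, Qn=0))   # hold previous state: (1, 0)
```

Running the disallowed input S = R = 1 through this model drives both outputs toward 0, which is exactly the inconsistency noted above.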
Clocked S-R Flip-Flop
• The output of the S-R latch changes, after a brief time delay, in response to a
change in the input.
• This is referred to as asynchronous operation.
• More typically, events in the digital computer are synchronized to a clock
pulse, so that changes occur only when a clock pulse occurs.
• This type of device is referred to as a clocked S-R flip-flop.
• Note that the R and S inputs are passed to the NOR gates only during the
clock pulse.
D Flip – Flop
• The D flip flop is sometimes referred to as the data flip flop because it is, in
effect, storage for one bit of data.
• The output of the D flip flop is always equal to the most recent value applied to
the input.
• Hence, it remembers and produces the last input.
• It is also referred to as the delay flip flop, because it delays a 0 or 1 applied to
its input for a single clock pulse.
J-K Flip-Flop:
• Like the S-R flip flop, it has 2 inputs.
• However, in this case all possible combinations of input
values are valid.
• In its characteristic table, we can note that the first
three combinations are the same as for the S-R flip-flop.
• With no input, the output is stable.
• The J input alone performs a set function, causing the
output to be 1;
• the K input alone performs a reset function, causing the
output to be 0.
• When both J and K are 1, the function performed is
referred to as the toggle function: the output is
reversed.
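The characteristic behavior just described can be written as a next-state function evaluated on each simulated clock pulse. The helper name `jk_next` is invented for the sketch.

```python
# Characteristic behavior of the J-K flip-flop described above.
def jk_next(Q, J, K):
    if J == 0 and K == 0:
        return Q          # no change: output is stable
    if J == 0 and K == 1:
        return 0          # K alone: reset
    if J == 1 and K == 0:
        return 1          # J alone: set
    return 1 - Q          # J = K = 1: toggle (output reversed)

Q = 0
Q = jk_next(Q, J=1, K=0); print(Q)  # 1 (set)
Q = jk_next(Q, J=0, K=0); print(Q)  # 1 (hold)
Q = jk_next(Q, J=1, K=1); print(Q)  # 0 (toggle)
Q = jk_next(Q, J=0, K=1); print(Q)  # 0 (reset)
```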
Latches

• A flip flop in its simplest form is called a latch.


• A latch stores a binary bit 1 or 0.
• The unclocked simple flip-flops and D flip flops fall under the
category of latches.
• An n-bit latch consists of n-number single bit latches.
• It stores a binary word of n bits.
• The n-bits of the binary word are transferred to the latch
simultaneously in parallel.
• In a latch there is no facility to read its contents.
• The latches are temporary storage devices.
• They are ideally suited for storing information between processing
units and I/O units or indicator units.
Registers

• A register is a digital circuit used within the
CPU to store one or more bits of data.
• Two basic types of registers are commonly
used: Parallel registers and shift registers.
Parallel Registers:

• A parallel register consists of a set of 1-bit
memories that can be read or written
simultaneously.
• It is used to store data.
• The 8 bit register of figure illustrates the operation
of a parallel register using D flip flops.
• A control signal, labelled load, controls writing into
the register from signal lines, D11 through D18.
• These lines might be the output of multiplexers, so
that data from a variety of sources can be loaded
into the register.
Parallel Registers:
Shift Registers:
• A Shift register accepts and / or transfers information serially.
• Data are input only to the leftmost flip-flop.
• With each clock pulse, data are shifted to the right one position, and the
rightmost bit is transferred out.
• Shift registers can be used to interface to serial I/O devices.
• In addition, they can be used within the ALU to perform logical shift and
rotate functions .
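The serial behavior above can be sketched with a list standing in for the flip-flops. This is a behavioral model; the bit stream and function name are invented for illustration.

```python
# Sketch of a 4-bit right-shift register: data enter at the leftmost
# flip-flop, and the rightmost bit is shifted out on each clock pulse.
def shift_right(register, serial_in):
    out = register[-1]                       # rightmost bit shifted out
    register = [serial_in] + register[:-1]   # everything moves one place right
    return register, out

reg = [0, 0, 0, 0]
for bit in [1, 0, 1, 1]:                     # serial input stream
    reg, out = shift_right(reg, bit)
print(reg)    # [1, 1, 0, 1] -- the input stream now held in the register
```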
Buffers/Drivers and counters
• The function of a buffer/driver is to increase the output current/voltage
ratings.
• When the output current of a digital device is insufficient to drive another
device which is to be connected to the output terminal of the device, a buffer
is employed to amplify the current.
• A buffer is a current amplifier.
• For an inverting buffer a bubble is placed at the output point of the triangle.
• Sometimes, an increased voltage is required to drive relays, lamps etc., the
device is said to be a buffer only when the manufacturers optimize the design
for high current output.
• For example, IC 7426 is a quad 2-input NAND buffer. Its NAND gates are
optimized for high current output.
Counters
• A counter is a register whose value is easily incremented by 1
modulo the capacity of the register.
• Thus, a register made up of n flip flops can count up to 2^n − 1.
• When the counter is incremented beyond its maximum value, it is
set to 0.
• An example of a counter in the CPU is the program counter.
• Counters can be designated as asynchronous or synchronous,
depending on the way in which they operate.
• Asynchronous counters are relatively slow because the output of
one flip flop triggers a change in the status of the next flip flop.
• In synchronous counter, all of the flip flops change state at the
same time.
• This type is much faster, it is the kind used in CPUs.
Ripple Counter:

• An asynchronous counter is also referred to as
a ripple counter, because the change that
occurs to increment the counter starts at one
end and ripples through to the other end.
Ripple Counter:
• The counter is incremented with each clock pulse.
• The J and K inputs to each flip flop are held at a
constant 1. This means that, when there is a clock
pulse, the output at Q will be inverted ( 1 to 0; 0
to 1).
• Note that the change in state is shown as
occurring with the falling edge of the clock pulse;
this is known as an edge-triggered flip flop.
• If one looks at patterns of output for this counter,
it can be seen that it cycles through 0000,
0001 ...1110, 1111, 0000 and so on.
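The counting sequence just described is simply incrementing modulo 2^4. The sketch below models that behavior only, not the flip-flop-level ripple of carries.

```python
# A 4-bit counter modeled as a value incremented modulo 2**4;
# like the counter above, it cycles 0000, 0001, ..., 1111, 0000.
N_BITS = 4

def increment(count):
    return (count + 1) % (2 ** N_BITS)   # wraps to 0 past the maximum

count = 0b1110
count = increment(count); print(format(count, '04b'))  # 1111
count = increment(count); print(format(count, '04b'))  # 0000 (wraps to zero)
```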
Synchronous counters:
• The ripple counter has the disadvantage of the delay involved in
changing value.
• To overcome this disadvantage, CPUs make use of synchronous
counters, in which all of the flip-flops of the counter change at the
same time.
• For a 3 bit counter, three flip flops will be needed.
• Let us use JK flip-flops.
• Label the uncomplemented output of the three flip flops A, B,C
respectively, with C representing the least significant bit.
Digital Multiplexers/Data selectors
• A digital multiplexer has N inputs and only one output.
• By applying control signals anyone input can be made available at the output
terminal.
• It is also called data selector.
• The control signals are applied to the select lines to select the desired input.
• Examples :
• IC 74150 – 1 of 16 Data selectors /Multiplexers
• IC 74152 – 1 of 8 Data selectors/Multiplexers
Digital Demultiplexers/Decoders

• A digital demultiplexer has 1 input and N outputs.


• The meaning of demultiplexer is one into many.
• By applying control signals the input signal can be made available at
anyone of output terminals.
• It performs reverse operation of a multiplexer.
• Example : IC 74154 is a 1-to-16 demultiplexer / 4-to-16 decoder.
Unit IV :

Memory Organization
MEMORY HIERARCHY

 The hierarchical arrangement of storage in current computer
architectures is called the memory hierarchy.
 Types of memory
• Semiconductor memory
• Magnetic memory
• Optical memory
 The memory unit that communicates directly with the CPU is
called the Main Memory (or Primary memory).
 Devices that provide backup storage are called auxiliary
Memory (or Secondary).
 Only programs and data currently needed by the processor
reside in Main memory.
 All other information is stored in Auxiliary memory and
transferred to main memory when needed.
MEMORY HIERARCHY

 The Memory hierarchy system consists of all storage devices employed in a
computer system, from the slow but high capacity auxiliary memory to a
relatively faster main memory, to an even smaller and faster cache memory
accessible to the high speed processing logic.

 The cache is used for storing segments of programs currently being executed in
the CPU and temporary data frequently needed in the present calculations.
 By making programs and data available at a rapid rate, it is possible to increase
the performance rate of the computer.
MAIN MEMORY

 It is the central storage unit in a computer system.


 It is a relatively large and fast memory used to store programs and data
during the computer operation.
 The principal technology used for the main memory is based on
semiconductor integrated circuits.
 Integrated circuit RAM chips are available in two possible operating modes,
static and dynamic.
 The static RAM (SRAM) consists essentially of internal flip flops that store
the binary information. The stored information remains valid as long as
power is applied to the unit.
 The dynamic RAM (DRAM) stores the binary information in the form of
electric charges that are applied to capacitors. The capacitors are provided
inside the chip by MOS transistors. The stored charge on the capacitors
tend to discharge with time and the capacitors must be periodically
recharged by refreshing the dynamic memory.
 Most of the main memory in a general purpose computer is made up of
RAM integrated circuit chips, but a portion of the memory may be
constructed with ROM chips.
MAIN MEMORY

 ROM is used for storing programs that are permanently resident in
the computer and for tables of constants that do not change in
value once the production of the computer is completed.
 The ROM portion of main memory is needed for storing an initial
program called a bootstrap loader
 The bootstrap loader is a program whose function is to start the
computer software operating when power is turned on.
 RAM and ROM chips are available in a variety of sizes.
 If the memory needed for the computer is larger than the
capacity of one chip, it is necessary to combine a number of chips
to form the required memory size.
 Example: a 1024 x 8 memory constructed with 128 x 8 RAM chips
and 512 x 8 ROM chips.
 A RAM chip is better suited for communication with the CPU if it
has one or more control inputs that select the chip only when
needed.
MAIN MEMORY

 Another common feature is a bidirectional data bus that allows the
transfer of data either from memory to CPU during a read operation, or
from CPU to memory during a write operation.

 A bidirectional bus can be constructed with three state buffers.


 A three state buffer output can be placed in one of these possible states: a
signal equivalent to logic 1, a signal equivalent to logic 0, or a high
impedance state.
 The logic 1 and 0 are normal digital signals.
 The high impedance state behaves like an open circuit, which means that
the output does not carry a signal and has no logic significance.
MAIN MEMORY

 The capacity of the memory is 128 words of 8 bits per word.


 This requires a 7 bit address and an 8 bit bidirectional data bus.
 The read and write inputs specify the memory operation and the two
chips select (CS) control inputs are for enabling the chip only when it is
selected by the microprocessor.
 The availability of more than one control input to select the chip
facilitates the decoding of the address lines when multiple chips are
used in the microcomputer.
 When the chip is selected, the two binary states in this line specify the
two operations of read or write.
MAIN MEMORY

 A ROM chip is organized externally in a similar manner; however, since a
ROM can only be read, the data bus can only be in an output mode.

 For the same size chip, it is possible to have more bits of ROM than of RAM,
because the internal binary cells in ROM occupy less space than the RAM.
 The two chip select inputs must be CS1=1 and CS2=0 for the unit to operate.
 Otherwise, the data bus is in a high impedance state.
 There is no need for a read or write control because the unit can only read.
 Thus when the chip is enabled by the two select inputs, the byte selected by
the address lines appears on the data bus.
Memory address map

• The designer of a computer system must calculate the amount of memory required for
the particular application and assign it to either RAM or ROM.
• The interconnection between memory and processor is then established from
knowledge of the size of memory needed and the type of RAM and ROM chips available.
• The addressing of memory can be established by means of a table that specifies the
memory address assigned to each chip.
• The table, called a memory address map, is a pictorial representation of assigned
address space for each chip in the system.
• Example , assume that a computer system needs 512 bytes of RAM and 512 bytes of
ROM
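For the example above (512 bytes of RAM built from 128 x 8 chips, plus a 512 x 8 ROM chip), the address map can be generated programmatically. The base addresses below follow a conventional layout (RAM starting at address 0, ROM at 0x200); they are illustrative, not quoted from the text.

```python
# Build a memory address map: four 128 x 8 RAM chips, one 512 x 8 ROM.
ranges = []
base = 0x000
for chip in range(4):                        # four 128-byte RAM chips
    ranges.append(("RAM %d" % (chip + 1), base, base + 127))
    base += 128
ranges.append(("ROM", 0x200, 0x200 + 511))   # ROM placed after the RAM space

for name, lo, hi in ranges:
    print("%-5s %04X - %04X" % (name, lo, hi))
```

Each row corresponds to one line of the memory address map table: the chip name and the hexadecimal range its chip-select decoding must answer to.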
AUXILIARY MEMORY

• The important characteristics of any device are its access mode, access
time, transfer rate, capacity, and cost.
• The average time required to reach a storage location in memory and
obtain its contents is called the access time.
• In electromechanical devices with moving parts such as disks and tapes,
the access time consists of a seek time required to position the read-write
head to a location and a transfer time required to transfer data to or from
the device.
• Auxiliary storage is organized in records or blocks.
• A record is a specified number of characters or words.
• Reading or writing is always done on entire records.
• Magnetic drums and disks consist of high speed rotating surfaces
coated with a magnetic recording medium.
• The recording surface rotates at uniform speed.
• Bits are recorded as magnetic spots on the surface as it passes a
stationary mechanism called a write head.
AUXILIARY MEMORY

• Stored bits are detected by a change in magnetic filed produced by a recorded spot on
the surface as it passes through a read head.
• The amount of surface available for recording in a disk is greater than in a drum of equal
physical size.
• Therefore, more information can be stored on a disk than on a drum of comparable size
• For this reason, disks have replaced drums in more recent computers.
• Magnetic disk include hard disks and floppy disks.
• Working principle is same for both hard disks and floppy disks.
• A magnetic disk is a surface device.
• It stores data on its surface.
• Its surface is divided into circular concentric tracks, and each track is divided into sectors.
Disk controller:

 Magnetic disk drives require a controller.


 The controller converts instructions received from software to
electrical signals to operate disks.
 Hard disk controller and floppy disk controller are available in IC
form.
 The functions of a disk controllers are:
• To interface a disk drive system to the CPU.
• Disk drive selection, because a computer uses more than one disk
drive
• Track and sector selection
• To issue commands to the disk drive system to perform read/write
operation
• Data separation
• Serial to parallel and parallel to serial conversion
• Error detection, etc.
Memory-CACHE -CPU

 When the processor attempts to read a word of memory, a check is made to determine
if the word is in the cache.
 If so, the word is delivered to the processor.
 If not, a block of main memory, consisting of some fixed number of words, is read into
the cache and then the word is delivered to the processor
 Main memory consists of up to 2^n addressable words, with each word having a unique
n-bit address.
 For mapping purposes, this memory is considered to consist of a number of fixed length
blocks of K words each.
 That is, there are M = 2^n / K blocks.
Memory-CACHE -CPU

 The cache consists of C lines.


 Each line contains K words, plus a tag of a few bits; the number of words in the line is
referred to as the line size.
 The number of lines is considerably less than the number of main memory blocks ( C <<
M).
 At any time, some subset of the blocks of memory resides in lines in the cache.
 If a word in a block of memory is read, that block is transferred to one of lines of the
cache.
 An individual line cannot be uniquely and permanently dedicated to a particular block
 Thus, each line includes a tag that identifies which particular block is currently being
stored.
 The tag is usually a portion of the main memory address
 The processor generates the read address (RA) of a word to be read.
 If the word is contained in the cache, it is delivered to the processor.
 Otherwise, the block containing that word is loaded into the cache, and the word is
delivered to the processor.
 the cache connects to the processor via data, control, and address lines.
 The data and address lines also attach to data and address buffers, which attach to a
system bus from which main memory is reached.
Memory-CACHE -CPU

 When a cache hit occurs, the data and address buffers are disabled and communication
is only between processor and cache, with no system bus traffic. When a cache miss
occurs, the desired address is loaded onto the system bus and the data are returned
through the data buffer to both the cache and the processor.
Mapping Function
• Cache of 64kByte
• Cache block of 4 bytes
– i.e. cache is 16k (2^14) lines of 4 bytes
• 16MBytes main memory
• 24 bit address
– (2^24 = 16M)
• There are fewer cache lines than main memory blocks, an algorithm
is needed for mapping main memory blocks into cache lines.
• Further, a means is needed for determining which main memory
block currently occupies a cache line.
• The choice of the mapping function dictates how the cache is
organized. Three techniques can be used: direct, associative, and set
associative.
Direct Mapping
• Each block of main memory maps to only one
cache line
– i.e. if a block is in cache, it must be in one specific
place
• Address is in two parts
• Least Significant w bits identify unique word
• Most Significant s bits specify one memory block
• The MSBs are split into a cache line field r and a
tag of s-r (most significant)
Direct Mapping
Address Structure

Tag: s − r = 8 bits | Line or Slot: r = 14 bits | Word: w = 2 bits

• 24 bit address
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
– 8 bit tag (=22-14)
– 14 bit slot or line
• No two blocks in the same line have the same Tag field
• Check contents of cache by finding line and checking Tag
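The tag/line/word split above can be computed directly with shifts and masks; the field widths are those of the 24-bit example, but the addresses used below are arbitrary test values.

```python
# Split a 24-bit main memory address into the direct-mapping fields
# of the example above: tag (8 bits), line (14 bits), word (2 bits).
TAG_BITS, LINE_BITS, WORD_BITS = 8, 14, 2

def split_address(addr):
    word = addr & ((1 << WORD_BITS) - 1)                 # low 2 bits
    line = (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)  # middle 14 bits
    tag = addr >> (WORD_BITS + LINE_BITS)                # high 8 bits
    return tag, line, word

print(split_address(0x000104))   # (0, 65, 0)
print(split_address(0xFF0007))   # (255, 1, 3)
```

A cache lookup then reads the line indexed by the 14-bit field and compares its stored tag with the 8-bit tag field of the address.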
Direct Mapping from Cache to Main Memory
Direct Mapping
Cache Line Table

Cache line    Main Memory blocks held
0             0, m, 2m, 3m, … 2^s − m
1             1, m+1, 2m+1, … 2^s − m + 1
…
m−1           m−1, 2m−1, 3m−1, … 2^s − 1
Direct Mapping Cache Organization
Direct
Mapping
Example
Direct Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or
bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w
= 2^s
• Number of lines in cache = m = 2^r
• Size of tag = (s – r) bits
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block
– If a program accesses 2 blocks that map to the
same line repeatedly, cache misses are very high
Associative Mapping
• A main memory block can load into any line of
cache
• Memory address is interpreted as tag and
word
• Tag uniquely identifies block of memory
• Every line’s tag is examined for a match
• Cache searching gets expensive
Associative Mapping from
Cache to Main Memory
Fully Associative Cache Organization
Associative
Mapping
Example
Associative Mapping
Address Structure
Tag: 22 bits | Word: 2 bits
• 22 bit tag stored with each 32 bit block of data


• Compare tag field with tag entry in cache to check for hit
• Least significant 2 bits of address identify which byte is
required from the 32 bit data block
• e.g.
– Address Tag Data Cache line
– FFFFFC FFFFFC 24682468 3FFF
Associative Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or
bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w
= 2^s
• Number of lines in cache = undetermined
• Size of tag = s bits
Set Associative Mapping
• Cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given set
– e.g. Block B can be in any line of set i
• e.g. 2 lines per set
– 2 way associative mapping
– A given block can be in one of 2 lines in only one
set
Set Associative Mapping
Example
• 13 bit set number
• Set number = block number in main memory modulo 2^13
• Addresses that differ only in the tag, i.e. by a multiple of
2^15, such as 000000, 008000, 010000, 018000 …, map to
the same set
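For the two-way example above (4-byte blocks, 2^13 sets), the set number computation can be checked directly; the addresses below are chosen to differ only in the tag field.

```python
# Set-associative mapping: set number = block number modulo 2**13.
SETS, BLOCK = 2 ** 13, 4

def set_number(addr):
    return (addr // BLOCK) % SETS

# Addresses differing by multiples of 2**15 share the same set...
sets = {set_number(a) for a in (0x000000, 0x008000, 0x010000, 0x018000)}
print(sets)   # {0} -- all four fall in set 0

# ...while an address differing in the set field does not.
print(set_number(0x00A000))   # a different set number
```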
UNIT V

THE CENTRAL PROCESSING UNIT


Introduction
• The CPU is the main brain of the computer
system; it is here that the whole processing
takes place. It consists of three major parts:
 Register set
 Arithmetic logic unit (ALU)
 Control unit
Introduction
• Computer architecture is sometimes defined as the computer
structure and behaviour as seen by the programmer that uses
machine language instructions.
• This includes the instruction formats, addressing modes, the
instruction set, and the general organization of the CPU registers.
• From the designer’s point of view, the computer instruction set
provides the specifications for the design of the CPU.
• The design of a CPU is a task that in large part involves choosing the
hardware for implementing the machine instructions.
• The user who programs the computer in machine or assembly
language must be aware of the register set, the memory structure,
the type of data supported by the instructions and the function that
each instruction performs.
Block diagram of Digital Computer
General Register Organization

• The output of each register is connected to two multiplexers
(MUX) to form the two buses A and B.
• The selection lines in each multiplexer select one register or the
input data for the particular bus.
• The A and B buses form the inputs to a common arithmetic logic
unit (ALU).
• The operation selected in the ALU determines the arithmetic or
logic micro operation that is to be performed.
• The result of the micro operation is available for output data and
also goes into the inputs of all the registers.
• The register that receives the information from the output bus is
selected by a decoder.
• The control unit that operates the CPU bus system directs the
information flow through the registers and ALU by selecting the
various components in the system.
General Register Organization

• For example, to perform the operation
• R1 ← R2 + R3
• The control must provide binary selection variables to the following selection
inputs
1. MUX A selector (SELA): to place the content of R2 into bus A
2. MUX B selector (SELB): to place the content of R3 into bus B
3. ALU operation selector (OPR): to provide the arithmetic addition A+B
4. Decoder destination selector (SELD): to transfer the content of the output bus into
R1
• The four control selection variables are generated in the CU and must be
available at the beginning of a clock cycle.
• The data from the two source registers propagate through the gates in the
multiplexers and the ALU, to the output bus, and into the inputs of the
destination register, all during the clock cycle interval.
• Then, when the next clock transition occurs, the binary information from the
output bus is transferred into R1.
• To achieve a fast response time, the ALU is constructed with high speed
circuits.
Control word
• There are 14 binary selection inputs in the unit, and their combined value specifies a
control word.

• The three bits of SELA select a source register for the A input of the ALU.
• The three bits of SELB select a register for the B input of the ALU.
• The three bits of SELD select a destination register using the decoder and its seven
load outputs.
• The five bits of OPR select one of the operations in the ALU.
• The 14-bit control word, when applied to the selection inputs, specifies a particular
micro operation.
• The encoding of the register selections is specified in the following table.
• The 3 bit binary code listed in the first column of the table specifies the binary code
for each of the three fields.
• When SELA or SELB is 000, the corresponding multiplexer selects the external input
data.
• when SELD=000, no destination register is selected but the contents of the output
bus are available in the external output.
Control word
OPR field
• The OPR field has five bits and each operation
is designated with a symbolic name.
Examples of Micro operations:

• A control word of 14 bits is needed to specify a
microoperation in the CPU.
• The control word for a given microoperation can be
derived from the selection variables. For example, consider the
subtract microoperation given by the statement
• R1 ← R2 – R3, which specifies R2 for the A input of the ALU, R3
for the B input of the ALU, R1 for the destination register,
and an ALU operation to subtract A − B.
• The binary control word for the subtract microoperation, 010 011 001 00101, is
obtained as follows:
• Field: SELA SELB SELD OPR
• Symbol: R2 R3 R1 SUB
• Control word 010 011 001 00101
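As a sketch (not the textbook's hardware), the 14-bit control word above can be assembled programmatically. The register codes follow the example in the text; the helper name `encode_control_word` and the dictionary layout are ours:

```python
# Assemble a 14-bit control word: SELA(3) | SELB(3) | SELD(3) | OPR(5).
# Register codes follow the example in the text (R1=001, R2=010, R3=011);
# the helper name `encode_control_word` is ours, not the textbook's.
REG = {"Input": 0b000, "R1": 0b001, "R2": 0b010, "R3": 0b011,
       "R4": 0b100, "R5": 0b101, "R6": 0b110, "R7": 0b111}
OPR = {"SUB": 0b00101}  # only the opcode used in the example above

def encode_control_word(sela, selb, seld, opr):
    word = (REG[sela] << 11) | (REG[selb] << 8) | (REG[seld] << 5) | OPR[opr]
    return format(word, "014b")

# R1 <- R2 - R3
print(encode_control_word("R2", "R3", "R1", "SUB"))  # 01001100100101
```

Reading the result in 3-3-3-5 groups reproduces the fields SELA=010, SELB=011, SELD=001, OPR=00101 of the subtract example.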
Symbolic designation
• The most efficient way to generate control words with a large number of
bits is to store them in a memory unit.
• A memory unit that stores control words is referred to as a control
memory.
• This type of control is referred to as micro programmed control
Stack Organization
• A stack is a storage device that stores information in such a manner that the item
stored last is the first item retrieved (LIFO).
• The register that holds the address for the stack is called a
stack pointer(SP). Because its value always points at the
top item in the stack.
• The operation of insertion is called push (a new item is pushed onto the top of the
stack).
• The operation of deletion is called pop (the top item is removed from the stack).
• However, nothing is pushed or popped in a computer
stack.
• These operations are simulated by incrementing or
decrementing the stack pointer register.
Register stack:
• A stack can be placed in a portion of a large memory, or it can be organized as a
collection of a finite number of memory words or registers.
Register stack:
 The push operation is implemented with the following
sequence of microoperations:
• SP ← SP + 1  Increment stack pointer
• M[SP] ← DR  Write item on top of the stack
• If (SP = 0) then (FULL ← 1)  Check if stack is full
• EMTY ← 0  Mark the stack not empty
 The pop operation consists of the following sequence
of microoperations:
• DR ← M[SP]  Read item from the top of the stack
• SP ← SP − 1  Decrement stack pointer
• If (SP = 0) then (EMTY ← 1)  Check if stack is empty
• FULL ← 0  Mark the stack not full
Memory Stack:
• The implementation of a stack in the CPU is done by assigning a portion of memory
to a stack operation and using a processor register as a stack pointer.
• A portion of computer memory partitioned into three
segments: program, data and stack.
• The program counter PC points at the address of the next
instruction in the program.
• The address register AR points at an array of data. The stack pointer SP points at
the top of the stack.
• The three registers are connected to a common address bus, and any one of them can
provide an address for memory. PC is used during the fetch phase to read an
instruction.
• AR is used during the execute phase to read an operand.
• SP is used to push or pop items into or from the stack.
Memory Stack:
• A new item is inserted with the push operation as follows:
• SP ← SP − 1
• M[SP] ← DR
• The stack pointer is decremented so that it points at the address of the next
word. A memory write operation inserts the word from DR into the top of the
stack.
• A new item is deleted with a pop operation as follows:
• DR ← M[SP]
• SP ← SP + 1
Reverse Polish Notation:
• A stack organization is very effective for evaluating arithmetic expressions.
• The common mathematical method of writing arithmetic expressions imposes
difficulties when evaluated by a computer.
• The common arithmetic expressions are written in infix notation, with each
operator written between the operands.
• Consider the simple arithmetic expression.
• A*B + C*D
• To evaluate this arithmetic expression it is necessary to compute the product A*B,
store the product while computing C*D, and then sum the two products.
• The Polish mathematician Lukasiewicz showed that arithmetic expressions can be
represented in prefix notation, with the operator before the operands. The postfix
notation, referred to as Reverse Polish Notation (RPN), places the operator after
the operands.
• A+B infix notation
• +AB Prefix or Polish notation
• AB+ Postfix or Reverse Polish Notation
• Reverse Polish Notation is in a form suitable for stack manipulation.
• The expression A*B+C*D is written in Reverse Polish Notation as AB*CD*+
Reverse Polish Notation:
• Scan the expression from left to right.
• When an operator is reached, perform the operation with the two operands found on
the left side of the operator.
• Continue to scan the expression and repeat the procedure for every operator
encountered until there are no more operators.
• We perform the operation A*B and replace A,B and * by the product to obtain (A*B)CD*+
• Where (A*B) is a single quantity obtained from the product.
• The next operator is a * and its previous two operands are C and D, so we perform C *D
and obtain an expression with two operands and one operator:
• (A*B)(C*D)+ the next operator is + and the two operands to be added are the two
products, so we add the two quantities to obtain result.
• This hierarchy dictates that we first perform all arithmetic inside inner parentheses, then
inside outer parentheses, and do multiplication and division operation before addition
and subtraction operations.
• (i.e., the BODMAS rule  Brackets, Orders, Division, Multiplication, Addition,
Subtraction)
• Consider the expression: (A+B)*[C*(D+E)+F], The converted expression is AB+DE+C*F+*.
• Proceeding from left to right, we first add A and B, then add D and E.
• At this point we are left with (A+B)(D+E)C*F+*
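The slides convert infix to postfix by hand. A standard mechanical equivalent, not covered in the slides, is Dijkstra's shunting-yard algorithm; the sketch below handles single-letter operands and the four basic operators only:

```python
# Infix-to-postfix (RPN) conversion via the shunting-yard algorithm.
# This is an addition of ours, not the slides' hand method; it assumes
# single-character operand tokens and left-associative operators.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_rpn(tokens):
    out, ops = [], []
    for t in tokens:
        if t in PREC:
            while ops and ops[-1] in PREC and PREC[ops[-1]] >= PREC[t]:
                out.append(ops.pop())
            ops.append(t)
        elif t == "(":
            ops.append(t)
        elif t == ")":
            while ops[-1] != "(":
                out.append(ops.pop())
            ops.pop()                 # discard the "("
        else:                         # operand
            out.append(t)
    out.extend(reversed(ops))
    return "".join(out)

print(to_rpn("A*B+C*D"))                  # AB*CD*+
print(to_rpn(list("(A+B)*(C*(D+E)+F)")))  # AB+CDE+*F+*
```

For the second expression the algorithm emits AB+CDE+*F+*, which differs in operand ordering from the slides' AB+DE+C*F+* but evaluates identically, since C*(D+E) equals (D+E)*C.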
Evaluation of Arithmetic Expressions:
• The following microoperations are executed with the stack when an operation is
entered in a calculator or issued by the control in a computer.
• The two topmost operands in stack are used for the operation
• The stack is popped and the result of the operation replaces the lower operand
• The following numerical example may clarify this procedure.
• Consider the arithmetic expression. (3*4) + (5*6) In reverse polish notation, this is
expressed as 34*56*+
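The stack evaluation of 34*56*+ can be sketched directly; the function name is ours:

```python
# Stack evaluation of an RPN expression, following the rule above:
# operands are pushed; an operator pops the two topmost operands and
# pushes the result back (the result replaces the lower operand).
def eval_rpn(tokens):
    stack = []
    for t in tokens:
        if t == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif t == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            stack.append(int(t))
    return stack.pop()

print(eval_rpn("34*56*+"))  # 42, i.e. (3*4) + (5*6)
```

Each operator consumes exactly the two topmost stack entries, which is why no parentheses or precedence rules are needed during evaluation.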
• Most compilers, irrespective of their CPU organization, convert all arithmetic
expressions into Polish notation anyway, because this is the most efficient method
for translating arithmetic expressions into machine language instructions.
RISC (Reduced Instruction Set Computer)
 RISC computers use fewer instructions with simple constructs so they can be
executed much faster within the CPU without having to use memory as often.
 The concept of RISC architecture involves an attempt to reduce execution time by
simplifying the instruction set of the computer.
 The major characteristics of a RISC processor are:
• Relatively few instructions
• Relatively few addressing modes
• Memory access limited to load and store instructions
• All operations done within the registers of the CPU
• Fixed length, easily decoded instruction format
• Single cycle instruction execution
• Hardwired rather than microprogrammed control.
 Other characteristics attributed to RISC architecture are:
• A relatively large number of registers in the processor unit.
• Use of overlapped register windows to speed up procedure call and return.
• Efficient instruction pipeline.
• Compiler support for efficient translation of high-level language programs into
machine language programs.
CISC(Complex Instruction Set Computer)
 A computer with a large number of instructions is classified as a
Complex Instruction Set Computer (CISC).
 The essential goal of a CISC architecture is to attempt to provide a single
machine instruction for each statement that is written in a high level
language.
 One reason for the trend to provide a complex instruction set is the
desire to simplify the compilation and improve the overall computer
performance.
 In summary, the major characteristic of CISC architecture are:
• A large number of instructions – typically from 100 to 250 instructions.
• Some instructions that perform specialized tasks and are used
infrequently
• A large variety of addressing modes – typically from 5 to 20 different
modes.
• Variable length instruction formats
• Instructions that manipulate operands in memory.
Control Unit
• The execution of an instruction involves the execution of a sequence of substeps,
generally called cycles.
• For example, an execution may consist of fetch, indirect, execute, and interrupt cycles.
• Each cycle is in turn made up of a sequence of more fundamental operations, called
micro operations.
• A single micro operation generally involves a transfer between registers, a transfer
between a register and an external bus or a simple ALU operation.
• The control unit of a processor performs two tasks:
(1) It causes the processor to execute micro operations in the proper sequence,
determined by the program being executed,
(2) it generates the control signals that cause each micro operation to be executed.
• The control signals generated by the control unit cause the opening and closing of
logic gates, resulting in the transfer of data to and from registers and the operation of
the ALU.
• One technique for implementing a control unit is referred to as hardwired
implementation, in which the control unit is a combinatorial circuit.
• Its input logic signals, governed by the current machine instruction, are transferred
into a set of output control signals.
• Micro operations are the functional, or atomic, operations of a processor.
Control Unit
 These functional requirements are the basis for the design and implementation of
the control unit.
 The following three step process leads to a characterization of the control
unit:
• Define the basic elements of the processor
• Describe the micro operations that the processor performs
• Determine the functions that the control unit must perform to cause the
micro operations to be performed.
Control Unit
 For the control unit to perform its function, it must have inputs that allow it to
determine the state of the system and outputs that allow it to control the behaviour
of the system.
 The inputs are:
• Clock: this is how the control unit “keeps time”.
• Instruction register: The opcode of the current instruction is used to determine which
micro operations to perform during the execute cycle.
• Flags: These are needed by the control unit to determine the status of the processor
and the outcome of previous ALU operations.
• Control signals from control bus.
 The outputs are
• Control signals within the processor: These are two types: those that cause data to be
moved from one register to another, and those that activate specific ALU functions.
• Control signals to control bus: These are also of two types: control signals to memory,
and control signals to the I/O modules.
 A wide variety of techniques have been used for control unit implementation. Most
of these fall into one of two categories:
• Hardwired implementation
• Microprogrammed implementation
Control Unit
 Hardwired implementation:
• the control unit is essentially a combinational circuit.
• Its input logic signals are transformed into a set of output logic signals, which are the
control signals.
• The key inputs are the instruction register, the clock, flags, and control bus signals. In the
case of the flags and control bus signals, each individual bit typically has some meaning.
• The other two inputs, however, are not directly useful to the control unit.
• The internal logic of the control unit that produces output control signals as a function of
its input signals.
 Microprogrammed implementation:
• An alternative to a hardwired control unit is a microprogrammed control unit, in which
the logic of the control unit is specified by a microprogram.
• A microprogram consists of a sequence of instructions in a microprogramming
language. These are very simple instructions that specify micro operations.
• As in a hardwired control unit, the control signals generated by a microinstruction are
used to cause register transfers and ALU operations.
• A microprogrammed control unit is a relatively simple logic circuit that is capable of (1)
sequencing through microinstructions and (2) generating control signals to execute each
microinstruction.
Control Unit
-- Control Unit Microarchitecture --
• The set of microinstructions is stored in the control memory.
• The control address register contains the address of the next microinstruction to be read.
• When a microinstruction is read from the control memory, it is transferred to a control buffer
register.
• Thus, reading a microinstruction from the control memory is the same as executing that
microinstruction.
• The third element is a sequencing unit that loads the control address register and issues a read
command.
• The two basic tasks performed by a microprogrammed control unit are:
• Microinstruction sequencing: Get the next microinstruction from the control memory.
• Microinstruction execution: Generate the control signals needed to execute the
microinstruction.
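The two tasks can be sketched as a fetch-and-emit loop over a control memory. The control-memory contents, the register transfers named in the signal strings, and the `next`-address field are illustrative assumptions, not taken from the slides:

```python
# Sketch of a microprogrammed control unit's two tasks:
# sequencing (fetch next microinstruction) and execution (emit signals).
# Control-memory contents and signal names are illustrative only.
control_memory = {
    0: {"signals": ["MAR<-PC"], "next": 1},
    1: {"signals": ["MBR<-M[MAR]", "PC<-PC+1"], "next": 2},
    2: {"signals": ["IR<-MBR"], "next": None},  # end of fetch routine
}

car = 0                      # control address register
emitted = []
while car is not None:
    cbr = control_memory[car]        # read into control buffer register
    emitted.extend(cbr["signals"])   # microinstruction execution
    car = cbr["next"]                # microinstruction sequencing

print(emitted)  # ['MAR<-PC', 'MBR<-M[MAR]', 'PC<-PC+1', 'IR<-MBR']
```

In real hardware the sequencing unit would compute the next address (increment, branch on flags, or map from the opcode) rather than reading an explicit `next` field, but the fetch/execute alternation is the same.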
ALU design
• A possible choice for the shift unit would be a bidirectional
shift register with parallel load.
• Information can be transferred to the register in parallel
and then shifted to the right or left.
• In this type of configuration, a clock pulse is needed for
loading the data into the register and another pulse is
needed to initiate the shift.
• The content of a register that has to be shifted is first
placed onto a common bus whose output is connected
to the combinational shifter and the shifted number is
then loaded back into the register
• This requires only one clock pulse for loading the shifted
value into the register.
Arithmetic logic shift unit:
• Instead of having individual registers performing the microoperations
directly, computer systems employ a number of storage registers
connected to a common operational unit called an arithmetic logic
unit (ALU).
• To perform a Microoperation, the contents of specified registers are
placed in the inputs of the common ALU.
• The ALU performs an operation and the result of the operation is
then transferred to a destination register.
• The ALU is a combinational circuit so that the entire register transfer
operation from the source registers through the ALU and into the
destination register can be performed during one clock pulse period.
• The shift microoperations are often performed in a separate unit, but
sometimes the shift unit is made part of the overall ALU.
• One stage of an arithmetic logic shift unit is shown in following figure.
The subscript i designates a typical stage. Inputs Ai and Bi are applied
to both the arithmetic and logic units.
Arithmetic logic shift unit:
• A particular microoperation is selected with inputs S1 and S0.
• A 4 x 1 multiplexer at the output chooses between an arithmetic output Ei and a logic output
Hi.
• The data in the multiplexer are selected with inputs S3 and S2.
• The other two data inputs to the multiplexer receive inputs Ai-1 for the shift right operation and
Ai+1 for the shift left operation.
• The output carry Ci+1 of a given arithmetic stage must be connected to the input carry Ci of the
next stage in sequence.
• The input carry to the first stage is the input carry Cin, which provides a selection variable for the
arithmetic operations.
• The circuit whose one stage is specified in figure provides eight arithmetic operations, four logic
operations and two shift operations.
• Each operation is selected with the five variables S3,S2,S1,S0 and Cin.
• The input carry Cin is used for selecting an arithmetic operation only.
• The above table lists the 14 operations of the ALU.
• The first eight are arithmetic operations and are selected with S3S2 =00.
• The next four are logic operations and are selected with S3S2=01.
• The input carry has no effect during the logic operations and is marked with don’t care X’s.
• The last two operations are shift operations and are selected with S3S2=10 and 11.
• The other three selection inputs have no effect on the shift.
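The selection behaviour of one stage can be sketched as a lookup on S3 S2 (unit) and S1 S0 plus Cin (operation). The specific operation table below follows common textbook ALU designs (e.g. Mano's) and should be treated as an assumption rather than the slides' exact table:

```python
# One-stage view of the arithmetic logic shift unit: S3 S2 pick the
# unit (arithmetic / logic / shift), and S1 S0 (plus Cin for the
# arithmetic unit) pick the operation. The operation assignments are
# an assumption modeled on Mano's classic ALU, not the slides' table.
def alu(s3, s2, s1, s0, cin, a, b, width=8):
    mask = (1 << width) - 1
    if (s3, s2) == (0, 0):                 # arithmetic unit; Cin matters
        ops = {(0, 0): a,                  # transfer A (or A+1 with Cin)
               (0, 1): a + b,              # add (or add with carry)
               (1, 0): a + (~b & mask),    # A + B' (A - B with Cin=1)
               (1, 1): a - 1}              # decrement (or transfer A)
        return (ops[(s1, s0)] + cin) & mask
    if (s3, s2) == (0, 1):                 # logic unit; Cin is don't-care
        ops = {(0, 0): a & b, (0, 1): a | b,
               (1, 0): a ^ b, (1, 1): ~a & mask}
        return ops[(s1, s0)]
    if (s3, s2) == (1, 0):                 # shift right
        return a >> 1
    return (a << 1) & mask                 # shift left

print(alu(0, 0, 0, 1, 0, 5, 3))  # 8 : A + B
print(alu(0, 0, 1, 0, 1, 5, 3))  # 2 : A + B' + 1 = A - B
```

This mirrors the slide's point that Cin acts as a selection variable only for the arithmetic operations and is a don't-care elsewhere.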
Pipeline Processing
• Pipelining is a technique of decomposing a sequential process into
suboperations, with each subprocess being executed in a special dedicated
segment that operates concurrently with all other segments.
• Each segment performs partial processing dictated by the way the task is
partitioned
• The result obtained from the computation in each segment is transferred to the
next segment in the pipeline.
• The final result is obtained after the data have passed through all segments.
• The name “pipeline” implies a flow of information analogous to an industrial
assembly line.
• It is characteristic of pipelines that several computations can be in progress in
distinct segments at the same time.
• The registers provide isolation between each segment so that each can operate
on distinct data simultaneously.
• Perhaps the simplest way of viewing the pipeline structure is to imagine that
each segment consists of an input register followed by a combinational circuit.
• The register holds the data and the combinational circuit performs the
suboperation in the particular segment.
Pipeline Processing
• The output of the combinational circuit in a given segment is applied to
the input register of the next segment. A clock is applied to all registers
after enough time has elapsed to perform all segment activity.
• In this way the information flows through the pipeline one step at a time.
The pipeline organization will be demonstrated by means of a simple
example. Suppose that we want to perform the combined multiply and
add operations with a stream of numbers.
• Ai * Bi + Ci for i=1,2,3,......7
• Each suboperation is to be implemented in a segment within a pipeline.
• Each segment has one or two registers and a combinational circuit.
• R1 through R5 are registers that receive new data with every clock pulse.
The multiplier and adder are combinational circuits.
• The suboperations performed in each segment of pipeline are as follows:
• R1 Ai , R2 Bi input Ai and Bi
• R3  R1 * R2, R4  Ci multiply and input Ci
• R5  R3 + R4 Add Ci to product
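A cycle-by-cycle simulation of this three-segment pipeline can make the clocking concrete; this is our sketch, using three data items rather than the seven in the text:

```python
# Cycle-by-cycle simulation of the three-segment pipeline for Ai*Bi + Ci.
# All registers R1..R5 load simultaneously on each clock pulse; None
# marks a pipeline stage that is not yet filled (or already drained).
A = [1, 2, 3]; B = [4, 5, 6]; C = [7, 8, 9]
r1 = r2 = r3 = r4 = r5 = None
results = []
for clock in range(len(A) + 2):           # two extra pulses drain the pipe
    # compute every register's next value before loading any of them
    nr5 = r3 + r4 if r3 is not None else None       # segment 3: add
    nr3 = r1 * r2 if r1 is not None else None       # segment 2: multiply
    nr4 = C[clock - 1] if 1 <= clock <= len(C) else None
    nr1 = A[clock] if clock < len(A) else None      # segment 1: input
    nr2 = B[clock] if clock < len(B) else None
    r1, r2, r3, r4, r5 = nr1, nr2, nr3, nr4, nr5
    if r5 is not None:
        results.append(r5)

print(results)  # [11, 18, 27], i.e. Ai*Bi + Ci for each i
```

As the slides describe, the first result appears only after three pulses (the fill time), after which each additional pulse delivers one new result.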
Pipeline Processing
• The five registers are loaded with new data every clock pulse. The effect of each
clock is shown in the table.
• The first clock pulse transfers A1 and B1 into R1 and R2.
• The second clock pulse transfers the product of R1 and R2 into R3 and C1 into R4.
• The same clock pulse transfers A2 and B2 into R1 and R2.
• The third clock pulse operates on all three segments simultaneously.
Pipeline Processing
• It places A3 and B3 into R1 and R2, transfers the product of R1
and R2 into R3, transfers C2 into R4, and places the sum of R3 and
R4 into R5.
• It takes three clock pulses to fill up the pipe and retrieve the first
output from R5.
• From there on, each clock produces a new output and moves the
data one step down the pipeline.
• This happens as long as new input data flow into the system.
• When no more input data are available, the clock must continue
until the last output emerges out of the pipeline.