
COA Notes


COMPUTER ORGANIZATION AND ARCHITECTURE 2022-23

Title: Computer Organization and Architecture

Author’s Name: Ms. Ankita (Ph.D*, M.Tech.)

Mr. Sandeep Vishwakarma (Ph.D* , M.Tech.)

Published by: University Academy

Publishers Address: Mod Apartment, Vasundhara Enclave, Delhi -


110096

Printer Detail: eBook

Edition Detail: I

ISBN:

Copyright ©2023 University Academy

1 University Academy

COMPUTER ORGANIZATION AND ARCHITECTURE


COURSE OUTCOMES

Course Outcomes (On completion of this course, students will be able to:)

CO 1: Study the basic structure and operation of a digital computer system.
CO 2: Analyse the design of the arithmetic & logic unit and understand fixed-point and floating-point arithmetic operations.
CO 3: Implement control unit techniques and the concept of pipelining.
CO 4: Understand the hierarchical memory system, cache memories and virtual memory.
CO 5: Understand the different ways of communicating with I/O devices and standard I/O interfaces.

Syllabus
Unit 1 Introduction: Functional units of a digital system and their interconnections, buses, bus architecture, types of buses and bus arbitration. Register, bus and memory transfer. Processor organization, general register organization, stack organization and addressing modes.
Unit 2 Arithmetic and logic unit: Look-ahead carry adders. Multiplication: signed operand multiplication, Booth's algorithm and array multiplier. Division and logic operations. Floating-point arithmetic operations, arithmetic & logic unit design. IEEE standard for floating-point numbers.
Unit 3 Control Unit: Instruction types, formats, instruction cycles and sub-cycles (fetch and execute etc.), micro-operations, execution of a complete instruction. Program control, Reduced Instruction Set Computer, pipelining. Hardwired and microprogrammed control: microprogram sequencing, concept of horizontal and vertical microprogramming.
Unit 4 Memory: Basic concept and hierarchy, semiconductor RAM memories, 2D & 2 1/2D memory organization. ROM memories. Cache memories: concept and design issues & performance, address mapping and replacement. Auxiliary memories: magnetic disk, magnetic tape and optical disks. Virtual memory: concept and implementation.
Unit 5 Input/Output: Peripheral devices, I/O interface, I/O ports. Interrupts: interrupt hardware, types of interrupts and exceptions. Modes of data transfer: programmed I/O, interrupt-initiated I/O and Direct Memory Access; I/O channels and processors. Serial communication: synchronous & asynchronous communication, standard communication interfaces.


Index:
1. Introduction
   1.1. Functional Units of a Digital System and Their Interconnections
   1.2. Buses & Bus Architecture
   1.3. Types of Buses
   1.4. Bus Arbitration
   1.5. Register
   1.6. Bus and Memory Transfer
   1.7. Processor Organization
        1.7.1. General Register Organization
        1.7.2. Stack Organization
   1.8. Addressing Modes
2. Arithmetic and Logic Unit
   2.1. Look-Ahead Carry Adders
   2.2. Multiplication
        2.2.1. Signed Operand Multiplication
        2.2.2. Booth's Algorithm and Array Multiplier
   2.3. Division and Logic Operations
   2.4. Floating-Point Arithmetic Operations
   2.5. IEEE Standard for Floating-Point Numbers
   2.6. Arithmetic & Logic Unit Design
3. Control Unit
   3.1. Instruction Types
   3.2. Formats
   3.3. Instruction Cycles and Sub-Cycles (Fetch and Execute etc.)
   3.4. Micro-Operations
   3.5. Execution of a Complete Instruction
   3.6. Program Control
   3.7. Reduced Instruction Set Computer
   3.8. Pipelining
   3.9. Hardwired and Microprogrammed Control
   3.10. Types of Microprogrammed Control
        3.10.1. Horizontal Microprogramming
        3.10.2. Vertical Microprogramming
4. Memory
   4.1. Basic Concept and Hierarchy
   4.2. Semiconductor RAM Memories
   4.3. 2D & 2 1/2D Memory Organization
   4.4. ROM Memories
   4.5. Cache Memories
        4.5.1. Concept and Design Issues & Performance
        4.5.2. Address Mapping and Replacement
   4.6. Auxiliary Memories
        4.6.1. Magnetic Disk
        4.6.2. Magnetic Tape
        4.6.3. Optical Disks
   4.7. Virtual Memory: Concept and Implementation
5. Input/Output
   5.1. Peripheral Devices
   5.2. I/O Interface
   5.3. I/O Ports
   5.4. Interrupts
        5.4.1. Interrupt Hardware
        5.4.2. Types of Interrupts and Exceptions
   5.5. Modes of Data Transfer
        5.5.1. Programmed I/O
        5.5.2. Interrupt-Initiated I/O
        5.5.3. Direct Memory Access
   5.6. I/O Channels and Processors
   5.7. Serial Communication
        5.7.1. Synchronous & Asynchronous Communication
        5.7.2. Standard Communication Interfaces
Solved Question Paper 2020-21


UNIT-1
1.1. Functional Units of Digital System
• Computer organization describes the functions and design of the various units of a digital system.

• A general-purpose computer system is the best-known example of a digital system. Other examples include telephone switching exchanges, digital voltmeters, digital counters, electronic calculators and digital displays.
• Computer architecture deals with the specification of the instruction set and the hardware units that implement the instructions.
• Computer hardware consists of electronic circuits, displays, magnetic and optical storage media, and communication facilities.
• Functional units are the parts of the CPU (Central Processing Unit) that perform the operations and calculations called for by the computer program. A computer consists of five main components: an input unit, a central processing unit, a memory unit, an arithmetic & logic unit with a control unit, and an output unit.

Input unit
• Input units are used by the computer to read the data. The most commonly used input devices
are keyboards, mouse, joysticks, trackballs, microphones, etc.
• However, the most well-known input device is a keyboard. Whenever a key is pressed, the
corresponding letter or digit is automatically translated into its corresponding binary code and
transmitted over a cable to either the memory or the processor.


Central processing unit: The central processing unit, commonly known as the CPU, is the electronic circuitry within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logical, control and input/output (I/O) operations specified by the instructions.

Memory unit
• The memory unit is the storage area in which the running programs are kept, together with the data those programs need.
• The memory unit can be categorized in two ways: primary memory and secondary memory.
• It enables the processor to access running applications and services that are temporarily stored in specific memory locations.
• Primary storage is the fastest memory and operates at electronic speeds. Primary memory contains a large number of semiconductor storage cells, each capable of storing one bit of information. The word length of a computer is typically between 16 and 64 bits.
• It is a volatile form of memory: when the computer is shut down, anything held in RAM is lost.
• Cache memory is a small, fast memory used to speed up data access; it is tightly coupled with the processor.
• The most common examples of primary memory are RAM and ROM.
• Secondary memory is used when a large amount of data and many programs have to be stored on a long-term basis.
• It is a non-volatile form of memory: the data is stored permanently, irrespective of shutdown.
• The most common examples of secondary memory are magnetic disks, magnetic tapes and optical disks.
Arithmetic & logic unit: Most of the arithmetic and logical operations of a computer are executed in the ALU (Arithmetic and Logic Unit) of the processor. It performs arithmetic operations such as addition, subtraction, multiplication and division, as well as logical operations such as AND, OR and NOT.

Control unit


• The control unit is a component of a computer's central processing unit that coordinates the operation of the processor. It tells the computer's memory, arithmetic/logic unit and input and output devices how to respond to a program's instructions.
• The control unit is also known as the nerve center of a computer system.
• Let us consider, as an example, the addition of two operands by the instruction Add LOCA, R0. This instruction adds the operand at memory location LOCA to the operand in register R0 and places the sum in register R0. Internally, this instruction is carried out in several steps.
Output Unit
• The primary function of the output unit is to send the processed results to the user. Output
devices display information in a way that the user can understand.
• Output devices are pieces of equipment that are used to generate information or any other
response processed by the computer. These devices display information that has been held or
generated within a computer.
• The most common example of an output device is a monitor.

1.2. Buses, Bus Architecture


A bus that connects the major components (CPU, memory and I/O devices) of a computer system is called a system bus. A bus is a set of electrical wires (lines) that connects the various hardware components of a computer system. It works as a communication pathway through which information flows from one hardware component to another. The bus architecture is shown in the figure below.


• A computer system is made of different components such as memory, the ALU, registers, etc.
• Each component should be able to communicate with the others for proper execution of instructions and information flow.
• If we tried to implement a mesh topology among the different components, it would be really expensive.
• So we use a common component, the bus, to connect each necessary component.

1.3. Types of buses


The system bus contains 3 categories of lines used to provide communication between the CPU, memory and I/O, named:
1. Data Bus
2. Address Bus
3. Control Bus

Data Bus: As the name suggests, the data bus is used for transmitting data and instructions from the CPU to memory/I/O and vice versa. It is bidirectional. The width of a data bus refers to the number of bits (electrical wires) that the bus can carry at a time. Each line carries 1 bit at a time, so the number of lines in the data bus determines how many bits can be transferred in parallel. The width of the data bus is an important parameter because it determines how much data can be transmitted at one time. The wider the bus, the faster the data flow on the data bus and thus the better the system performance.
Examples-
• A 32-bit bus has thirty two (32) wires and thus can transmit 32 bits of data at a time.
• A 64-bit bus has sixty four (64) wires and thus can transmit 64 bits of data at a time.

Address Bus: As the name suggests, the address bus is used to carry addresses from the CPU to memory/I/O devices. It is used to identify a particular location in memory. It carries the source or destination address of data, i.e. where to store data or from where to retrieve it. It is unidirectional.


Example- When the CPU wants to read or write data, it sends the memory read or memory write control signal on the control bus to perform the operation on main memory, and the address of the memory location is sent on the address bus.
• If the CPU wants to read data stored at memory location (address) 4, it sends the value 4 in binary on the address bus.
The width of the address bus determines the amount of physical memory addressable by the processor.
• In other words, it determines the size of the memory that the computer can use.
• The wider the address bus, the more memory the computer can use.
• The addressing capacity of the system can be increased by adding more address lines.

Examples-
• An address bus that consists of 16 wires can convey 2^16 (= 64K) different addresses.
• An address bus that consists of 32 wires can convey 2^32 (= 4G) different addresses.
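The relationship between address-bus width and addressable memory can be checked with a short sketch (Python here, purely illustrative; the function name is ours):

```python
def addressable_locations(width_bits: int) -> int:
    """Number of distinct addresses an address bus of the given width can convey."""
    return 2 ** width_bits

# A 16-line address bus reaches 64K locations; a 32-line bus reaches 4G.
print(addressable_locations(16))  # 65536 (64K)
print(addressable_locations(32))  # 4294967296 (4G)
```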

Control Bus: As the name suggests, the control bus is used to transfer control and timing signals from one component to another. The CPU uses the control bus to communicate with the devices that are connected to the computer system, transmitting different types of control signals to the system components. It is bidirectional.
Typical control signals carried by the control bus:
Memory read – data at the addressed memory location is to be placed on the data bus.
Memory write – data on the data bus is to be stored at the addressed memory location.
I/O read – data at the addressed I/O location is to be placed on the data bus.
I/O write – data on the data bus is to be stored at the addressed I/O location.
Other control signals carried by the control bus include interrupt, interrupt acknowledge, bus request, bus grant and several others. These control signals indicate the type of action taking place on the system bus.
Example-
When CPU wants to read or write data, it sends the memory read or memory write control signal on
the control bus to perform the memory read or write operation from the main memory. Similarly,
when the processor wants to read from an I/O device, it generates the I/O read signal.

1.4. System Bus Arbitration


Bus arbitration is the procedure by which the active bus master accesses the bus, relinquishes control of it, and then transfers it to a different bus-seeking processor unit. A bus master is a controller that can access the bus at a given instant.
A conflict can occur if multiple DMA controllers, other controllers, or processors attempt to access the common bus simultaneously, yet only one is permitted to access it. Bus-master status can be held by only one processor or controller at a time. The bus arbitration method resolves these conflicts by coordinating the actions of all devices requesting memory transfers.
There are two approaches to bus arbitration:
• Centralized Bus Arbitration – the necessary arbitration is carried out by a single bus arbiter.
• Distributed Bus Arbitration – every device takes part in choosing the new bus master. A 4-bit identification number is allocated to each device on the bus; this ID determines the device's priority.
a) Centralized Bus Arbitration Methodologies: There are three methods of centralized bus arbitration, listed below:
i. Daisy Chaining method: This is a simple and cheap method in which all the bus masters use the same line for making bus requests. The bus grant signal propagates serially through each master until it encounters the first one that is requesting access to the bus. This master blocks the propagation of the bus grant signal, so no other requesting module receives the grant signal and hence none can access the bus. During any bus cycle, the bus master may be any device connected to the bus – the processor or any DMA controller unit.

Advantages:
• Simplicity and Scalability.
• The user can add more devices anywhere along the chain, up to a certain maximum value.

Disadvantages:
• The priority assigned to a device depends on its position along the chain.
• Propagation delay arises in this method.
• If one device fails, the entire system stops working.
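The serial grant propagation described above can be sketched as follows (Python; positions and names are illustrative):

```python
def daisy_chain_grant(requests):
    """Return the index of the master that receives the bus grant.

    The grant signal propagates serially from position 0 down the chain;
    the first requesting master absorbs it, blocking all masters behind it.
    Masters earlier in the chain therefore have higher priority.
    """
    for position, requesting in enumerate(requests):
        if requesting:
            return position
    return None  # no master is requesting the bus

# Masters 1 and 3 request; master 1, being closer to the arbiter, wins.
print(daisy_chain_grant([False, True, False, True]))  # 1
```

Note how the priority is fixed purely by chain position, which is exactly the first disadvantage listed above.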

ii. Polling or Rotating Priority method: Here a controller generates the address of the master to be granted the bus (each master has a unique address); the number of address lines required depends on the number of masters connected in the system. The controller generates a sequence of master addresses. When a requesting master recognizes its address, it activates the busy line and begins to use the bus.

Advantages –
• This method does not favor any particular device or processor.
• The method is also quite simple.
• If one device fails, the entire system does not stop working.

Disadvantages –
• Adding bus masters is difficult, as it increases the number of address lines of the circuit.

iii. Fixed priority or Independent Request method: In this, each master has a separate
pair of bus request and bus grant lines and each pair has a priority assigned to it. The
built-in priority decoder within the controller selects the highest priority request and
asserts the corresponding bus grant signal.

11 University Academy
COMPUTER ORGANIZATION AND ARCHITECTURE 2022-23

Advantages –
This method generates a fast response.
Disadvantages –
The hardware cost is high, as a large number of control lines is required.
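A sketch of the independent-request scheme (Python; the priority values and their ordering are illustrative assumptions):

```python
def independent_request_grant(requests, priorities):
    """Fixed-priority (independent request) arbiter sketch.

    requests[i] is True if master i asserts its own, separate request line;
    priorities[i] is master i's fixed priority (larger = higher). The
    priority decoder grants the bus to the highest-priority requester.
    """
    requesting = [i for i, raised in enumerate(requests) if raised]
    if not requesting:
        return None  # no bus-grant line is asserted
    return max(requesting, key=lambda i: priorities[i])

# Masters 0 and 2 request; master 2 has the higher priority and wins.
print(independent_request_grant([True, False, True], [1, 3, 2]))  # 2
```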

Distributed BUS Arbitration:


In distributed arbitration, all devices participate in the selection of the next bus master.

• In this scheme, each device on the bus is assigned a 4-bit identification number.
• When one or more devices request control of the bus, they assert the start-arbitration signal and place their 4-bit ID numbers on the arbitration lines, ARB0 through ARB3.
• These four arbitration lines are all open-collector, so more than one device can place its 4-bit ID number on them at the same time. If one device puts a 1 on a bus line and another device puts a 0 on the same line, the line's status will be 0. Each device reads the status of all lines through inverting buffers, so a device reads a bus status of 0 as logic 1. In this scheme, the device with the highest ID number has the highest priority.
• When two or more devices place their ID numbers on the bus lines, the highest ID number must be identified from the status of the lines. Consider two devices A and B, with ID numbers 1 and 6 respectively, requesting use of the bus.
• Device A puts the bit pattern 0001, and device B puts the bit pattern 0110. With this combination the status of the bus lines will be 1000; however, because of the inverting buffers, the code seen by both devices is 0111.


• Each device compares the code formed on the arbitration lines to its own ID, starting from the most significant bit. If it finds a difference at any bit position, it disables its drivers at that bit position and at all lower-order bits.
• It does so by placing a 0 at the input of those drivers. In our example, device A detects a difference on line ARB2 and hence disables its drivers on lines ARB2, ARB1 and ARB0. This causes the code on the arbitration lines to change to 0110, which means that device B has won the race.
• Decentralized arbitration offers high reliability because operation of the bus does not depend on any single device.
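The self-selection process can be simulated with a small sketch (Python, illustrative only). For simplicity it works directly with the code as seen through the inverting buffers, so the wired-OR of all drivers models the open-collector lines:

```python
def distributed_arbitration(ids, width=4):
    """Distributed (self-selecting) arbitration over open-collector lines.

    Each competing device drives its ID onto the shared arbitration lines.
    A device that sees a 1 on a line where it drove a 0 disables its
    drivers on that line and all lower-order lines; the highest ID wins.
    """
    # drive[d][b] is the bit device d drives on line b (MSB first).
    drive = {d: [(d >> b) & 1 for b in reversed(range(width))] for d in ids}
    for bit in range(width):  # compare from the most significant bit down
        line = max(drive[d][bit] for d in ids)  # wired-OR of all drivers
        for d in ids:
            if drive[d][bit] < line:  # device drove 0 but sees 1: it loses
                for lower in range(bit, width):
                    drive[d][lower] = 0  # disable this and lower-order drivers
    winner_code = [max(drive[d][b] for d in ids) for b in range(width)]
    return int("".join(map(str, winner_code)), 2)

# Devices with IDs 1 (0001) and 6 (0110) compete; device 6 wins, as in the text.
print(distributed_arbitration([1, 6]))  # 6
```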

1.5. Register

• Registers are a type of computer memory used to quickly accept, store, and transfer data and instructions that are being used immediately by the CPU. The registers used by the CPU are often termed processor registers.
• A processor register may hold an instruction, a storage address, or any kind of data (such as a bit sequence or individual characters).
• The computer needs processor registers for manipulating data and a register for holding a memory address. The register holding the memory address is used to calculate the address of the next instruction after execution of the current instruction is completed.

Following is the list of some of the most common registers used in a basic computer:

Register              Symbol  Number of bits  Function
Data register         DR      16              Holds memory operand
Address register      AR      12              Holds address for the memory
Accumulator           AC      16              Processor register
Instruction register  IR      16              Holds instruction code
Program counter       PC      12              Holds address of the instruction
Temporary register    TR      16              Holds temporary data
Input register        INPR    8               Carries input character
Output register       OUTR    8               Carries output character


The following image shows the register and memory configuration for a basic computer.

• The Memory unit has a capacity of 4096 words, and each word contains 16 bits.
• The Data Register (DR) contains 16 bits which hold the operand read from the memory
location.
• The Memory Address Register (MAR) contains 12 bits which hold the address for the memory
location.
• The Program Counter (PC) also contains 12 bits which hold the address of the next instruction
to be read from memory after the current instruction is executed.
• The Accumulator (AC) register is a general purpose processing register.
• The instruction read from memory is placed in the Instruction register (IR).
• The Temporary Register (TR) is used for holding the temporary data during the processing.
• The Input Register (INPR) holds the input character given by the user.
• The Output Register (OUTR) holds the output after processing the input data.

1.6. Bus and Memory Transfers


A digital system is composed of many registers, and paths must be provided to transfer information from one register to another. The number of wires connecting all of the registers would be excessive if separate lines were used between each register and every other register in the system.

A bus structure, on the other hand, is more efficient for transferring information between registers in a multi-register system.


A bus consists of a set of common lines, one for each bit of a register, through which binary information is transferred one register at a time. Control signals determine which register is selected by the bus during each register transfer.

The following block diagram shows a bus system for four registers. It is constructed with the help of four 4 * 1 multiplexers, each having four data inputs (0 through 3) and two selection inputs (S1 and S0).

The two selection lines S1 and S0 are connected to the selection inputs of all four multiplexers. The selection lines choose the four bits of one register and transfer them onto the four-line common bus.

When both select lines are at logic low, i.e. S1S0 = 00, the 0 data inputs of all four multiplexers are selected and applied to the outputs that form the bus. This, in turn, causes the bus lines to receive the content of register A, since the outputs of this register are connected to the 0 data inputs of the multiplexers.

Similarly, when S1S0 = 01, register B is selected, and the bus lines will receive the content provided
by register B.

The following function table shows the register that is selected by the bus for each of the four possible
binary values of the Selection lines.
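The function table's behavior can be sketched as follows (Python; the register contents are arbitrary illustrative values):

```python
def bus_select(registers, s1, s0):
    """Four-register common bus built from 4 * 1 multiplexers (sketch).

    registers holds the contents of A, B, C and D in order; the pair
    (S1, S0) selects which register's bits are placed on the common bus.
    """
    return registers[s1 * 2 + s0]

regs = [0b1010, 0b0011, 0b1111, 0b0000]   # contents of A, B, C, D
print(bin(bus_select(regs, 0, 0)))  # S1S0 = 00: register A on the bus
print(bin(bus_select(regs, 0, 1)))  # S1S0 = 01: register B on the bus
```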


A bus system can also be constructed using three-state gates instead of multiplexers. A three-state gate is a digital circuit that exhibits three states: two of them are signals equivalent to logic 1 and logic 0, as in a conventional gate, while the third is a high-impedance state. The three-state gate most commonly used in bus systems is the buffer gate.

The graphical symbol of a three-state buffer gate can be represented as:

The following diagram demonstrates the construction of a bus system with three-state buffers.


• The outputs generated by the four buffers are connected to form a single bus line.
• Only one buffer can be in the active state at a given time.
• The control inputs to the buffers determine which of the four normal inputs will communicate with the bus line.
• A 2 * 4 decoder ensures that no more than one control input is active at any given time.

Memory Transfer

Most of the standard notations used for specifying memory transfer operations are stated below.

• The transfer of information from a memory unit to the user end is called a Read operation.
• The transfer of new information to be stored in the memory is called a Write operation.
• A memory word is designated by the letter M.
• We must specify the address of the memory word when writing memory transfer operations.
• The address register is designated by AR and the data register by DR.
• Thus, a read operation can be stated as:

Read: DR ← M [AR]

• The Read statement causes a transfer of information into the data register (DR) from the
memory word (M) selected by the address register (AR).
• And the corresponding write operation can be stated as:

Write: M [AR] ← R1

• The Write statement causes a transfer of information from register R1 into the memory
word (M) selected by address register (AR).
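These two transfers can be mimicked with a tiny model (Python; the 4096-word size follows the memory described earlier, and the address and data values are illustrative):

```python
# A tiny model of the transfers  Read: DR <- M[AR]  and  Write: M[AR] <- R1.
M = [0] * 4096   # 4096-word memory, initially cleared
AR = 25          # address register selects word 25
R1 = 0x1234      # source register for the write
DR = 0           # data register

# Write: M[AR] <- R1
M[AR] = R1
# Read: DR <- M[AR]
DR = M[AR]

print(hex(DR))  # 0x1234
```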


1.7. Processor Organization

There are 3 types of processor organization:

1. Stored Program Organization
2. General Register Organization
3. Stack Organization

1. Stored Program Organization:


The simplest way to organize a computer is to have one processor register and an instruction code format with two parts. The first part specifies the operation to be performed and the second specifies an address. The memory address tells the control where to find an operand in memory. This operand is read from memory and used as the data to be operated on, together with the data stored in the processor register. Figure 5.1 depicts this type of organization. Instructions are stored in one section of memory and data in another. For a memory unit with 4096 words we need 12 bits to specify an address, since 2^12 = 4096. If we store each instruction code in one 16-bit memory word, we have four bits available for the operation code (abbreviated op code) to specify one out of 16 possible operations, and 12 bits to specify the address of an operand. The control reads a 16-bit instruction from the program portion of memory, uses the 12-bit address part of the instruction to read a 16-bit operand from the data portion of memory, and then executes the operation specified by the operation code.
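The 4-bit op code / 12-bit address split described above can be sketched as follows (Python; the sample instruction word is an arbitrary illustration):

```python
def decode(instruction):
    """Split a 16-bit instruction word into a 4-bit op code and a 12-bit address."""
    opcode = (instruction >> 12) & 0xF    # top 4 bits
    address = instruction & 0xFFF         # low 12 bits
    return opcode, address

# Op code 0b0010 (2) with operand address 0x1A5.
op, addr = decode(0b0010_0001_1010_0101)
print(op, hex(addr))  # 2 0x1a5
```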


Computers that have a single processor register usually assign to it the name accumulator and label it AC. The operation is performed with the memory operand and the content of AC.

2. General Register Organization


• Memory locations are needed for storing pointers, counters, return addresses, temporary
results, and partial products during multiplication.
• Having to refer to memory locations for such applications is time consuming, because memory access is the most time-consuming operation in a computer.
• It is more convenient and more efficient to store these intermediate values in processor
registers.
• When a large number of registers are included in the CPU, it is most efficient to connect
them through a common bus system. The registers communicate with each other not only
for direct data transfers, but also while performing various microoperations.
• Hence it is necessary to provide a common unit that can perform all the arithmetic, logic,
and shift microoperations in the processor.
• A bus organization for seven CPU registers is shown in Fig. 2. The output of each register is
connected to two multiplexers (MUX) to form the two buses A and B. The selection lines in
each multiplexer select one register or the input data for the particular bus.
• The A and B buses form the inputs to a common arithmetic logic unit (ALU).
• The operation selected in the ALU determines the arithmetic or logic micro-operation that is
to be performed.
• The result of the microoperation is available for output data and also goes into the inputs of
all the registers.
• The register that receives the information from the output bus is selected by a decoder.
• The decoder activates one of the register load inputs, thus providing a transfer path between
the data in the output bus and the inputs of the selected destination register.


• The control unit that operates the CPU bus system directs the information flow through the registers and ALU by selecting the various components in the system. For example, to perform the operation R1 ← R2 + R3, the control must provide binary selection variables to the following selector inputs:
• MUX A selector (SELA): to place the content of R2 onto bus A.
• MUX B selector (SELB): to place the content of R3 onto bus B.
• ALU operation selector (OPR): to provide the arithmetic addition A + B.
• Decoder destination selector (SELD): to transfer the content of the output bus into R1.
• The four control selection variables are generated in the control unit and must be available at
the beginning of a clock cycle.
• The data from the two source registers propagate through the gates in the multiplexers and
the ALU, to the output bus, and into the inputs of the destination register, all during the
clock cycle interval.
• Then, when the next clock transition occurs, the binary information from the output bus is
transferred into R1.
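As a rough illustration of those selection variables, one clock cycle of R1 ← R2 + R3 can be modeled like this (Python; the register values and the string encoding of the selectors are our own assumptions):

```python
# Sketch of one clock cycle of the bus-organized datapath for R1 <- R2 + R3.
# SELA/SELB pick the source registers, OPR picks the ALU function, and
# SELD picks the destination register loaded on the next clock transition.
regs = {"R1": 0, "R2": 30, "R3": 12}

SELA, SELB, OPR, SELD = "R2", "R3", "ADD", "R1"   # control selection variables

bus_a = regs[SELA]                  # content of R2 propagates onto bus A
bus_b = regs[SELB]                  # content of R3 propagates onto bus B
alu_out = bus_a + bus_b if OPR == "ADD" else None  # ALU performs A + B
regs[SELD] = alu_out                # loaded into R1 at the next clock transition

print(regs["R1"])  # 42
```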


• To achieve a fast response time, the ALU is constructed with high-speed circuits.

3. Stack Organization
A stack is a storage structure that stores information in such a way that the last item stored is the first item retrieved. It is based on the principle of LIFO (last in, first out). The stack in digital computers is a group of memory locations, together with a register that holds the address of the top element. This register is called the stack pointer.
The two operations on a stack are:
1. Push: inserts an item on top of the stack.
2. Pop: deletes the item on top of the stack.

Implementation of Stack
1. Register Stack
2. Memory Stack.
Register Stack

A stack can be organized as a collection of a finite number of registers that are used to store temporary
information during the execution of a program. The stack pointer (SP) is a register that holds the
address of the top element of the stack.

Memory Stack

A stack can be implemented in a random access memory (RAM) attached to a CPU. The
implementation of a stack in the CPU is done by assigning a portion of memory to stack operations
and using a processor register as the stack pointer. The starting memory location of the stack is
specified by the processor register used as the stack pointer.
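A minimal sketch of such a memory stack is shown below. The memory size and the convention that the stack grows toward lower addresses are assumptions made for the example; only the SP register and the push/pop behaviour come from the text.

```python
# Minimal memory-stack sketch: a block of memory plus a stack pointer (SP).
# Here the stack grows toward lower addresses, a common convention.

class MemoryStack:
    def __init__(self, size=16):
        self.memory = [0] * size
        self.sp = size                 # SP just past the top; stack is empty

    def push(self, value):
        if self.sp == 0:
            raise OverflowError("stack full")
        self.sp -= 1                   # decrement SP, then store at the new top
        self.memory[self.sp] = value

    def pop(self):
        if self.sp == len(self.memory):
            raise IndexError("stack empty")
        value = self.memory[self.sp]   # read the top element
        self.sp += 1                   # then increment SP past it
        return value

s = MemoryStack()
s.push(10)
s.push(20)
top = s.pop()    # LIFO: the last item pushed comes out first (20)
```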

1.8. Addressing Modes


The addressing modes help us specify the way in which an operand’s effective address is represented
in any given instruction. Some addressing modes allow referring to a large range of areas efficiently,
like some linear array of addresses along with a list of addresses. The addressing modes describe an
efficient and flexible way to define complex effective addresses.

The programs are generally written in high-level languages, as it’s a convenient way in which one
can define the variables along with the operations that a programmer performs on the variables. This
program is later compiled so as to generate the actual machine code. A machine code includes low-
level instructions.
A set of low-level instructions has operands and opcodes. An addressing mode has no relation with
the opcode part. It basically focuses on presenting the address of the operand in the instructions.

Addressing Modes Types

The addressing modes refer to how one can address a given memory location. Several different
addressing modes exist for doing this.
You can find the list below, showing the various kinds of addressing modes:

• Implied Mode
• Immediate Mode
• Register Mode
• Register Indirect Mode
• Autodecrement Mode
• Autoincrement Mode
• Direct Address Mode
• Indirect Address Mode
• Indexed Addressing Mode
Before getting into discussing the addressing modes, one must understand more about the “effective
address” term.

Effective Address (EA)


The effective address refers to the address of an exact memory location in which an operand’s value
is actually present. Let us now explain all of the addressing modes.

Implied Mode
In the implied mode, the operands are implicitly specified in the definition of instruction. For instance,
the “complement accumulator” instruction refers to an implied-mode instruction. It is because, in the
definition of the instruction, the operand is implied in the accumulator register. All the register
reference instructions are implied-mode instructions that use an accumulator.

Immediate Mode
In the immediate mode, we specify the operand in the instruction itself. Or, in simpler words, instead
of an address field, the immediate-mode instruction consists of an operand field. An operand field
contains the actual operand that is to be used in conjunction with an operation that is determined in
the given instruction. The immediate-mode instructions help initialize registers to a certain constant
value.

Register Mode
In the register mode, the operands exist in those registers that reside within a CPU. In this case, we
select a specific register from a certain register field in the given instruction. A k-bit field can
select one of 2^k registers.

Register Indirect Mode


In the register indirect mode, the instruction available to us defines that particular register in the CPU
whose contents provides the operand’s address in the memory. In simpler words, any selected register
would include the address of an operand instead of the operand itself.
The reference to a register is equivalent to specifying a memory address. The advantage of this
type of instruction is that an instruction’s address field requires fewer bits to select a
register than would be required to specify a memory address directly.

Autodecrement or the Autoincrement Mode


The Autodecrement or Autoincrement mode is very similar to the register indirect mode. The only
exception is that the register is decremented or incremented before or after its value is used to access
memory. When the address stored in the register refers to a table of data in memory, it is very convenient
to increment or decrement the register after every access to the table; these modes achieve that
without a separate decrement or increment instruction.

Direct Address Mode


In the direct address mode, the address part of the instruction is equal to the effective address. The
operand would reside in memory, and the address here is given directly by the instruction’s address
field. The address field would specify the actual branch address in a branch-type instruction.

Indirect Address Mode


In an indirect address mode, the address field of an available instruction gives that address in which
the effective address gets stored in memory. The control fetches the instruction available in the
memory and then uses its address part in order to (again) access memory to read its effective address.

Indexed Addressing Mode


In the indexed addressing mode, the content of a given index register gets added to an instruction’s
address part so as to obtain the effective address. Here, the index register refers to a special CPU
register that consists of an index value. An instruction’s address field defines the beginning address
of any data array present in memory.
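The modes above differ only in how the effective address (EA) is formed. The sketch below illustrates a few of them; the register file, memory contents, and field names are invented for the example and do not belong to any particular machine.

```python
# Illustrative sketch of effective-address (EA) formation for several of the
# addressing modes described above.

def effective_address(mode, addr_field, regs, memory):
    if mode == "direct":
        return addr_field                    # EA = address field itself
    if mode == "indirect":
        return memory[addr_field]            # EA is stored at the address field
    if mode == "register_indirect":
        return regs[addr_field]              # EA is held in the named register
    if mode == "indexed":
        return addr_field + regs["XR"]       # EA = address field + index register
    raise ValueError("unknown mode")

regs = {"R1": 30, "XR": 4}
memory = {20: 50, 30: 7, 50: 9, 104: 3}

ea_direct = effective_address("direct", 20, regs, memory)                # 20
ea_indirect = effective_address("indirect", 20, regs, memory)            # 50
ea_reg_ind = effective_address("register_indirect", "R1", regs, memory)  # 30
ea_indexed = effective_address("indexed", 100, regs, memory)             # 104
```

Implied and immediate modes are omitted because they need no EA at all: the operand is implicit in the opcode or carried in the instruction itself.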

Unit-II
Arithmetic and logic unit
2.1. Carry Look Ahead Adders

A digital computer must contain circuits which can perform arithmetic operations such as addition,
subtraction, multiplication, and division. Among these, addition and subtraction are the basic
operations, whereas multiplication and division can be performed as repeated addition and subtraction,
respectively. To perform these operations, ‘Adder circuits’ are implemented using basic logic gates.
Adder circuits have evolved from the Half-adder and Full-adder to the Ripple-carry Adder and Carry Look-ahead Adder.

Among these Carry Look-ahead Adder is the faster adder circuit. It reduces the propagation delay,
which occurs during addition, by using more complex hardware circuitry. It is designed by
transforming the ripple-carry Adder circuit such that the carry logic of the adder is changed into two-
level logic.

In parallel adders, the carry output of each full adder is given as a carry input to the next higher-order
stage. Hence, in these adders it is not possible to produce the carry and sum outputs of any stage until
a carry input is available for that stage.

So, for the computation to occur, the circuit has to wait until the carry bit has propagated to all stages.
This induces a carry propagation delay in the circuit.

Consider the 4-bit ripple carry adder circuit above. Here the sum S3 can be produced as soon as the
inputs A3 and B3 are given. But carry C3 cannot be computed until the carry bit C2 is applied, whereas
C2 depends on C1. Therefore, to produce the final steady-state results, the carry must propagate through
all the stages. This increases the carry propagation delay of the circuit.

The propagation delay of the adder is calculated as “the propagation delay of each gate times the
number of stages in the circuit”. For the computation of a large number of bits, more stages have to
be added, which makes the delay much worse. Hence, to solve this situation, Carry Look-ahead Adder
was introduced.

To understand the functioning of a Carry Look-ahead Adder, a 4-bit Carry Look-ahead Adder is
described below.

In this adder, the carry input at any stage of the adder is independent of the carry bits generated at the
intermediate stages. Here the output of any stage is dependent only on the bits which are added in the
previous stages and the carry input provided at the beginning stage. Hence, the circuit at any stage
does not have to wait for the generation of the carry bit from the previous stage, and the carry bit can be
evaluated at any instant of time.

Truth Table of Carry Look-ahead Adder

For deriving the truth table of this adder, two new terms are introduced – carry generate and carry
propagate. Carry generate Gi = 1 whenever there is a carry Ci+1 generated. It depends on the Ai and Bi
inputs: Gi is 1 when both Ai and Bi are 1. Hence, Gi is calculated as Gi = Ai.Bi.
Carry propagate Pi is associated with the propagation of the carry from Ci to Ci+1. It is calculated as Pi
= Ai ⊕ Bi. Using the Gi and Pi terms, the Sum Si and Carry Ci+1 are given as below –
Si = Pi ⊕ Ci.
Ci+1 = Ci.Pi + Gi.
Therefore, the carry bits C1, C2, C3, and C4 can be calculated as
C1 = C0.P0+G0.
C2 = C1.P1+G1 = ( C0.P0+G0).P1+G1.
C3 = C2.P2+G2 = (C1.P1+G1).P2+G2.
C4 = C3.P3+G3 = C0.P0.P1.P2.P3 + P3.P2.P1.G0 + P3.P2.G1 + G2.P3 + G3.

It can be observed from the equations that each carry Ci+1 depends only on the initial carry C0, not on the
intermediate carry bits. Since Gi = Ai.Bi and Pi = Ai ⊕ Bi, the truth table of this adder can be derived by
modifying the truth table of a full adder.
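The generate/propagate equations can be checked with a short sketch. The list-of-bits representation (LSB first) is a convenience for the example; the equations Gi = Ai·Bi, Pi = Ai ⊕ Bi, Ci+1 = Gi + Pi·Ci, and Si = Pi ⊕ Ci are the ones derived above.

```python
# Sketch of the 4-bit carry look-ahead equations: every carry is computed from
# the input bits and C0 alone, without rippling through earlier stages.

def cla_4bit(a_bits, b_bits, c0=0):
    """a_bits, b_bits: lists [A0..A3], LSB first. Returns (sum_bits, c4)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # carry generate Gi = Ai.Bi
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # carry propagate Pi = Ai xor Bi
    c = [c0]
    for i in range(4):
        c.append(g[i] | (p[i] & c[i]))            # Ci+1 = Gi + Pi.Ci
    s = [p[i] ^ c[i] for i in range(4)]           # Si = Pi xor Ci
    return s, c[4]

# 6 + 7 = 13: A = 0110 and B = 0111, written LSB first below
s, c4 = cla_4bit([0, 1, 1, 0], [1, 1, 1, 0])
# s == [1, 0, 1, 1] (i.e. 1101 binary = 13), c4 == 0
```

In hardware the loop disappears: each Ci+1 is expanded into a two-level AND-OR expression of C0, the Gs, and the Ps, exactly as in the C1..C4 equations above.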

Circuit Diagram

The above equations are implemented using two-level combinational circuits along with AND, OR
gates, where gates are assumed to have multiple inputs.

The Carry Look-ahead Adder circuit for 4-bit is given below

Advantages of Carry Look-ahead Adder

In this adder, the propagation delay is reduced. The carry output at any stage is dependent only on
the initial carry bit of the beginning stage. Using this adder it is possible to calculate the
intermediate results. This adder is the fastest adder used for computation.

2.2. Multiplication Algorithm:

• Multiplication of two fixed-point binary numbers in signed-magnitude representation is done by a process
of successive shift and add operations. This process is best illustrated with a numerical example.

2.2.1. Hardware Implementation for Signed-Magnitude Data Multiplication

Following components are required for the Hardware Implementation of multiplication algorithm :

Flowchart of Multiplication:

• Initially, the multiplicand is stored in register B and the multiplier in register Q.
• The signs of registers B (Bs) and Q (Qs) are compared using an XOR function (i.e., if both signs are
alike the output of the XOR operation is 0, otherwise 1) and the output is stored in As (the sign of
register A). Note: Initially 0 is assigned to register A and to the E flip-flop. The sequence counter is
initialized with the value n, where n is the number of bits in the multiplier.

• Now the least significant bit of the multiplier (Qn) is checked. If it is 1, the content of register A is
added to the multiplicand (register B) and the result is assigned to register A, with the carry bit in
flip-flop E. The content of E A Q is then shifted right by one position, i.e., the content of E is shifted
into the most significant bit (MSB) of A and the least significant bit of A is shifted into the most
significant bit of Q.
• If Qn = 0, only the shift-right operation on the content of E A Q is performed, in a similar fashion.
The content of the sequence counter is decremented by 1.
• Check the content of the sequence counter (SC): if it is 0, end the process; the final product is
present in registers A and Q. Otherwise, repeat the process.
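The register-level steps above can be sketched as follows. The magnitudes are modelled as plain integers and the sign handling (As = Bs ⊕ Qs) is left out, as the text treats it separately; register widths and the E carry flip-flop follow the description.

```python
# Sketch of signed-magnitude shift-and-add multiplication, modelling registers
# A, B (multiplicand), Q (multiplier), carry flip-flop E, and a counter of n bits.

def shift_add_multiply(multiplicand, multiplier, n):
    """Multiply two n-bit unsigned magnitudes; the product sits in the pair A:Q."""
    a, e = 0, 0
    b, q = multiplicand, multiplier
    for _ in range(n):                 # SC counts down from n
        if q & 1:                      # Qn = 1: add the multiplicand
            total = a + b
            e = (total >> n) & 1       # carry out of the n-bit add goes to E
            a = total & ((1 << n) - 1)
        # shift E A Q right one position
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (e << (n - 1))
        e = 0
    return (a << n) | q                # 2n-bit product from A:Q

product = shift_add_multiply(9, 13, 4)   # 9 x 13 = 117
```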

2.2.2. Booth’s Algorithm for Signed Magnitude Data

Booth's algorithm is a fast multiplication algorithm that is used to multiply two signed binary
numbers. It is a sequential algorithm that uses two's-complement representation and shift-and-add (or
subtract) operations to perform multiplication. The algorithm is named after Andrew Donald Booth, who
developed it in 1951.

The basic idea behind Booth's algorithm is to take advantage of the two's-complement representation of
signed binary numbers to perform multiplication efficiently. The algorithm examines the multiplier one bit
at a time (together with the previously examined bit), shifting the intermediate result and adding or
subtracting the multiplicand as necessary. The result of the multiplication is accumulated in a register.

Here is an example of how Booth's algorithm works for the multiplication of two signed numbers:
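The textbook register-level form of Booth's algorithm (registers A, Q, and the extra bit Q−1) can be sketched like this; n-bit two's-complement operands are assumed, and the bit-pair (Q0, Q−1) selects add, subtract, or no-op before each arithmetic right shift.

```python
# Sketch of Booth's multiplication for n-bit two's-complement operands.

def booth_multiply(m, q, n):
    """m, q: signed integers representable in n bits. Returns m * q."""
    mask = (1 << n) - 1
    a, q_reg, q_1 = 0, q & mask, 0
    m_reg = m & mask
    for _ in range(n):
        pair = (q_reg & 1, q_1)
        if pair == (1, 0):
            a = (a - m_reg) & mask        # 10: A <- A - M
        elif pair == (0, 1):
            a = (a + m_reg) & mask        # 01: A <- A + M
        # arithmetic right shift of A:Q:Q-1, preserving the sign bit of A
        q_1 = q_reg & 1
        q_reg = (q_reg >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & (1 << (n - 1)))
    result = (a << n) | q_reg
    if result & (1 << (2 * n - 1)):       # interpret the 2n-bit product as signed
        result -= 1 << (2 * n)
    return result

p = booth_multiply(7, -3, 4)    # -21
```

Runs of 1s in the multiplier produce exactly one subtraction and one addition, which is where the speed-up over plain shift-and-add comes from.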

Array Multiplier:

An array multiplier is a digital circuit that multiplies two binary numbers using an array of adders and
gates. Array multipliers are used in various applications, including digital signal processing, image
processing, and cryptography. Array multipliers can be implemented in several ways, including using
conventional digital circuits such as full adders and partial-product generators, or using specialized
hardware such as field-programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs).

Shift and Add Multiplier

It is similar to the normal multiplication process which we do in mathematics. In the shift-and-add flow
chart, X = multiplicand, Y = multiplier, A = accumulator, and Q holds the multiplier bits. First the least
significant bit of Q is checked. If it is 1, X is added to A and A_Q is shifted arithmetically right; if it
is 0, A_Q is shifted arithmetically right directly. N is then decremented by 1, and in the next step N is
checked: if N is not 0, the process repeats from the Q-bit check; otherwise the process terminates.

Construction and Working of a 4×4 Array Multiplier

The design structure of the array multiplier is regular; it is based on the add-and-shift algorithm principle.

Partial product = the multiplicand × one multiplier bit

AND gates are used to form the partial products, and the summation is done using Full Adders and Half
Adders, where the partial products are shifted according to their bit orders. In an n×n array multiplier,
n×n AND gates compute the partial products, and the addition of the partial products can be performed by
using n × (n – 2) Full adders and n Half adders. The 4×4 array multiplier shown has 8 inputs and 8 outputs.

Advantages of 4×4 Array Multiplier:

• Minimum complexity
• Easily scalable
• Easily pipelined
• Regular shape, easy to place and route

Disadvantages of 4×4 Array Multiplier

• High power consumption


• More digital gates resulting in large areas.

Applications of 4×4 Array Multiplier

• Array multiplier is used to perform the arithmetic operation, like filtering, Fourier transform,
image coding.
• High-speed operation.

2.3. Division Algorithm

A division algorithm provides a quotient and a remainder when we divide two numbers. Division
algorithms are generally of two types: slow algorithms and fast algorithms. Slow division algorithms
include restoring, non-restoring, non-performing restoring, and SRT division; fast algorithms include
Newton–Raphson and Goldschmidt.

Here we describe the restoring algorithm for unsigned integers. The term "restoring" refers to the
fact that the value of register A is restored after each iteration.

Register Q contains the quotient and register A contains the remainder. The n-bit dividend is loaded
into Q and the divisor is loaded into M. Register A is initially set to 0, and it is this register whose
value is restored during the iterations, which is why the method is named Restoring.

Step-1: First the registers are initialized with the corresponding values (Q = Dividend, M = Divisor,
A = 0, n = number of bits in the dividend)

Step-2: Then the contents of registers A and Q are shifted left as if they were a single unit

Step-3: Then the content of register M is subtracted from A and the result is stored in A

Step-4: Then the most significant bit of A is checked: if it is 0, the least significant bit of Q is set
to 1; otherwise, if it is 1, the least significant bit of Q is set to 0 and the value of register A is
restored, i.e. to the value of A before the subtraction of M

Step-5: The value of counter n is decremented

Step-6: If the value of n becomes zero, we get out of the loop; otherwise we repeat from step 2

Step-7: Finally, register Q contains the quotient and A contains the remainder
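The steps above can be sketched directly; A:Q is modelled as two integers, with the trial subtraction and restore step exactly as described.

```python
# Sketch of restoring division for unsigned n-bit operands: A:Q is shifted
# left, M is subtracted from A, and A is restored when the trial goes negative.

def restoring_divide(dividend, divisor, n):
    """Returns (quotient, remainder)."""
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        # Step 2: shift A:Q left as a single unit
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & ((1 << n) - 1)
        a -= m                 # Step 3: trial subtraction
        if a < 0:
            q &= ~1            # Step 4: A went negative -> Q0 = 0, restore A
            a += m
        else:
            q |= 1             # A stayed non-negative -> Q0 = 1
    return q, a                # Step 7: quotient in Q, remainder in A

quotient, remainder = restoring_divide(11, 3, 4)   # 11 / 3 = 3 remainder 2
```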

2.4. Floating Point Arithmetic Operations:

Arithmetic operations on floating point numbers consist of addition, subtraction, multiplication and
division. The operations are done with algorithms similar to those used on sign-magnitude integers
(because of the similarity of representation). For example, we can only add numbers of the same sign;
if the numbers are of opposite sign, we must perform a subtraction.
ADDITION
Example on decimal value given in scientific notation:
3.25 x 10 ** 3
+ 2.63 x 10 ** -1
first step: align decimal points
second step: add
3.25 x 10 ** 3
+ 0.000263 x 10 ** 3
3.250263 x 10 ** 3 (presumes use of infinite precision, without regard for accuracy)
third step: normalize the result (already normalized!)
Example on floating pt. value given in binary:
.25 = 0 01111101 00000000000000000000000
100 = 0 10000101 10010000000000000000000
To add these fl. pt. representations,
step 1: align radix points
shifting the mantissa left by 1 bit decreases the exponent by 1

shifting the mantissa right by 1 bit increases the exponent by 1


we want to shift the mantissa right, because the bits that fall off the end should come
from the least significant end of the mantissa
-> choose to shift the .25, since we want to increase its exponent.
-> shift by 10000101
-01111101

00001000 (8) places.


0 01111101 00000000000000000000000 (original value)
0 01111110 10000000000000000000000 (shifted 1 place)
(note that hidden bit is shifted into msb of mantissa)
0 01111111 01000000000000000000000 (shifted 2 places)
0 10000000 00100000000000000000000 (shifted 3 places)
0 10000001 00010000000000000000000 (shifted 4 places)
0 10000010 00001000000000000000000 (shifted 5 places)
0 10000011 00000100000000000000000 (shifted 6 places)
0 10000100 00000010000000000000000 (shifted 7 places)
0 10000101 00000001000000000000000 (shifted 8 places)
step 2: add (don’t forget the hidden bit for the 100)
0 10000101 1.10010000000000000000000 (100)
+ 0 10000101 0.00000001000000000000000 (.25)
0 10000101 1.10010001000000000000000
step 3: normalize the result (get the “hidden bit” to be a 1) It already is for this
example.
result is 0 10000101 10010001000000000000000
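The three steps of the worked example (align, add, normalize) can be sketched with a simplified model: each operand is an (exponent, mantissa) pair meaning 1.f × 2^exp, where the mantissa is a 24-bit integer that already includes the hidden bit. Signs and rounding are ignored to keep the sketch short; those are assumptions of this model, not part of the example above.

```python
# Sketch of floating-point addition: align radix points, add mantissas,
# normalize. Mantissas are 24-bit integers with the hidden bit at bit 23.

HIDDEN = 1 << 23            # hidden bit of a normalized 24-bit mantissa

def fp_add(exp_a, man_a, exp_b, man_b):
    # step 1: align -- shift the mantissa of the smaller exponent right,
    # increasing its exponent to match the larger one
    if exp_a < exp_b:
        man_a >>= (exp_b - exp_a)
        exp_a = exp_b
    else:
        man_b >>= (exp_a - exp_b)
    # step 2: add mantissas (the hidden bits are added along with the rest)
    man, exp = man_a + man_b, exp_a
    # step 3: normalize -- if the sum carried past bit 24, shift right once
    if man >= HIDDEN << 1:
        man >>= 1
        exp += 1
    return exp, man

# 100 = 1.1001 x 2**6 and 0.25 = 1.0 x 2**-2, as in the binary example above
exp, man = fp_add(6, 0b110010000000000000000000, -2, HIDDEN)
value = man / HIDDEN * 2 ** exp    # 100.25
```

Note how the 8-place right shift of the 0.25 mantissa reproduces the alignment worked out bit by bit in the example.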

SUBTRACTION

Subtraction is the same as addition as far as the alignment of radix points; then the algorithm for
subtraction of sign-magnitude numbers takes over.

Before subtracting,

compare magnitudes (don’t forget the hidden bit!)

change the sign bit if the order of operands is changed.
Don’t forget to normalize the number afterward.
MULTIPLICATION

Example on decimal values given in scientific notation:

3.0 x 10 ** 1

x 0.5 x 10 ** 2

Algorithm: multiply mantissas

add exponents

3.0 x 10 ** 1

x 0.5 x 10 ** 2

1.50 x 10 ** 3
Example in binary: Consider a mantissa that is only 4 bits.

0 10000100 0100

x 1 00111100 1100

DIVISION

It is similar to multiplication.

Do unsigned division on the mantissas (don’t forget the hidden bit),

and subtract the TRUE exponents.


Flowchart :

2.5. IEEE standard for Floating Point Numbers :

The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-
point computation which was established in 1985 by the Institute of Electrical and Electronics
Engineers (IEEE). The standard addressed many problems found in the diverse floating point
implementations that made them difficult to use reliably and reduced their portability. IEEE
Standard 754 floating point is the most common representation today for real numbers on
computers, including Intel-based PCs, Macs, and most Unix platforms.

There are several ways to represent a floating point number, but IEEE 754 is the most efficient in most
cases. IEEE 754 has 3 basic components:

The Sign of Mantissa – This is as simple as the name. 0 represents a positive number while 1
represents a negative number.

The Biased exponent – The exponent field needs to represent both positive and negative exponents.
A bias is added to the actual exponent in order to get the stored exponent.

The Normalised Mantissa – The mantissa is the part of a number in scientific notation or a floating-
point number consisting of its significant digits. Here we have only 2 digits, i.e. 0 and 1. So a
normalised mantissa is one with only a single 1 to the left of the binary point.

IEEE 754 numbers are divided into two formats based on the above three components: single precision
and double precision.

Example –

85.125

85 = 1010101

0.125 = .001

85.125 = 1010101.001

=1.010101001 x 2^6

sign = 0

1. Single precision:

biased exponent 127+6=133

133 = 10000101

Normalised mantissa = 010101001

we will add 0's to complete the 23 bits

The IEEE 754 Single precision is:

= 0 10000101 01010100100000000000000

This can be written in hexadecimal form 42AA4000

2. Double precision:

biased exponent 1023+6=1029

1029 = 10000000101

Normalised mantissa = 010101001

we will add 0's to complete the 52 bits

The IEEE 754 Double precision is:

= 0 10000000101 0101010010000000000000000000000000000000000000000000

This can be written in hexadecimal form 4055480000000000
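The two hexadecimal results above can be verified with Python's `struct` module, which packs a float into its IEEE 754 single- or double-precision bit pattern.

```python
# Verify the 85.125 worked example against the machine's IEEE 754 encodings.
import struct

single_hex = struct.pack(">f", 85.125).hex().upper()   # big-endian binary32
double_hex = struct.pack(">d", 85.125).hex().upper()   # big-endian binary64

print(single_hex)   # 42AA4000
print(double_hex)   # 4055480000000000
```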

2.6. ALU Design:

The Arithmetic Logic Shift Unit (ALSU) is a digital circuit that performs logical, arithmetic, and
shift operations. Rather than having individual registers calculate the micro-operations directly,
the computer deploys a number of storage registers connected to a common operational
unit known as an arithmetic logic unit, or ALU.

We can combine arithmetic, logic, and shift circuits to make one ALU with common selection variables.
One stage of an arithmetic logic shift unit is shown in the diagram below. A particular
micro-operation is selected through the inputs S1 and S0.

A 4 x 1 multiplexer at the output chooses between an arithmetic output Ei and a logic
output Hi. The data in the multiplexer are selected through inputs S3 and S2, and the other two
data inputs to the multiplexer receive the inputs Ai – 1 for the shr operation and Ai + 1 for
the shl operation.

Note: The output carry Ci + 1 of a specified arithmetic stage must be attached to the input carry Ci
of the next stage in the sequence.

The circuit whose one stage is given in the below diagram provides 8 arithmetic operations, 4 logic
operations, and 2 shift operations, and Each operation is selected by the 5 variables S3, S2, S1, S0,
and Cin.

The below table shows the 14 operations perform by the Arithmetic Logic Unit:

The first 8 are arithmetic operations which are selected by S3 S2 = 00

The next 4 are logic operations which are selected by S3 S2 = 01

The last two are shift operations which are selected by S3 S2 = 10 & 11
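The S3 S2 selection scheme can be sketched behaviourally. The specific arithmetic and logic functions shown are illustrative examples only; the full table maps 8 arithmetic, 4 logic, and 2 shift operations onto these selector codes.

```python
# Behavioural sketch of one ALSU selection level: S3 S2 = 00 selects an
# arithmetic result, 01 a logic result, 10 shift-right, 11 shift-left.

def alsu(s3, s2, a, b, cin=0, width=8):
    mask = (1 << width) - 1
    sel = (s3 << 1) | s2
    if sel == 0b00:
        return (a + b + cin) & mask    # one of the 8 arithmetic operations
    if sel == 0b01:
        return a & b                   # one of the 4 logic operations
    if sel == 0b10:
        return a >> 1                  # shr: shift right
    return (a << 1) & mask             # shl: shift left

r_add = alsu(0, 0, 12, 5)              # 17
r_and = alsu(0, 1, 0b1100, 0b1010)     # 0b1000
r_shr = alsu(1, 0, 8, 0)               # 4
r_shl = alsu(1, 1, 8, 0)               # 16
```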

Unit-III
Control Unit
3.1. Instruction Types
The basic computer has a 16-bit instruction register (IR) which can denote either a memory-reference,
a register-reference, or an input-output instruction.
Memory Reference – These instructions refer to a memory address as an operand. The other operand
is always the accumulator. The format specifies a 12-bit address, a 3-bit opcode (other than 111) and a
1-bit addressing mode for direct and indirect addressing.

Register Reference – These instructions perform operations on registers rather than memory
addresses. The IR(14 – 12) is 111 (differentiates it from memory reference) and IR(15) is 0
(differentiates it from input/output instructions). The rest 12 bits specify register operation.

Input/Output – These instructions are for communication between computer and outside
environment. The IR(14 – 12) is 111 (differentiates it from memory reference) and IR(15) is 1
(differentiates it from register reference instructions). The rest 12 bits specify I/O operation.

3.2. Instruction Formats


A computer performs a task based on the instructions provided. An instruction in a computer comprises
groups called fields. These fields contain different information; as for computers everything is in 0s
and 1s, each field has a different significance, based on which the CPU decides what to perform. The
most common fields are:
• Operation field specifies the operation to be performed like addition.
• Address field which contains the location of the operand, i.e., register or memory location.
• Mode field which specifies how the operand is to be found.

Instructions are of variable length depending upon the number of addresses they contain. Generally,
CPU organization is of three types based on the number of address fields:
• Single Accumulator organization
• General register organization
• Stack organization
In the first organization, operations are done involving a special register called the accumulator.
In the second, multiple registers are used for the computation. The third organization works on a
stack basis, due to which its instructions do not contain any address field. A single organization
need not be applied exclusively; a blend of various organizations is what we mostly see in practice.
Based on the number of addresses, instructions are classified as:
Note that we will use X = (A+B)*(C+D) expression to showcase the procedure.
i. Zero Address Instructions
A stack-based computer does not use the address field in the instruction. To evaluate an expression
first it is converted to reverse Polish Notation i.e. Postfix Notation.
Expression: X = (A+B)*(C+D)
Postfixed : X = AB+CD+*
TOP means top of stack
M[X] is any memory location
PUSH A TOP = A
PUSH B TOP = B
ADD TOP = A+B
PUSH C TOP = C
PUSH D TOP = D
ADD TOP = C+D
MUL TOP = (C+D)*(A+B)
POP X M[X] = TOP
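The PUSH/ADD/MUL/POP trace above can be sketched as a tiny stack-machine interpreter. The operand values in `env` are invented for the example.

```python
# Sketch of a zero-address (stack) machine evaluating the postfix form
# AB+CD+* : operands are pushed; each operator pops two values and pushes
# its result; the final TOP is the value stored by POP X.

def eval_postfix(tokens, env):
    stack = []
    for tok in tokens:
        if tok == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:                      # PUSH of a named operand
            stack.append(env[tok])
    return stack.pop()             # POP X would store this TOP into M[X]

env = {"A": 1, "B": 2, "C": 3, "D": 4}
x = eval_postfix(["A", "B", "ADD", "C", "D", "ADD", "MUL"], env)  # (1+2)*(3+4)
```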

ii. One Address Instructions –


This uses an implied ACCUMULATOR register for data manipulation. One operand is in the
accumulator and the other is in the register or memory location. Implied means that the CPU already
knows that one operand is in the accumulator so there is no need to specify it.

Expression: X = (A+B)*(C+D)

AC is accumulator
M[] is any memory location
M[T] is temporary location
LOAD A AC = M[A]
ADD B AC = AC + M[B]
STORE T M[T] = AC
LOAD C AC = M[C]
ADD D AC = AC + M[D]
MUL T AC = AC * M[T]
STORE X M[X] = AC
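The one-address sequence above can be sketched as a tiny accumulator-machine interpreter; every instruction implicitly uses AC, as described. The memory contents are invented for the example.

```python
# Sketch of a one-address (accumulator) machine running the LOAD/ADD/STORE/MUL
# sequence above for X = (A+B)*(C+D).

def run_one_address(program, memory):
    ac = 0                       # the implied accumulator
    for op, addr in program:
        if op == "LOAD":
            ac = memory[addr]
        elif op == "ADD":
            ac += memory[addr]
        elif op == "MUL":
            ac *= memory[addr]
        elif op == "STORE":
            memory[addr] = ac
    return memory

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "T": 0, "X": 0}
program = [("LOAD", "A"), ("ADD", "B"), ("STORE", "T"),
           ("LOAD", "C"), ("ADD", "D"), ("MUL", "T"), ("STORE", "X")]
run_one_address(program, memory)    # X = (1+2)*(3+4) = 21
```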

iii. Two Address Instructions –


This is common in commercial computers. Here two addresses can be specified in the instruction.
Unlike the earlier one-address instructions, where the result was stored in the accumulator, here the
result can be stored at different locations rather than just in the accumulator, but this requires a
greater number of bits to represent the addresses.
Here the destination address can also contain an operand.
Expression: X = (A+B)*(C+D)
R1, R2 are registers
M[] is any memory location
MOV R1, A R1 = M[A]
ADD R1, B R1 = R1 + M[B]
MOV R2, C R2 = M[C]
ADD R2, D R2 = R2 + M[D]
MUL R1, R2 R1 = R1 * R2
MOV X, R1 M[X] = R1

iv. Three Address Instructions –


This has three address fields to specify a register or a memory location. Programs created are much
shorter in size, but the number of bits per instruction increases. These instructions make the creation
of programs much easier, but it does not mean that programs will run much faster, because now each
instruction merely contains more information; each micro operation (changing the content of a register,
loading an address on the address bus, etc.) is still performed in one cycle.

Expression: X = (A+B)*(C+D)
R1, R2 are registers
M[] is any memory location
ADD R1, A, B R1 = M[A] + M[B]
ADD R2, C, D R2 = M[C] + M[D]
MUL X, R1, R2 M[X] = R1 * R2

3.3. Instruction Cycle and Sub Cycles


A program residing in the memory unit of a computer consists of a sequence of instructions. These
instructions are executed by the processor by going through a cycle for each instruction.
Instruction Cycle is subdivided into the following subcycles:
1. Fetch instruction from memory. (At the beginning of the fetch cycle, the address of the next
instruction to be executed is in the Program Counter (PC).)
2. Decode the instruction. (The decoder circuit examines the opcode of the instruction; the result is
the selection of a unique decoder output line.)
3. Read the effective address from memory.
4. Execute the instruction. (The microcode for the instruction, selected by the decoder output line, is
executed by the ALU.)

State Diagram for Instruction Cycle

Instruction Address Calculation − The address of the next instruction is computed by adding a fixed
number to the address of the previous instruction.
Instruction Fetch − The instruction is read from its specific memory location into the processor.
Instruction Operation Decoding − The instruction is interpreted, and the type of operation to be
performed and the operand(s) to be used are determined.
Operand Address Calculation − The address of the operand is calculated if the instruction references an
operand in memory or one available through Input/Output.
Operand Fetch − The operand is read from the memory or the I/O.
Data Operation − The actual operation that the instruction contains is executed.
Store Operands − The result acquired is stored in the memory or transferred to the I/O.

Complete Computer Operation Flowchart:

You may be wondering how the central processing unit is programmed. It contains a special register
— the instruction register — whose bit pattern determines what the central processing unit will do.
Once that action has been completed, the bit pattern in the instruction register may be modified,
and the central processing unit will then perform the operation specified by this next bit pattern.
Since instructions are simply bit patterns, they can be kept in memory. The instruction pointer
register always holds the memory address of (points to) the next instruction to be executed. In
order for the control unit to execute this instruction, it is copied into the instruction register.
The situation is as follows:
A sequence of instructions is stored in memory.
1. The memory address where the first instruction is found is copied to the instruction
pointer.
2. The CPU sends the address in the instruction pointer to memory on the address bus.
3. The CPU sends a "read" signal on the control bus.
4. Memory responds by sending a copy of the state of the bits at that memory location on the
data bus, which the CPU then copies into its instruction register.
5. The instruction pointer is automatically incremented to contain the address of the next
instruction in memory.
6. The CPU executes the instruction in the instruction register.
7. Go to step 2.
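The fetch-execute loop above can be sketched as a toy simulation. The three-instruction machine below (LOAD, ADD, HALT) and its opcode names are invented purely for illustration, not a real ISA:

```python
# Toy fetch-decode-execute loop: memory holds (opcode, operand) pairs.
# Hypothetical ISA for illustration: LOAD n (ACC = n), ADD n (ACC += n), HALT.

def run(memory):
    ip = 0          # instruction pointer: address of the next instruction
    acc = 0         # accumulator
    while True:
        ir = memory[ip]        # fetch: copy the instruction into the instruction register
        ip += 1                # instruction pointer auto-increments
        op, arg = ir           # decode
        if op == "LOAD":       # execute
            acc = arg
        elif op == "ADD":
            acc += arg
        elif op == "HALT":
            return acc

program = [("LOAD", 5), ("ADD", 3), ("HALT", 0)]
print(run(program))   # -> 8
```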

3.4. Micro-operations
Operation of a computer consists of a sequence of instruction cycles, with one machine instruction
per cycle. Each instruction cycle is made up of a number of smaller units – the Fetch, Indirect,
Execute and Interrupt cycles. Each of these cycles involves a series of steps, each of which involves
the processor registers. These steps are referred to as micro-operations. The prefix "micro" refers
to the fact that each step is very simple and accomplishes very little. The figure below depicts the
concept being discussed here.


3.5. Execution of a Complete Instructions:


We have discussed four different types of basic operations:
• Fetch information from memory to the CPU
• Store information from a CPU register to memory
• Transfer data between CPU registers
• Perform an arithmetic or logic operation and store the result in CPU registers

To execute a complete instruction, we combine these basic operations and execute them in a
particular order.
As for example, consider the instruction: “Add contents of memory location NUM to the contents
of register R1 and store the result in register R1.” For simplicity, assume that the address NUM is
given explicitly in the address field of the instruction. That is, in this instruction, direct addressing
mode is used.
Execution of this instruction requires the following actions:
1. Fetch instruction
2. Fetch first operand (contents of the memory location pointed at by the address field of the
instruction)
3. Perform addition
4. Load the result into R1.
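As a sketch, the four actions can be simulated in Python. The register contents, the address NUM and the memory value below are illustrative; this mirrors the direct-addressing example in the text, not any particular datapath:

```python
# Simulating "Add contents of memory location NUM to R1" as the four
# actions listed above. NUM and the memory contents are made up.

def execute_add(memory, num_addr, r1):
    mar = num_addr          # 2. operand address placed in the MAR (direct addressing)
    mdr = memory[mar]       #    operand fetched from memory into the MDR
    result = r1 + mdr       # 3. ALU performs the addition
    r1 = result             # 4. result loaded back into R1
    return r1

memory = {0x20: 7}          # NUM = 0x20 holds the value 7
print(execute_add(memory, 0x20, r1=10))   # -> 17
```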


The following sequence of control steps is required to implement it (single-bus processor
organization; WMFC = wait for memory function complete):
1. PCout, MARin, Read, Select4, Add, Zin
2. Zout, PCin, Yin, WMFC
3. MDRout, IRin
4. Address field of IRout, MARin, Read
5. R1out, Yin, WMFC
6. MDRout, SelectY, Add, Zin
7. Zout, R1in, End

3.6. Program Control Instructions


Program Control Instructions are machine-code instructions, also available to the user in
assembly language, that command the processor to act accordingly. These instructions are of
various types. In a high-level language, user code is translated into machine code, and these
instructions are generated to instruct the processor to do the task.
Types of Program Control Instructions: There are different types of Program Control
Instructions:
1. Compare Instruction: Compare instruction is specifically provided, which is similar to a
subtract instruction except the result is not stored anywhere, but flags are set according to the
result. Example:
CMP R1, R2 ;
2. Unconditional Branch Instruction: It causes an unconditional change of execution
sequence to a new location. Example:
Assembly Code: JUMP L2
               MOV R3, R1   ; not executed, control transfers to L2
High Level Code: goto L2
3. Conditional Branch Instruction: A conditional branch instruction is used to examine the
values stored in the condition code register to determine whether the specific condition exists
and to branch if it does. Example:
High Level Code: if (x == y) goto L1;
Assembly Code: BE R1, R2, L1
(the compiler allocates R1 for x and R2 for y)


4. Subroutines:
A subroutine is a program fragment that lives in user space and performs a well-defined task. It is
invoked by another user program and returns control to the calling program when finished.
Example:
CALL and RET
5. Halting Instructions:
• NOP Instruction – NOP means no operation. It causes no change in the processor state
other than an advancement of the program counter. It can be used to synchronize timing.

• HALT – It brings the processor to an orderly halt, remaining in an idle state until
restarted by interrupt, trace, reset or external action.
6. Interrupt Instructions: Interrupt is a mechanism by which an I/O or an instruction can
suspend the normal execution of processor and get itself serviced.
• RESET – It resets the processor. This may include setting some or all registers to an
initial value or setting the program counter to a standard starting location.
• TRAP – It is a non-maskable, edge- and level-triggered interrupt. TRAP has the
highest priority and is a vectored interrupt.
• INTR – It is a level-triggered and maskable interrupt. It has the lowest priority and
can be disabled by resetting the processor.
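A minimal sketch of how a compare instruction sets a flag that a conditional branch then tests. The CMP/BE semantics are simplified to a single zero flag, and the label names are illustrative:

```python
# CMP sets a zero flag without storing the subtraction result;
# BE (branch if equal) then tests that flag to pick the next location.

def cmp_and_branch(x, y, branch_target, fallthrough):
    zero_flag = (x - y == 0)       # CMP R1, R2: subtract, keep only the flag
    if zero_flag:                  # BE R1, R2, L1: branch when the zero flag is set
        return branch_target
    return fallthrough

print(cmp_and_branch(4, 4, "L1", "next"))   # -> L1
print(cmp_and_branch(4, 5, "L1", "next"))   # -> next
```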

3.7. Reduced Instruction Set Computer

RISC stands for Reduced Instruction Set Computer. In Reduced Instruction Set Computer (RISC)
architecture, the instruction set of the computer is simplified to reduce the execution time. RISC
has a small set of instructions, which generally include register-to-register operations.

Thus, data is stored in processor registers for computations, and results of the computations are
transferred to the memory using store instructions. All operations are performed within the registers
of the CPU. In RISC, all instructions use simple register addressing and hence require fewer
addressing modes.

RISC uses a relatively simple instruction format that is easy to decode. The instruction length
can be fixed and aligned on word boundaries. RISC processors can execute one instruction per
clock cycle.


This is done using pipelining, which involves overlapping the fetch, decode, and execute phases of
two or three instructions. As RISC provides a relatively large number of registers in the processor
unit, its programs take less time to execute than comparable CISC programs.

Features of RISC Processor

• It has relatively few instructions.
• It has relatively few addressing modes.
• Memory access is limited to load and store instructions.
• All operations are done within the registers of the CPU.
• It uses a fixed-length, easily decoded instruction format.
• It uses single-cycle instruction execution.
• Control is hardwired rather than micro-programmed.

3.8. Pipelining:
To improve the performance of a CPU we have two options:
1) Improve the hardware by introducing faster circuits.
2) Arrange the hardware such that more than one operation can be performed at the same time.
Since there is a limit on the speed of hardware and the cost of faster circuits is quite high, we have
to adopt the 2nd option.
Pipelining is a process of arrangement of hardware elements of the CPU such that its overall
performance is increased. Simultaneous execution of more than one instruction takes place in a
pipelined processor. Let us see a real-life example that works on the concept of pipelined operation.
Consider a water bottle packaging plant. Let there be 3 stages that a bottle should pass through,
Inserting the bottle(I), Filling water in the bottle(F), and Sealing the bottle(S). Let us consider these
stages as stage 1, stage 2, and stage 3 respectively. Let each stage take 1 minute to complete its
operation. Now, in a non-pipelined operation, a bottle is first inserted in the plant, after 1 minute it
is moved to stage 2 where water is filled. Now, in stage 1 nothing is happening. Similarly, when
the bottle moves to stage 3, both stage 1 and stage 2 are idle. But in pipelined operation, when the
bottle is in stage 2, another bottle can be loaded at stage 1. Similarly, when the bottle is in stage 3,
there can be one bottle each in stage 1 and stage 2. So, after each minute, we get a new bottle at the
end of stage 3. Hence, the average time taken to manufacture 1 bottle is:
Without pipelining = 9/3 minutes = 3m


I F S | | | | | |
| | | I F S | | |
| | | | | | I F S   (9 minutes)

With pipelining = 5/3 minutes = 1.67 m

I F S
| I F S
| | I F S   (5 minutes)
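The bottle arithmetic generalizes: with k stages and n items at one minute per stage, a non-pipelined line takes n × k minutes, while a pipelined line takes k + (n − 1). A quick check:

```python
# Timing for the 3-stage bottling line: k stages, n bottles,
# one minute per stage.

def non_pipelined_time(n, k):
    return n * k                # each bottle occupies the whole line alone

def pipelined_time(n, k):
    return k + (n - 1)          # first bottle takes k minutes, then one per minute

n, k = 3, 3
print(non_pipelined_time(n, k))     # -> 9
print(pipelined_time(n, k))         # -> 5
print(pipelined_time(n, k) / n)     # average per bottle, ~1.67
```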

What is RISC Pipeline in Computer Architecture?


RISC stands for Reduced Instruction Set Computers. RISC processors were introduced to execute
as fast as one instruction per clock cycle, and the RISC pipeline helps to simplify the computer
architecture's design. The background is what is known as the semantic gap: the difference between
the operations provided in high-level languages (HLLs) and those provided in computer
architectures.
To close this gap, the conventional response of computer architects was to add layers of complexity
to newer architectures, increasing the number and complexity of instructions together with the
number of addressing modes. The architectures that resulted from the adoption of this "add more
complexity" approach are known as Complex Instruction Set Computers (CISC).
The main goal of RISC, executing one instruction per clock cycle, is not always achievable,
because not every instruction can be fetched from memory and executed in a single clock cycle
under all circumstances.
The way to approach one instruction per clock cycle is to initiate a new instruction on each clock
cycle and to pipeline the processor toward the objective of single-cycle instruction execution.
The RISC compiler translates the high-level language program into a machine language program.
Issues such as data conflicts and branch penalties are handled partly by the RISC processor and
partly by the compiler, which must identify and reduce the delays these issues cause.

Principles of RISCs Pipeline


There are various principles of RISC pipelining, which are as follows −
• Keep the most frequently accessed operands in CPU registers.
• Minimize register-to-memory operations.
• Use a large number of registers to enhance operand referencing and decrease the
processor-memory traffic.
• Optimize the design of the instruction pipelines so that minimum compiler code
generation can be achieved.
• Use a simplified instruction set and leave out complex and unnecessary
instructions.

Consider a three-segment instruction pipeline, which shows how a compiler can optimize the
machine language program to compensate for pipeline conflicts. A frequent collection of
instructions for a RISC processor is of three types, as follows:
• Data Manipulation Instructions − Manage the data in processor registers.
• Data Transfer Instructions − These are load and store instructions that use an effective address
that is obtained by adding the contents of two registers, or a register and a displacement
constant provided in the instruction.
• Program Control Instructions − These instructions use register values and a constant to
evaluate the branch address, which is transferred to a register or the program counter (PC).
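The effective-address rule for the data transfer instructions above (either two registers, or a register plus a displacement constant) can be sketched as follows; the addresses used are illustrative:

```python
# Effective-address calculation for a RISC load/store, as described above:
# EA = base register + index register, or EA = base register + displacement.

def effective_address(base, index=None, displacement=0):
    if index is not None:
        return base + index            # EA = Rbase + Rindex
    return base + displacement         # EA = Rbase + constant

print(effective_address(0x1000, index=0x20))        # -> 4128 (0x1020)
print(effective_address(0x1000, displacement=8))    # -> 4104 (0x1008)
```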

3.9. Hardwire and Micro Programmed control


Control Units are classified into two major categories:
• Hardwired Control
• Microprogrammed Control

3.9.1. Hardwired Control Unit


Here the inputs are status signals (the particular status of an instruction) and the clock, and the
outputs are control signals. The instruction register is used to keep track of the current instruction
and the next instruction.
Control memory is absent in this control unit. Hardwired control units are typically used in RISC
(Reduced Instruction Set Computer) microprocessors.
Factors considered in the design of the hardwired control unit:
• Amount of hardware − Minimise the amount of hardware used.
• Speed of operation − If a single IC can replace a group of ICs, replace it. The amount of
hardware and the speed of operation are inversely proportional to each other.
• Cost
Characteristics of Hardwired Control Unit
The characteristics of a hardwired control unit are as follows:
• Combinational circuits are the foundation of hardwired control units.
• The inputs are transformed into control signals by fixed logic in this type of system.
• These units are known to be quicker but to have a more intricate structure.
Designing of Hardwired Control Unit
State-table method − This classical sequential design method constructs a state transition table and
minimises the hardware. Each state generates a set of control signals. This method can face
synchronisation issues.
Delay Element Method − Control signals follow a proper sequence, with a specific time delay
between two groups of control signals. D flip-flops controlled by a common clock signal ensure
synchronisation.
Sequence Counter Method − It uses a counter for timing purposes.
PLA method − It uses a programmable logic array to generate the control signals.
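Conceptually, all four methods realize the same thing: a fixed combinational mapping from (instruction, timing step) to a set of control signals. A lookup table can stand in for the PLA or random logic; the opcode, step numbers and signal names below are invented for illustration:

```python
# A hardwired control unit as a fixed mapping from (instruction, timing
# step) to control signals. The table stands in for the PLA / random logic.

CONTROL_LOGIC = {
    ("ADD", 0): {"PCout", "MARin", "Read"},   # fetch: send PC to memory
    ("ADD", 1): {"MDRout", "IRin"},           # load the instruction register
    ("ADD", 2): {"R1out", "ALUadd", "R1in"},  # execute the addition
}

def control_signals(opcode, step):
    # unknown (opcode, step) pairs assert no control signals
    return CONTROL_LOGIC.get((opcode, step), set())

print(sorted(control_signals("ADD", 0)))   # -> ['MARin', 'PCout', 'Read']
```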

Advantages of Hardwired Control Unit


• Extremely fast.
• Instruction set size is small, as a hardwired control unit relies more on hardware.
• It can be optimised to produce a very rapid mode of operation.

Disadvantages of Hardwired Control Unit


• Modification is tougher, as each change requires reworking complex circuits.

• Difficult to handle complex instructions.


• Design is complicated, and decoding is complex.

3.9.2. Microprogrammed Control Unit


As the name suggests, these control units are designed with the help of a micro-program. This micro-
program is a collection of micro-instructions stored in the control memory. This type of control unit
is typically used in CISC (Complex Instruction Set Computer) microprocessors.
Example of micro-instruction:
MAR←R3
In the above instruction, we are fetching the operand.
The control signal for the above example:
MARᵢₙ, R3ₒᵤₜ
A micro-instruction consists of one or more micro-operations to be executed and the address of the
next micro-instruction.
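This structure (micro-operations plus a next-address field) can be sketched as a toy control store. The signal names follow the MAR←R3 example above; the store layout and addresses are invented:

```python
# A toy control store: each micro-instruction is (control_signals, next_address).
# Address None marks the end of the micro-routine.

control_store = {
    0: ({"R3out", "MARin"}, 1),    # MAR <- R3
    1: ({"Read"}, 2),              # start the memory read
    2: (set(), None),              # end of micro-routine
}

def run_microprogram(store, start=0):
    addr, trace = start, []
    while addr is not None:
        signals, next_addr = store[addr]
        trace.append(sorted(signals))   # "issue" this step's control signals
        addr = next_addr                # follow the next-address field
    return trace

print(run_microprogram(control_store))
```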

If the control memory grows horizontally as control signals are added, it is referred to as
horizontal microprogramming. If the control memory grows vertically, with the control signals
encoded and later decoded, it is referred to as vertical microprogramming. If a two-level control
memory is used, combining the advantages of horizontal and vertical microprogramming, it is
referred to as nanoprogramming. Memory access time in nanoprogramming is increased because
two levels have to be traversed.


A microprogrammed control unit with a single-level control store works as follows: an instruction
passes from main memory to the instruction register; the microinstruction address generation unit
then sends an address to the address register, from where it is decoded and used to read the control
store. The word read out is placed in the micro-instruction register, and its operations part, after
decoding, is issued in the form of control signals.

Characteristics of Microprogrammed Control Unit


The characteristics of a microprogrammed control unit are as follows:
• Microprograms of procedures are used to implement these control units.
• The control unit used in the microprogram acts like a CPU nested inside another CPU.
• These circuits are straightforward yet relatively sluggish.

Advantages of Micro-programmed Control Unit

• The micro-program can be updated easily.
• Flexible.
• Better in terms of scalability than hardwired.
• Easier to handle complex instructions.
• Design is systematic, and decoding is pretty much straightforward.

Disadvantages of Micro-programmed Control Unit

• Hardware cost is more because of the control memory and its access circuitry.
• Slower than a hardwired control unit.
• Instruction set size is comparatively large as this relies on microprogramming.

3.10. Types of Micro-programmed Control Unit: Based on the type of Control Word stored in the
Control Memory (CM), it is classified into two types.
3.10.1. Horizontal Micro-programmed Control Unit :
The control signals are represented in the decoded binary format that is 1 bit/CS. Example: If 53
Control signals are present in the processor then 53 bits are required. More than 1 control signal
can be enabled at a time.
• It supports longer control words.
• It is used in parallel processing applications.
• It allows a higher degree of parallelism; if the degree is n, then n control signals can be
enabled at a time.
• It requires no additional hardware (decoders), which makes it faster than vertical
microprogramming.
• It is more flexible than vertical microprogramming.
3.10.2. Vertical Micro-programmed Control Unit :
The control signals are represented in an encoded binary format. For N control signals, ⌈log2(N)⌉
bits are required.
• It supports shorter control words.
• It supports easy implementation of new control signals therefore it is more flexible.
• It allows a low degree of parallelism i.e., the degree of parallelism is either 0 or 1.
• It requires additional hardware (decoders) to generate the control signals, which makes
it slower than horizontal microprogramming.
• It is less flexible than horizontal microprogramming but more flexible than a hardwired
control unit.
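The width difference between the two formats follows directly from the encoding: one bit per signal horizontally, versus ⌈log2(N)⌉ bits vertically. Using the 53-signal example from the text:

```python
# Control-word width for N control signals: horizontal uses 1 bit per
# signal; vertical encodes them in ceil(log2(N)) bits.
import math

def horizontal_bits(n_signals):
    return n_signals

def vertical_bits(n_signals):
    return math.ceil(math.log2(n_signals))

print(horizontal_bits(53))   # -> 53 bits (the example in the text)
print(vertical_bits(53))     # -> 6 bits
```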


Unit-IV
Memory
4.1. Basic Concepts of Memory Hierarchy

In computer system design, the memory hierarchy is an enhancement that organizes the memory
so as to minimize the access time. The memory hierarchy was developed based on a program
behavior known as locality of reference. The figure below clearly demonstrates the
different levels of the memory hierarchy:

This Memory Hierarchy Design is divided into 2 main types:

1. External Memory or Secondary Memory – Comprising Magnetic Disk, Optical Disk, and
Magnetic Tape, i.e. peripheral storage devices which are accessible by the processor via an I/O
module.
2. Internal Memory or Primary Memory – Comprising Main Memory, Cache Memory and
CPU registers. This is directly accessible by the processor.
We can infer the following characteristics of Memory Hierarchy Design from above figure:

1. Capacity: It is the global volume of information the memory can store. As we move from top
to bottom in the Hierarchy, the capacity increases.
2. Access Time: It is the time interval between the read/write request and the availability of the
data. As we move from top to bottom in the Hierarchy, the access time increases.


3. Performance: When computer systems were designed without a memory hierarchy, the
speed gap between the CPU registers and Main Memory grew because of the large
difference in access time; the hierarchy narrows this gap.
4. Cost per bit: As we move from bottom to top in the Hierarchy, the cost per bit increases i.e.
Internal Memory is costlier than External Memory.
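These trade-offs are usually summarized by the average access time across levels. A two-level sketch, assuming a cache in front of main memory (the hit ratio and timings below are illustrative, not from the text):

```python
# Average access time for a two-level hierarchy (cache + main memory).
# On a miss, main memory is accessed after the cache check fails.

def avg_access_time(hit_ratio, t_cache, t_main):
    return hit_ratio * t_cache + (1 - hit_ratio) * (t_cache + t_main)

# Illustrative numbers: 95% hit ratio, 2 ns cache, 100 ns main memory.
print(round(avg_access_time(0.95, 2, 100), 2))   # -> 7.0 (ns)
```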
4.2. Semi-Conductor Memories
A device for storing digital information that is fabricated by using integrated circuit technology is
known as semiconductor memory. Also known as integrated-circuit memory, large-scale integrated
memory, memory chip, semiconductor storage, transistor memory.
Definition: Semiconductor memory is the main memory element of a microcomputer-based system
and is used to store programs and data. The main memory elements are semiconductor devices that
store code and information. Semiconductor memory is directly accessible by the microprocessor,
and the access time of the data present in primary memory must be compatible with the operating
speed of the microprocessor.
Thus semiconductor devices are preferred as primary memory. With the rapid growth in the
requirement for semiconductor memories, a number of technologies and types of memory have
emerged. Names such as ROM, RAM, EPROM, EEPROM, Flash memory, DRAM, SRAM,
SDRAM, and the very new MRAM can now be seen in the electronics literature. Each one has its
own advantages and areas in which it may be used.

Types of semiconductor memory : Electronic semiconductor memory technology can be split into
two main types or categories, according to the way in which the memory operates :
1. RAM - Random Access Memory
2. ROM - Read Only Memory


1. Random Access Memory (RAM)


As the name suggests, RAM or random access memory is a form of semiconductor memory
technology that is used for reading and writing data in any order − in other words, as it is required
by the processor. It is used for applications such as the computer or processor memory where
variables and other storage are required on a random basis. Data is stored and read many times to
and from this type of memory.
Random access memory is used in huge quantities in computer applications, as current computing
and processing technology requires large amounts of memory to handle the memory-hungry
applications used today. Many types of RAM, including SDRAM with its DDR3, DDR4, and soon
DDR5 variants, are used in huge quantities.
a. DRAM
Dynamic RAM is a form of random access memory. DRAM uses a capacitor to store each bit of
data, and the level of charge on each capacitor determines whether that bit is a logical 1 or 0.
However, these capacitors do not hold their charge indefinitely, and therefore the data needs to be
refreshed periodically. As a result of this dynamic refreshing it gains its name of dynamic RAM.
DRAM is the form of semiconductor memory that is often used in equipment including personal
computers and workstations, where it forms the main RAM for the computer. The semiconductor
devices are normally available as integrated circuits for use in PCB assembly in the form of
surface-mount devices or, less frequently now, as leaded components.
Disadvantages of DRAM.
• Complex manufacturing process.
• Data requires refreshing.
• More complex external circuitry required (read and refresh periodically).
• Volatile memory.
• Relatively slow operational speed.

b. SRAM
SRAM stands for Static Random Access Memory. This form of semiconductor memory gains its
name from the fact that, unlike DRAM, the data does not need to be refreshed dynamically. These
semiconductor devices are able to support faster read and write times than DRAM (typically 10 ns
against 60 ns for DRAM), and in addition its cycle time is much shorter because it does not need to
pause between accesses.


However, SRAM cells consume more power and are less dense and more expensive than DRAM.
As a result, SRAM is normally used for caches, while DRAM is used as the main semiconductor
memory technology.
c. SDRAM
Synchronous DRAM. This form of semiconductor memory can run at faster speeds than
conventional DRAM. It is synchronized to the clock of the processor and is capable of keeping two
sets of memory addresses open simultaneously. By transferring data alternately from one set of
addresses, and then the other, SDRAM cuts down on the delays associated with non-synchronous
RAM, which must close one address bank before opening the next.
Within the SDRAM family there are several types of memory technologies that are seen. These are
referred to by the letters DDR - Double Data Rate. DDR4 is currently the latest technology, but this
is soon to be followed by DDR5 which will offer some significant improvements in performance.
d. MRAM
This is Magneto-resistive RAM, or Magnetic RAM. It is a non-volatile RAM memory technology
that uses magnetic charges to store data instead of electric charges. Unlike technologies including
DRAM, which require a constant flow of electricity to maintain the integrity of the data, MRAM
retains data even when the power is removed.
An additional advantage is that it only requires low power for active operation. As a result this
technology could become a major player in the electronics industry now that production processes
have been developed to enable it to be produced.

2. Read Only Memory (ROM)


A ROM is a form of semiconductor memory technology used where the data is written once and then
not changed. In view of this it is used where data needs to be stored permanently, even when the
power is removed - many memory technologies lose the data once the power is removed. As a result,
this type of semiconductor memory technology is widely used for storing programs and data that must
survive when a computer or processor is powered down.
For example, the BIOS of a computer will be stored in ROM. As the name implies, data cannot be
easily written to ROM.
PROM
This stands for Programmable Read Only Memory. It is a semiconductor memory which can only
have data written to it once, the data written to it is permanent. These memories are bought in a blank
format and they are programmed using a special PROM programmer. Typically a PROM will consist


of an array of fusible links, some of which are "blown" during the programming process to provide
the required data pattern.
The PROM stores its data as a charge on a capacitor. There is a charge storage capacitor for each cell
and this can be read repeatedly as required. However it is found that after many years the charge may
leak away and the data may be lost. Nevertheless, this type of semiconductor memory used to be
widely used in applications where a form of ROM was required but where the data needed to be
changed periodically, as in a development environment, or where quantities were low.
a. EPROM
This is an Erasable Programmable Read Only Memory. This form of semiconductor memory can
be programmed and then erased at a later time. This is normally achieved by exposing the silicon to
ultraviolet light. To enable this to happen there is a circular window in the package of the EPROM to
enable the light to reach the silicon of the chip. When the PROM is in use, this window is normally
covered by a label, especially when the data may need to be preserved for an extended period.
b. EEPROM
This is an Electrically Erasable Programmable Read Only Memory. Data can be written to it and
it can be erased using an electrical voltage. This is typically applied to an erase pin on the chip. Like
other types of PROM, EEPROM retains the contents of the memory even when the power is turned
off. Also like other types of ROM, EEPROM is not as fast as RAM. EEPROM memory cells are
made from floating-gate MOSFETS (known as FGMOS).
c. Flash memory
Flash memory may be considered as a development of EEPROM technology. Data can be written to
it and it can be erased, although only in blocks, but data can be read on an individual cell basis. To
erase and re-program areas of the chip, programming voltages at levels that are available within
electronic equipment are used. It is also non-volatile, and this makes it particularly useful. As a result
Flash memory is widely used in many applications including memory cards for digital cameras,
mobile phones, computer memory sticks and many other applications.
Flash memory stores data in an array of memory cells. The memory cells are made from floating-
gate MOSFETs (known as FGMOS). These FGMOS transistors have the ability to store an
electrical charge for extended periods of time (2 to 10 years) even without a connection to a
power supply.
Disadvantages of Flash Memory
• Higher cost per bit than hard drives
• Slower than other forms of memory


• Limited number of write / erase cycles


• Data must be erased before new data can be written
• Data typically erased and written in blocks

d. PCM
This type of semiconductor memory is known as Phase change Random Access Memory, P-RAM
or just Phase Change memory, PCM. It is based around a phenomenon where a form of chalcogenide
glass changes is state or phase between an amorphous state (high resistance) and a polycrystalline
state (low resistance). It is possible to detect the state of an individual cell and hence use this for data
storage. Currently this type of memory has not been widely commercialized, but it is expected to be
a competitor for flash memory.
Semiconductor memory technology is developing at a fast rate to meet the ever-growing needs of the
electronics industry. Not only are the existing technologies themselves being developed, but
considerable amounts of research are being invested in new types of semiconductor memory
technology. In terms of the memory technologies currently in use, SDRAM versions like DDR4 are
being further developed to provide DDR5 which will offer significant performance improvements. In
time, DDR5 will be developed to provide the next generation of SDRAM. Other forms of memory
are seen around the home in the form of USB memory sticks, Compact Flash, CF cards or SD memory
cards for cameras and other applications as well as solid state hard drives for computers. The
semiconductor devices are available in a wide range of formats to meet the differing PCB assembly
and other needs.

4.3. Semi Conductor RAM Memories


Semiconductor memories are available in a wide range of speeds. Their cycle times range from
100 ns to less than 10 ns.

INTERNAL ORGANIZATION OF MEMORY CHIPS:

Memory cells are usually organized in the form of an array, in which each cell is capable of storing one
bit of information. Each row of cells constitutes a memory word, and all cells of a row are connected to a
common line called the word line. The cells in each column are connected to a Sense/Write circuit by two
bit lines. The Sense/Write circuits are connected to the data input or output lines of the chip. During a
write operation, the Sense/Write circuits receive input information and store it in the cells of the selected word.


The data input and data output of each Sense/Write circuit are connected to a single bidirectional data line
that can be connected to the data bus of the computer.
R/W → Specifies the required operation.
CS → Chip Select input; selects a given chip in a multi-chip memory system.

4.4. 2D & 2 1/2D memory organization

2D Memory organization – In a 2D organization, memory is divided in the form of rows and
columns (a matrix). Each row contains a word. In this memory organization there is a decoder:
a combinational circuit that has n input lines and 2^n output lines. One of the output lines
selects the row addressed by the contents of the MAR, and the word represented by that row is
then read or written through the data lines.


2.5 D Memory organization

In a 2.5D organization the scenario is similar, but there are two different decoders: a column
decoder and a row decoder. The column decoder selects the column and the row decoder selects
the row. The address from the MAR is fed to the decoders as input. The decoders select the
respective cell; the data at that location is then read out through the bit-out line, or data is written
to that location through the bit-in line.


Read and Write Operations –


1. If the select line is in Reading mode then the Word/bit which is represented by the MAR will
be available to the data lines and will get read.
2. If the select line is in write mode then the data from the memory data register (MDR) will be
sent to the respective cell which is addressed by the memory address register (MAR).
3. With the help of the select line, we can select the desired data and we can perform read and
write operations on it.
Comparison between 2D & 2.5D Organizations –
1. In 2D organization hardware is fixed but in 2.5D hardware changes.
2. 2D Organization requires more gates while 2.5D requires less.
3. 2D is more complex in comparison to the 2.5D organization.
4. Error correction is not possible in the 2D organization but in 2.5D it could be done easily.
5. 2D is more difficult to fabricate in comparison to the 2.5D organization.
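The gate-count difference in point 2 can be made concrete by counting decoder output lines, assuming the address is split evenly between the row and column decoders in the 2.5D case:

```python
# Decoder output lines needed for a memory of 2**n words:
# 2D uses one n-to-2**n decoder; 2.5D splits the address into two
# halves and uses two much smaller decoders.

def decoder_outputs_2d(n):
    return 2 ** n

def decoder_outputs_2_5d(n):
    row_bits = n // 2
    col_bits = n - row_bits
    return 2 ** row_bits + 2 ** col_bits

n = 10                              # 1K words
print(decoder_outputs_2d(n))        # -> 1024 output lines
print(decoder_outputs_2_5d(n))      # -> 64 output lines (32 + 32)
```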

4.5. Cache Memory in Computer Organization

Cache memory is a special, very high-speed memory used to speed up and synchronize with the
high-speed CPU. Cache memory is costlier than main memory or disk memory but more economical
than CPU registers. It is an extremely fast memory type that acts as a buffer between RAM and
the CPU, holding frequently requested data and instructions so that they are immediately
available to the CPU when needed. Cache memory reduces the average time to access data from
main memory. The cache is a smaller and faster memory that stores copies of the data from
frequently used main memory locations. There are several independent caches in a CPU, which
store instructions and data.


Levels of memory:
• Level 1 or Registers – Registers hold the data that the CPU is operating on immediately. The
most commonly used registers are the accumulator, the program counter, address registers, etc.
• Level 2 or Cache memory – The fastest memory after registers; data is temporarily stored here
for faster access.
• Level 3 or Main Memory – The memory the computer currently works on. It is small in size, and
once power is off its data no longer stays in this memory.
• Level 4 or Secondary Memory – External memory that is not as fast as main memory, but whose
data stays permanently.

Cache Performance: When the processor needs to read or write a location in main memory, it first
checks for a corresponding entry in the cache.
• If the processor finds that the memory location is in the cache, a cache hit has occurred and
data is read from the cache.
• If the processor does not find the memory location in the cache, a cache miss has occurred. For
a cache miss, the cache allocates a new entry and copies in data from main memory, then the
request is fulfilled from the contents of the cache.
The performance of cache memory is frequently measured in terms of a quantity called Hit
ratio.
Hit ratio = hit / (hit + miss) = no. of hits/total accesses
We can improve cache performance by using a larger cache block size and higher associativity,
and by reducing the miss rate, the miss penalty, and the time to hit in the cache.
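The hit-ratio formula above can be tried out with a short sketch (Python, purely illustrative; the `simulate` helper and its unbounded cache are assumptions for demonstration, not a model of any real replacement policy):

```python
def hit_ratio(hits, misses):
    """Hit ratio = hits / (hits + misses), i.e. hits / total accesses."""
    total = hits + misses
    return hits / total if total else 0.0

def simulate(accesses):
    """Count hits and misses for a sequence of block accesses.
    Every first access to a block misses; repeated accesses hit."""
    cache, hits, misses = set(), 0, 0
    for block in accesses:
        if block in cache:
            hits += 1
        else:
            misses += 1
            cache.add(block)   # cache miss: allocate a new entry
    return hits, misses

h, m = simulate([1, 2, 1, 3, 1, 2])   # 3 hits, 3 misses
print(hit_ratio(h, m))                # 0.5
```

Repeating the same blocks raises the hit ratio, which is exactly the locality-of-reference effect the cache exploits.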
Cache Mapping: There are three different types of mapping used for the purpose of cache memory,
which are as follows: Direct mapping, Associative mapping, and Set-Associative mapping. These
are explained below.
A. Direct Mapping
• The simplest technique, known as direct mapping, maps each block of main memory into
only one possible cache line. In other words, direct mapping assigns each memory block to a
specific line in the cache. If a line is already occupied by a memory block when a new block
needs to be loaded, the old block is discarded. The address space is split into two parts, an
index field and a tag field; the cache stores the tag field, whereas the rest of the word is
stored in main memory. Direct mapping's performance is directly proportional to the hit ratio.


i = j modulo m
where
i = cache line number
j = main memory block number
m = number of lines in the cache

For purposes of cache access, each main memory address can be viewed as consisting of three
fields. The least significant w bits identify a unique word or byte within a block of main memory;
in most contemporary machines, the address is at the byte level. The remaining s bits specify one
of the 2^s blocks of main memory. The cache logic interprets these s bits as a tag of s - r bits
(the most significant portion) and a line field of r bits. This latter field identifies one of the
m = 2^r lines of the cache. The line field supplies the index bits in direct mapping.
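The three-field view of the address can be sketched in Python (an illustrative helper, not taken from the text; the bit widths w, r, and s are assumptions you would read off a concrete cache design):

```python
def direct_map(address, w, r, s):
    """Split a memory address into (tag, line, word) for direct mapping.

    w : bits identifying a word/byte within a block
    r : bits selecting one of m = 2**r cache lines
    s : bits identifying one of 2**s main-memory blocks (tag is s - r bits)
    """
    word = address & ((1 << w) - 1)
    block = address >> w             # main memory block number j
    line = block & ((1 << r) - 1)    # i = j mod m, with m = 2**r
    tag = block >> r                 # most significant s - r bits
    return tag, line, word

# Hypothetical geometry: w = 4, r = 3 (8 lines), s = 8 block bits
addr = 0b101101011010                # block j = 181, word 10
tag, line, word = direct_map(addr, w=4, r=3, s=8)
print(tag, line, word)               # 22 5 10  (181 mod 8 = 5)
```

Because the line number is just j mod m, two blocks whose numbers differ by a multiple of m always compete for the same line, which is the thrashing problem set-associative mapping later addresses.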

B. Associative Mapping
In this type of mapping, associative memory is used to store both the content and the address of
the memory word. Any block can go into any line of the cache. This means that the word-id bits
are used to identify which word in the block is needed, while the tag becomes all of the
remaining bits.


This enables the placement of any word at any place in the cache memory. It is considered to be
the fastest and the most flexible mapping form. In associative mapping the index bits are zero.

C. Set-associative Mapping
This form of mapping is an enhanced form of direct mapping in which the drawbacks of direct
mapping are removed. Set-associative mapping addresses the problem of possible thrashing in the
direct-mapping method. It does this by saying that instead of having exactly one line that a
block can map to in the cache, a few lines are grouped together, creating a set; a block in
memory can then map to any one of the lines of a specific set. Set-associative mapping thus
allows each index address in the cache to hold words from two or more blocks of main memory.
Set-associative cache mapping combines the best of direct and associative cache mapping
techniques. In set-associative mapping the index bits are given by the set-offset bits. In this
case the cache consists of a number of sets, each of which consists of a number of lines. The
relationships are
m = v * k
i = j mod v
where
i = cache set number
j = main memory block number
v = number of sets
m = number of lines in the cache
k = number of lines in each set
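The relationships above can be checked with a tiny sketch (Python; the geometry v = 4, k = 2 is a hypothetical example, not from the text):

```python
def cache_set(j, v):
    """Set number for main-memory block j: i = j mod v."""
    return j % v

# Hypothetical geometry: m = v * k = 8 lines, as v = 4 sets of k = 2 lines.
v, k = 4, 2
m = v * k

# Blocks 5 and 13 map to the same set but can coexist in its two lines,
# which is exactly the thrashing case direct mapping cannot handle.
print(cache_set(5, v), cache_set(13, v))   # 1 1
```

With k = 1 this degenerates to direct mapping, and with v = 1 to fully associative mapping, which is why set-associative mapping is described as combining the two.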


Application of Cache Memory:


1. Usually, the cache memory can store a reasonable number of blocks at any given time, but this
number is small compared to the total number of blocks in the main memory.
2. The correspondence between the main memory blocks and those in the cache is specified by a
mapping function.
3. Primary Cache – A primary cache is always located on the processor chip. This cache is
small and its access time is comparable to that of processor registers.
4. Secondary Cache – Secondary cache is placed between the primary cache and the rest of the
memory. It is referred to as the level 2 (L2) cache. Often, the Level 2 cache is also housed on
the processor chip.
5. Spatial Locality of reference – This says that there is a good chance the required element
will be present in close proximity to the point of reference, and that the next access will
again lie close to that point.
6. Temporal Locality of reference – Here a least recently used (LRU) algorithm is applied.
When a miss occurs on a word, not only that word but the complete block containing it is
loaded, because the locality-of-reference rule says that a word close to the one just
referenced is likely to be referred to next.

4.6. Auxiliary Memory

Auxiliary memory is the lowest-cost, highest-capacity, and slowest-access storage in a computer
system. It is where programs and data are preserved for long-term storage or when not in direct
use. The most typical auxiliary memory devices used in computer systems are magnetic disks and
tapes.

Magnetic Disks

A magnetic disk is a type of memory constructed from a circular plate of metal or plastic coated
with magnetizable material. Usually both sides of the disk are used to carry out read/write
operations. Several disks may be stacked on one spindle, with a read/write head available for each
surface.

The following image shows the structural representation for a magnetic disk.

• The memory bits are stored in the magnetized surface in spots along the concentric circles
called tracks.
• The concentric circles (tracks) are commonly divided into sections called sectors.


Magnetic Tape

Magnetic tape is a storage medium that allows data archiving, collection, and backup for different
kinds of data. The magnetic tape is constructed using a plastic strip coated with a magnetic recording
medium. The bits are recorded as magnetic spots on the tape along several tracks. Usually, seven or
nine bits are recorded simultaneously to form a character together with a parity bit.

Magnetic tape units can be halted, started to move forward or in reverse, or can be rewound. However,
they cannot be started or stopped fast enough between individual characters. For this reason,
information is recorded in blocks referred to as records.
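The parity bit mentioned above can be illustrated with a short sketch (Python, illustrative only; even parity is assumed here, though tape systems may use odd parity instead):

```python
def even_parity_bit(bits):
    """Extra bit that makes the total count of 1s even."""
    return sum(bits) % 2

char = [1, 0, 1, 1, 0, 1, 0, 0]            # one 8-bit character
recorded = char + [even_parity_bit(char)]  # 9 bits written across the tracks

# On reading back, a single flipped bit is detected because the
# count of 1s is no longer even.
corrupted = recorded.copy()
corrupted[2] ^= 1
print(sum(recorded) % 2, sum(corrupted) % 2)   # 0 1
```

A parity check detects any single-bit error in the character, but it cannot tell which bit flipped, and two flipped bits cancel out.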

Optical Disc

An optical disc is an electronic data storage medium, also referred to as an optical disk,
optical storage, or optical media, that is read and written using optical storage techniques and
technology. The optical disc, which may be used as a portable and secondary storage device, was
first developed in the late 1960s. James T. Russell invented the first optical disc, which could
store data as micron-sized light and dark dots.


An optical disc can store more data and has a longer lifespan than the preceding generation of
magnetic storage media. To read and write CDs and DVDs, computers use a CD writer or DVD writer
drive, and to read and write Blu-ray discs they require a Blu-ray drive. Magneto-optic (MO)
drives, such as CD-R and DVD-R drives, are used to read and write information to discs. CDs,
DVDs, and Blu-ray discs are the most common types of optical media, which are usually used to:

o transfer data between various devices or computers;
o deliver software to others;
o hold large amounts of data, such as videos, photos, and music;
o back up data from a local machine.

With the introduction of an all-new generation of optical media, the storage capacity to store data has
increased. CDs have the potential to store 700 MB of data, whereas DVDs allow you to store up to
8.4 GB of data. Blu-ray discs, the newest type of optical media, can hold up to 50 GB of data. This
storage capacity is the most convenient benefit as compared to the floppy disk storage media, which
can store up to 1.44 MB of data

Additionally, a Blu-ray drive is a newer type of optical drive that can read CDs, DVDs, and
Blu-ray discs. In other words, older drives are not able to read newer optical discs, but the
latest drives have the ability to read older optical discs.

Different Kinds of Optical Drives

Optical drives are disk-based drives that were introduced to the market in the 1980s to allow for
increased storage capacity and faster read and write times. There are multiple kinds of optical media,
which are discussed below:

• CD-ROM

CD-ROM, short for compact disc read-only memory, was the first disc-based drive for modern PCs.
CD-ROM devices read Compact Disc File System discs with data encoded in ISO 9660. To reduce
noise and increase stability, most CD-ROM drives in computers run at a slower speed and only
speed up for larger data files or when the drive experiences read errors. The newest CD-ROM
drives can achieve read speeds of 60x.


• DVD-ROM

DVD-ROM drives, which stand for Digital Versatile Disk Read Only Memory and are a direct
evolution from CD-ROM drives, have significantly more performance and capacity than their CD
counterparts while maintaining the same physical dimensions. The DVD Forum is a non-profit
organization that establishes several standards for DVD functionality and construction, as well as
overseeing DVD development.

• Blu-ray

In the commercial market, Blu-ray drives are the newest drives available as of 2011. Sony, one
of the founding proponents of the format, developed the Blu-ray technology during the early 2000s.

• RW Drives

The rewritable drive types are Blu-ray, DVD, and CD drives. All the functionality of their
read-only counterparts is available in RW drives. Write processes are particularly sensitive to
shock and can ruin a disc beyond repair if forcibly interrupted, so write speeds are kept slower
than read speeds to preserve stability. Writable discs come in write-once and rewritable
variations; RW drives can write multiple times.

Advantages of Optical Disk

• Cost

Only plastics and aluminium foils are used in producing an optical disc, which keeps
manufacturing costs low. Users therefore have the advantage of purchasing optical discs cheaply
in bulk; the optical disc drive is also included with many computers by their manufacturers, and
users can purchase optical disc drives separately at little cost.

• Durability

Compared with volatile and non-volatile semiconductor memories, optical discs are more durable.
They do not lose data on power failure and are not subject to wear, so they can last a long
time. However, they are not as safe from physical damage, including scratching and heat.


• Simplicity

Optical discs make the process of backing up data much easier. The data that needs to be burnt
is simply placed on the drive icon, and users can then back it up just by clicking on "Burn
Disk."

• Stability

Optical discs provide a very high level of stability because, unlike magnetic disks, they are
not exposed to electromagnetic fields and other kinds of environmental influence.

Disadvantages of Optical Disk

Although it has numerous advantages, it also contains some limitations that are discussed below:

• Security

Optical discs used for backup need to be kept safe from thieves; if thieves succeed in stealing
backup discs, the harm can be considerable. Because of their small size, optical discs are also
more prone to loss and theft.

• Reliability

Unlike flash drives, optical discs can be damaged even by their plastic casings, which are prone
to scratching the discs and making them unreadable. Data stored on a scratched optical disc
cannot be recovered.

• Capacity

Compared to other forms of storage drive, the cost of optical discs per GB/TB is high, and
compared with other storage media their capacity is low. Except for the Blu-ray disc, the
maximum storage capacity of an optical disc is 4.7 GB.

• Duplication

Unlike a USB flash drive, it is not easy to make a duplicate copy of an optical disc. Separate
software and hardware are needed for the burning process. Multiple third-party programs are
available for this purpose, and writing software is also furnished with the newest versions of
Windows.

4.7. Virtual Memory

Virtual memory is a valuable concept in computer architecture that allows you to run large,
sophisticated programs on a computer even if it has a relatively small amount of RAM. A computer
with virtual memory artfully juggles the conflicting demands of multiple programs within a fixed
amount of physical memory. A PC that's low on memory can run the same programs as one with
abundant RAM, although more slowly.

Virtual memory works similarly, but one level up in the memory hierarchy. A memory management
unit (MMU) transfers data between physical memory and a slower storage device, generally a disk.
This storage area may be called a swap disk or swap file, depending on its implementation.
Retrieving data from physical memory is much faster than accessing data from the swap disk.

Virtual memory can be implemented using either of the following techniques:

1. Paging
Paging is a memory-management technique in which small fixed-length pages are allocated instead
of a single large variable-length contiguous block, as in the dynamic allocation technique. In a
paged system, each process is divided into several fixed-size 'chunks' called pages, typically 4K
bytes in length. The memory space is likewise divided into blocks of equal size known as frames.
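The page/frame split described above can be sketched as follows (Python; the page-table contents are hypothetical, and a real MMU would of course perform this translation in hardware):

```python
PAGE_SIZE = 4096  # 4K-byte pages, as in the text

def split_virtual_address(va):
    """Return (page number, offset within the page)."""
    return va // PAGE_SIZE, va % PAGE_SIZE

def translate(va, page_table):
    """Map a virtual address to a physical one via a page table,
    here a dict from page number to frame number."""
    page, offset = split_virtual_address(va)
    frame = page_table[page]          # a missing key models a page fault
    return frame * PAGE_SIZE + offset

pt = {0: 5, 1: 2}                     # hypothetical page table
print(translate(4100, pt))            # page 1, offset 4 -> frame 2 -> 8196
```

Because every page and frame has the same size, the offset is carried over unchanged; only the page number is exchanged for a frame number.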


Advantages of Paging

The following are the advantages of paging −

• In paging, there is no external fragmentation.
• In paging, the swapping between equal-size pages and page frames is straightforward.
• Paging is a simple approach to memory management.

Disadvantages of Paging

The following are the disadvantages of paging −

• In paging, there can be internal fragmentation.
• In paging, the page table itself occupies additional memory.
• Because of multi-level paging, there can be memory-reference overhead.
2. Segmentation
The partition of memory into logical units called segments, according to the user's perspective,
is called segmentation. Segmentation allows each segment to grow independently and to be shared.
In other words, segmentation is a technique that partitions memory into logically related units
called segments; a program is then a collection of segments.

Unlike pages, segments can vary in size. This requires the MMU to manage segmented memory
somewhat differently than it would manage paged memory. A segmented MMU contains a segment
table to keep track of the segments resident in memory.

Since a segment can begin at one of several addresses and can be of any size, each segment-table
entry must contain the start address and the segment size. Some systems allow a segment to start
at any address, while others limit the start address. One such limit is found in the Intel x86
architecture, which requires a segment to start at an address that has 0000 as its four low-order
bits.
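A segment-table lookup of the kind just described can be sketched as follows (Python; the table contents and the limit check are illustrative assumptions, not a model of any particular MMU):

```python
def translate_segmented(seg, offset, segment_table):
    """Look up (start address, size) for a segment and form the
    physical address; offsets past the segment's size are rejected."""
    base, size = segment_table[seg]
    if offset >= size:
        raise IndexError("offset outside segment")  # protection fault
    return base + offset

# Hypothetical segment table: segment number -> (start address, size)
st = {0: (0x1000, 0x400), 1: (0x8000, 0x2000)}
print(hex(translate_segmented(1, 0x10, st)))        # 0x8010
```

The per-entry size field is what lets each segment grow independently, while the base field implements the variable start address the text describes.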


Unit-V

Input / Output

5.1. Peripheral Devices


A peripheral device is defined as a device that provides input/output functions for a computer
and serves as an auxiliary computer device without computing-intensive functionality.
Peripheral devices are generally not essential for the computer to perform its basic tasks;
they can be thought of as an enhancement to the user's experience. A peripheral device is a
device that is connected to a computer system but is not part of the core computer system
architecture. More loosely, the term peripheral is often used to refer to any device external
to the computer case.

1. Input Devices: An input device converts incoming data and instructions into a pattern of
electrical signals in binary code that is comprehensible to a digital computer.
Example: keyboard, mouse, scanner, microphone, etc.
2. Output Devices: An output device generally reverses the input process, translating the
digitized signals into a form intelligible to the user. Output devices are also used for
sending data from one computer system to another. For some time punched-card and paper-tape
readers were extensively used for input, but these have now been supplanted by more
efficient devices.
Example: monitors, headphones, printers, etc.
3. Storage Devices: Storage devices are used to store the data the system requires for
performing any operation. The storage device is one of the most essential devices and also
provides better compatibility.
Example: hard disk, magnetic tape, flash memory, etc.

Advantage of Peripherals Devices:


• They make taking input very easy.
• They also provide specific output.
• Storage devices allow information or data to be stored.
• They also improve the efficiency of the system.


5.2. I/O Interface


The I/O interface supports a method by which data is transferred between internal storage and external
I/O devices. All the peripherals connected to a computer require special communication connections
for interfacing them with the CPU.

I/O Bus and Interface Modules: The I/O bus is the route used for peripheral devices to interact with
the computer processor. A typical connection of the I/O bus to I/O devices is shown in the figure.

The I/O bus includes data lines, address lines, and control lines. In any general-purpose
computer, magnetic disks, printers, keyboards, and display terminals are commonly employed. Each
peripheral unit has an interface unit associated with it. Each interface decodes the control and
address information received from the I/O bus.

The interface interprets the address and control information, supplies signals for the
peripheral controller, conducts the transfer of information between the peripheral and the
processor, and synchronizes the data flow.

The I/O bus links all peripheral interfaces to the processor. To interact with a specific
device, the processor places a device address on the address lines. Each interface contains an
address decoder attached to the I/O bus that monitors the address lines.

When the address is recognized by the interface, it activates the path between the bus lines and
the device that it controls. The interface disables the peripherals whose address does not match
the address on the bus.

An interface receives any of the following four commands −


• Control − A control command is given to activate the peripheral and to inform it of its next
task. This control command depends on the peripheral, and each peripheral receives its own
sequence of control commands, depending on its mode of operation.
• Status − A status command tests various status conditions in the interface and the peripheral.
• Data Output − A data output command causes the interface to respond to the command by
transferring data from the bus into one of its registers.
• Data Input − The data input command is the opposite of the data output command. In data input,
the interface receives an item of data from the peripheral and places it in its buffer register.
5.3. Input Output Ports

Ports: A port is a connection point that acts as an interface between the computer and external
devices like printers, modems, etc.

There are two types of ports :

1. Internal Port: It connects the system’s motherboard to internal devices like hard disk, CD
drive, internal Bluetooth, etc.
2. External Port: It connects the system’s motherboard to external devices like a mouse, printer,
USB, etc.


Some important types of ports are as per follows:

1. Serial Port:
• Used for external modems and older computer mouse
• Two versions: 9-pin and 25-pin
• Data travels at 115 kilobits per second
2. Parallel Port :
• Used for scanners and printers
• 25 pin model
3. Universal Serial Bus (or USB) Port :
• It can connect all kinds of external USB devices such as external hard disks, printers,
scanners, mouse, keyboards, etc.
• Data travels at 12 megabits per second.
4. Firewire Port :
• Transfers large amounts of data at a very fast speed.
• Connects camcorders and video equipment to the computer.
• Data travels at 400 to 800 megabits per second.
5. Ethernet Port :
• Connects to a network and high-speed Internet.
• Data travels at 10 megabits to 1000 megabits per second depending upon the network
bandwidth.

5.4. Interrupts
An interrupt in computer architecture is a signal that requests the processor to suspend its current
execution and service the occurred interrupt. To service the interrupt the processor executes the
corresponding interrupt service routine (ISR). After the execution of the interrupt service routine, the
processor resumes the execution of the suspended program. Interrupts are of two types: hardware
interrupts and software interrupts.


5.4.1. Types of Interrupts


Interrupts can be of various types, but they are basically classified into hardware interrupts
and software interrupts.
1. Hardware Interrupts
If a processor receives the interrupt request from an external I/O device it is termed as a hardware
interrupt. Hardware interrupts are further divided into maskable and non-maskable interrupt.

• Maskable Interrupt: Hardware interrupts that can be ignored or delayed for some time, when the
processor is executing a higher-priority program, are termed maskable interrupts.
• Non-Maskable Interrupt: Hardware interrupts that can neither be ignored nor delayed and must be
serviced immediately by the processor are termed non-maskable interrupts.
2. Software Interrupts
Software interrupts are the interrupts that occur when a condition is met or a system call is
made.

Interrupt Cycle
A normal instruction cycle starts with instruction fetch and execute. To accommodate interrupts
that occur during the normal processing of instructions, an interrupt cycle is added to the
normal instruction cycle, as shown in the figure below.


After the execution of the current instruction, the processor verifies the interrupt signal to check
whether any interrupt is pending. If no interrupt is pending then the processor proceeds to fetch the
next instruction in the sequence.

If the processor finds the pending interrupts, it suspends the execution of the current program by
saving the address of the next instruction that has to be executed and it updates the program counter
with the starting address of the interrupt service routine to service the occurred interrupt.
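The augmented cycle can be mimicked in a toy simulator (Python; the string "instructions", the fixed ISR address, and the single pending-interrupt list are all simplifying assumptions made for illustration):

```python
def run(program, interrupts, isr_addr=100):
    """Toy fetch-execute loop with an interrupt cycle appended.
    After each instruction the pending interrupts are checked; if one
    is found, the return address is saved and the program counter is
    loaded with the ISR's start address."""
    pc, trace, saved = 0, [], None
    while pc < len(program):
        instr = program[pc]          # fetch
        pc += 1
        trace.append(instr)          # "execute" (just recorded here)
        if interrupts:               # interrupt cycle
            interrupts.pop()
            saved = pc               # save address of next instruction
            pc = isr_addr            # jump to the interrupt service routine
    return trace, saved

trace, saved = run(["i0", "i1", "i2"], interrupts=["irq"])
print(trace, saved)                  # ['i0'] 1
```

With an empty interrupt list the loop runs straight through the program; with a pending interrupt it suspends after the current instruction, exactly as the text describes.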

5.5. Mode of Data Transfer


1. Programmed I/O
2. Interrupt-initiated I/O
3. Direct memory access (DMA)

5.5.1. Programmed I/O


Programmed I/O results from the I/O instructions that are written in the computer program. Each
data-item transfer is initiated by an instruction in the program. Usually the transfer is between
a CPU register and memory.

Example of Programmed I/O

• In the programmed I/O mode of data transfer the I/O device does not have direct access to
the memory unit.
• A transfer from an I/O device to memory requires the execution of several instructions by the
CPU, including an input instruction to transfer the data from the device to the CPU and a store
instruction to transfer the data from the CPU to memory.


• In programmed I/O, the CPU stays in a program loop until the I/O unit indicates that it is
ready for data transfer.
• This is a time-consuming process since it needlessly keeps the CPU busy. This situation can be
avoided by using an interrupt facility.
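The busy-wait loop just described can be sketched like this (Python; `FakeDevice` and its `ready`/`data` methods are invented purely for illustration):

```python
def programmed_io_read(device, n):
    """Polling loop: the CPU repeatedly tests the device's ready flag
    and only then moves one data item at a time into memory."""
    memory = []
    while len(memory) < n:
        while not device.ready():    # the CPU busy-waits in this loop
            pass
        memory.append(device.data()) # input instruction, then store
    return memory

class FakeDevice:
    """Hypothetical device that happens to be ready on every poll."""
    def __init__(self, items):
        self.items = list(items)
    def ready(self):
        return bool(self.items)
    def data(self):
        return self.items.pop(0)

print(programmed_io_read(FakeDevice([7, 8, 9]), 3))   # [7, 8, 9]
```

The inner `while not device.ready()` loop is exactly the wasted CPU time the text complains about, and it is what interrupt-initiated I/O eliminates.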

5.5.2. Interrupt-initiated I/O

• As we saw, in the programmed I/O mode of transfer the CPU is kept busy unnecessarily.
• This situation can very well be avoided by using an interrupt-driven method for data transfer.
• An interrupt facility and special commands are used to inform the interface to issue an
interrupt request signal whenever data is available from any device.
• In the meantime the CPU can proceed with any other program execution.
• The interface meanwhile keeps monitoring the device.
• Whenever it determines that the device is ready for data transfer, it initiates an interrupt
request signal to the computer.
Drawbacks of Programmed I/O and Interrupt-Driven I/O
Both methods, programmed I/O and interrupt-driven I/O, require the active intervention of the
processor to transfer data between memory and the I/O module, and any data transfer must
traverse a path through the processor.

5.5.3. Direct Memory Access

• The data transfer between fast storage media such as a magnetic disk and the memory unit is
limited by the speed of the CPU.
• Thus we can allow the peripherals to communicate directly with memory using the memory buses,
removing the intervention of the CPU. This type of data transfer technique is known as DMA, or
direct memory access.
• During DMA the CPU is idle and has no control over the memory buses.
• The DMA controller takes over the buses to manage the transfer directly between the I/O
devices and the memory unit.
Bus Request: Used by the DMA controller to request that the CPU relinquish control of the
buses.


Bus Grant: Activated by the CPU to inform the external DMA controller that the buses are in the
high-impedance state and the requesting DMA controller can take control of the buses.

5.6. I/O channels and processors

The DMA mode of data transfer reduces the CPU's overhead in handling I/O operations. It also
allows parallelism between CPU and I/O operations. Such parallelism is necessary to avoid
wasting valuable CPU time while handling I/O devices whose speeds are much slower than the
CPU's. The concept of DMA operation can be extended to relieve the CPU further from involvement
in the execution of I/O operations. This gives rise to the development of a special-purpose
processor called an Input-Output Processor (IOP) or I/O channel.

The Input-Output Processor (IOP) is just like a CPU that handles the details of I/O operations.
It is equipped with more facilities than are available in a typical DMA controller. The IOP can
fetch and execute its own instructions, which are specifically designed to characterize I/O
transfers. In addition to the I/O-related tasks, it can perform other processing tasks such as
arithmetic, logic, branching, and code translation. The main memory unit plays the pivotal role:
it communicates with the processor by means of DMA.


The Input-Output Processor is a specialized processor that loads and stores data into memory
along with executing I/O instructions. It acts as an interface between the system and its
devices. It carries out a sequence of events to execute I/O operations and then stores the
results in memory.

Advantages –
• In I/O-processor-based systems, the I/O devices can access main memory directly, without
intervention by the processor.
• It is used to address the problems that arise in the direct memory access method.

An I/O channel is an extension of the DMA concept. It has the ability to execute I/O
instructions using a special-purpose processor on the I/O channel, with complete control over
I/O operations. The processor does not execute I/O instructions itself; it initiates an I/O
transfer by instructing the I/O channel to execute a program in memory.
• The program specifies – the device or devices, the area or areas of memory, the priority,
and the actions to take on error conditions
• Types of I/O Channels :


Selector Channel :
A selector channel controls multiple high-speed devices and is dedicated to the transfer of data
with one of them at a time. In a selector channel, each device is handled by a controller or I/O
module; the channel controls these I/O controllers, as shown in the figure.

Multiplexer Channel : A multiplexer channel is a DMA controller that can handle multiple devices
at the same time. It can perform block transfers for several devices at once.

Two types of multiplexers are used in this channel:

1. Byte Multiplexer –
It is used for low-speed devices. It transmits or accepts characters, interleaving bytes
from several devices.
2. Block Multiplexer –
It accepts or transmits blocks of characters, interleaving blocks of bytes from several
devices. It is used for high-speed devices.


5.7. Serial Communication

Serial communication is the process of transferring information bit by bit, sequentially, over
the same channel. This reduces the cost of wiring, but it slows the transmission speed.
Generally, communication can be described as the process of exchanging information between
individuals in the form of audio, video, spoken words, and written documents. The serial
protocol runs on devices such as mobiles and personal computers with the help of certain
protocols. A protocol is a reliable and secure form of communication consisting of a set of
rules followed by a source host and a destination host. In serial communication, binary pulses
are used to represent the data. Binary has the two digits 0 and 1: 0 shows LOW (0 volts), and 1
shows HIGH (5 volts). Serial communication can be either asynchronous or synchronous.

5.7.1. Synchronous Communication

In synchronous communication, data is grouped into frames built from blocks of bits, and these frames are sent continuously in time with a master clock: the sender and receiver operate on a synchronized clock frequency. No gaps, start bits, or stop bits are needed. Because the timing of the sender and receiver is kept in step, timing errors are rare and the data moves faster; the accuracy of the data depends entirely on the timing being synchronized correctly between the two devices. Synchronous serial transmission is more expensive than asynchronous serial transmission.


5.7.2. Asynchronous Communication

In asynchronous communication, each group of bits is treated as an independent unit and can be sent at any point in time. To keep the sender and receiver in step, start bits and stop bits are placed around each data byte; these bits help ensure that the data is received correctly. The timing between transmissions is not constant, and gaps separate successive transfers. The main advantage of asynchronous communication is that no common clock is required between the sender and receiver devices, and the method is also cost-effective. Its main disadvantage is that data transmission can be slower, although this is not always the case.
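As a sketch, the start-bit/stop-bit framing used in asynchronous transfer can be modelled in a few lines of Python (the helper names here are illustrative, not part of any standard library):

```python
# Asynchronous framing sketch: one frame = 1 start bit (0),
# 8 data bits (LSB first), and 1 stop bit (1).

def frame_byte(value):
    """Return the list of line levels sent for one byte, LSB first."""
    data_bits = [(value >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]   # start bit, data, stop bit

def unframe(bits):
    """Recover the byte, checking the start and stop bits."""
    if bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))

frame = frame_byte(0x41)           # ASCII 'A'
assert unframe(frame) == 0x41
```

The start and stop bits are exactly the per-byte synchronization overhead that synchronous transmission avoids.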

Serial communication takes many forms, depending on the data transfer rate and the type of transmission mode. The transmission mode can be classified into simplex, half-duplex, and full-duplex. Each transmission mode involves a source, also known as the sender or transmitter, and a destination, also known as the receiver.


Transmission Mode

In the simplex method, data transmission can be performed in one direction only. Of the two devices, one can only transmit on the link while the other can only receive. The sender transmits the data and the receiver accepts it; the receiver cannot reply back to the sender.

There are various examples of the simplex mode. Example 1: A keyboard and CPU. The keyboard always transmits characters to the CPU (central processing unit), but the CPU does not need to transmit characters or data to the keyboard. Example 2: Printers and computers. Computers send data to printers, but printers do not send data back to computers (some printers can report status back, but such cases are exceptions). In effect, the simplex mode has only one lane.

Half Duplex

In the half-duplex mode, the sender and receiver can communicate in both directions, but not at the same time. While the sender is transmitting, the receiver can accept the data but cannot send anything back to the sender; likewise, while the receiver is transmitting, the sender cannot send. Half-duplex is used where communication in both directions at the same time is not needed. For example, a web request over the internet behaves in a half-duplex fashion: the user sends a page request to the web server, and the server then processes the request and sends the requested page back to the user.


A one-lane bridge also illustrates the half-duplex mode: two-way traffic must take turns, so at any moment one end sends while the other end only receives. Error correction is also possible: if the information received by the receiver is corrupted, it can request the sender to retransmit that information. A walkie-talkie is another classic example of half-duplex. Both ends of a walkie-talkie contain speakers, and each handset can be used either to send a message or to receive one, but cannot do both at the same time.

Full Duplex

In the full-duplex mode, the sender and the receiver can both send and receive at the same time. Full-duplex is the most widely used communication mode. In this mode, signals travelling in one direction share the capacity of the link with signals travelling in the opposite direction. This sharing can occur in two ways, described as follows:

o Either the capacity of the link is divided between the signals going in the two directions,
o or the link has two physically separate transmission paths, one used for sending and the other for receiving.

Full-duplex mode is used when communication is needed constantly in both directions; the capacity of the channel is then split between the two directions.

Examples: The telephone network is a good example of the full-duplex mode. While using the telephone, two people can talk and hear each other at the same time. An ordinary two-lane highway also illustrates full-duplex; similarly, when rail traffic is heavy, a railroad lays double track so that trains can pass in both directions. The same idea is used in networking: fiber-optic hubs have two connectors on each port, and full-duplex fiber is a pair of cables tied together, forming a two-lane roadway.


BTECH

(SEM III) THEORY EXAMINATION 2021-22

COMPUTER ORGANIZATION AND ARCHITECTURE

SECTION A

1. List and briefly define the main structural components of a computer

The main structural components of a computer are:

1. Central Processing Unit (CPU): The brain of the computer, responsible for executing instructions and
performing arithmetic and logical operations.
2. Memory (RAM): The place where data and instructions are temporarily stored for processing by the CPU.
3. Hard Disk Drive (HDD) or Solid-State Drive (SSD): A storage device used to store large amounts of data
permanently.
4. Motherboard: The main circuit board of the computer that houses and connects the other components of
the system.
5. Power Supply Unit (PSU): Converts AC power from the electrical outlet into the DC power required by
the computer components.
6. Graphics Processing Unit (GPU): A specialized processor responsible for handling graphical data and
rendering images on the display.
7. Input Devices: Devices such as the keyboard, mouse, and touchpad used to input data and commands into
the computer.
8. Output Devices: Devices such as the display screen, speakers, and printers used to output processed data.

2. Differentiate between horizontal and vertical microprogramming.

Horizontal and vertical microprogramming are two approaches to designing microcode, which is a low-level
instruction set used by a computer's control unit to execute higher-level machine language instructions.

In horizontal microprogramming, each microinstruction is a single, long bit string in which each bit position corresponds directly to one control signal. Many control signals can therefore be asserted in parallel within one microinstruction. This approach is fast and allows a high degree of parallelism, but the microinstructions are very wide, so the control memory is large.

In vertical microprogramming, the control signals are encoded into compact fields that must be decoded before the control signals are generated. The microinstructions are much shorter, so the control memory is smaller, but the extra decoding step makes it slower and allows less parallelism per microinstruction.


In summary, horizontal microprogramming offers speed and parallelism at the cost of wide control words, while vertical microprogramming offers compact control words at the cost of decoding delay and reduced parallelism.

3. Represent the following conditional control statements by two register transfer statements
with control functions. If(P=1) then (R1 <- R2) else if (Q=1) then (R1 <-R3)

The given conditional control statement can be represented by two register transfer statements with control functions as follows:

1. P: R1 <- R2

2. P'Q: R1 <- R3

Here, the first statement transfers the contents of register R2 to register R1 when P = 1, and the second transfers the contents of register R3 to register R1 when P = 0 and Q = 1 (the control function P'Q). These statements represent the "then" part and the "else if" part of the conditional control statement, respectively.

4. Design a 4-bit combinational incremental circuit using four full adder circuits.

A 4-bit combinational increment circuit can be designed using four full adder circuits as follows:

1. Connect the four input bits A0-A3 to the A inputs of the four full adders (A0 to the first full adder, A1 to the second, A2 to the third, and A3 to the fourth).
2. Tie the B input of every full adder to logic low (0).
3. Connect the carry input of the first full adder to logic high (1); connect the carry input of each remaining full adder to the carry output of the previous full adder.
4. The sum outputs of the full adders form the output bits S0-S3; the carry output of the last full adder indicates overflow (it is 1 only when the input is 1111).

This circuit takes a 4-bit binary number as input and produces its incremented value as output: the constant 1 injected through the first carry input is added to the input number.
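As a sketch, this incrementer can be simulated in Python, modelling four full adders with their B inputs tied to 0 and the first carry-in tied to 1 (function names are illustrative):

```python
# 4-bit incrementer built from four full adders.

def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def increment4(bits):
    """bits is a list [b0, b1, b2, b3], least significant bit first."""
    carry = 1                               # carry-in of the first stage is 1
    out = []
    for a in bits:
        s, carry = full_adder(a, 0, carry)  # B input tied to 0
        out.append(s)
    return out            # final carry would be the overflow indication

assert increment4([1, 1, 1, 0]) == [0, 0, 0, 1]   # 7 + 1 = 8
```

The ripple of the carry from stage to stage is exactly the wiring described in step 3 above.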

5. Differentiate between Daisy chaining and centralized parallel arbitration


Daisy chaining and centralized parallel arbitration are two methods of bus arbitration, i.e. of deciding which of several devices may use a shared bus.

In daisy-chaining arbitration, all devices share a common bus-request line, and the bus-grant signal is passed serially from one device to the next in a chain. A device that wants the bus keeps the grant and blocks it from propagating further, so the device electrically closest to the arbiter has the highest priority. This scheme is simple and needs very few control lines, but the priorities are fixed by position, the serial propagation of the grant adds delay, and a failure in one device can block all the devices behind it in the chain.

In centralized parallel arbitration, each device has its own independent bus-request and bus-grant lines connected to a central arbiter. The arbiter examines all requests simultaneously and grants the bus directly to the selected device, so arbitration is fast and the priority scheme can be flexible, but more lines and more arbiter hardware are needed.

In summary, daisy chaining is simpler but slower, with fixed priorities, while centralized parallel arbitration is faster and more flexible at the cost of extra hardware.

6. What is the transfer rate of an eight-track magnetic tape whose speed is 120 inches per
second and whose density is 1600 bits per inch?

On an eight-track tape, one byte (a frame of 8 bits) is recorded across the eight tracks at each position along the tape. A recording density of 1600 bits per inch therefore corresponds to 1600 frames, that is 1600 bytes, per inch of tape.

Transfer rate = density x speed = 1600 bytes/inch x 120 inches/second = 192,000 bytes per second.

So the transfer rate of an eight-track magnetic tape with a speed of 120 inches per second and a density of 1600 bits per inch is 192,000 bytes per second (equivalently, 8 x 192,000 = 1,536,000 bits per second).
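The arithmetic can be checked in a couple of lines, assuming the common textbook interpretation that one byte is recorded across the eight tracks at each position along the tape:

```python
# Tape transfer-rate arithmetic (frame interpretation: 8 tracks
# record one byte per position, so density is 1600 bytes/inch).
density_frames_per_inch = 1600
speed_inches_per_second = 120

bytes_per_second = density_frames_per_inch * speed_inches_per_second
assert bytes_per_second == 192_000            # 192,000 bytes per second
assert bytes_per_second * 8 == 1_536_000      # total bits per second
```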

7. Register A holds the binary values 10011101. What is the register value after arithmetic shift
right? Starting from the initial number 10011101, determine the register value after
arithmetic shift left, and state whether there is an overflow.

An arithmetic shift right shifts the bits of a binary number one position to the right while replicating the sign bit (the most significant bit) into the vacated position.

Starting with the initial number 10011101, the register value after an arithmetic shift right is 11001110: the sign bit 1 is copied into the leftmost position, and the least significant bit is shifted out.

An arithmetic shift left shifts the bits one position to the left, filling the least significant bit with a 0.

Starting with the initial number 10011101, the register value after an arithmetic shift left is 00111010.

In the shift-left case there is an overflow: the sign bit changes from 1 to 0, so the result does not represent the original value multiplied by two.
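As a sketch, 8-bit arithmetic shifts can be checked in Python (the register is modelled as an unsigned 8-bit pattern; overflow on a left shift is detected as a change of the sign bit):

```python
# 8-bit arithmetic shifts on a register modelled as an unsigned byte.

def ashr8(x):
    sign = x & 0x80
    return sign | (x >> 1)                   # shift right, replicate sign bit

def ashl8(x):
    shifted = (x << 1) & 0xFF                # shift left, fill LSB with 0
    overflow = ((x ^ shifted) & 0x80) != 0   # sign bit changed -> overflow
    return shifted, overflow

assert ashr8(0b10011101) == 0b11001110
assert ashl8(0b10011101) == (0b00111010, True)   # overflow: sign 1 -> 0
```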

8. What is an Associative memory? What are its advantages and disadvantages?

An associative memory, also called a content-addressable memory (CAM), is a type of computer memory that retrieves information based on its content rather than its physical location. The search argument is compared in parallel with every stored word, and the words that match are flagged and can then be read out.

Advantages of associative memory include its ability to perform fast searches, its ability to store and retrieve data
in an organized and flexible manner, and its ability to handle a large amount of data.

Disadvantages of associative memory include its relatively high cost compared to other types of memory, its
sensitivity to hardware failures, and its complexity in implementation. Additionally, associative memory may
require significant processing power to perform searches, and the accuracy of results may be affected by noise and
other errors in the memory system.

9. Differentiate between static RAM and Dynamic RAM.

Static RAM (SRAM) and Dynamic RAM (DRAM) are two types of random access memory (RAM) used in
computers.

Static RAM (SRAM) is a type of RAM that stores each bit of data in a separate flip-flop circuit, allowing the stored
data to persist without the need for constant refreshing like in DRAM. This results in faster access times and lower
power consumption compared to DRAM. However, SRAM is more expensive and requires more transistors to
implement than DRAM.

Dynamic RAM (DRAM) is a type of RAM that stores each bit of data as a charge in a capacitor, which must be
constantly refreshed to prevent loss of data. The constant refresh requirement results in slower access times and
higher power consumption compared to SRAM. However, DRAM is less expensive and requires fewer transistors
to implement than SRAM, making it more widely used in computer memory applications.


In summary, SRAM is faster and uses less power than DRAM, but is more expensive, while DRAM is slower and
uses more power, but is less expensive.

10. What are the different types of instruction formats?

There are several types of instruction formats used in computer architecture, including:

1. One-address format: one operand is specified explicitly; the second operand and the result are implied to be in the accumulator.
2. Two-address format: two operands are specified; one of them acts as a source and also receives the result (destination).
3. Three-address format: three operands are specified: two source operands and one destination operand for the result.
4. Zero-address format: no explicit operands are specified; the instruction operates on the top elements of a stack (stack organization).
5. Load/store format: memory is accessed only through load and store instructions; a load transfers data from memory to a register, a store transfers data from a register to memory, and all other instructions operate on registers.

Each type of instruction format has its own advantages and disadvantages, and the choice of format depends on the
specific requirements of the computer architecture and the type of operations being performed.

SECTION B
1. A digital computer has a common bus system for 8 registers of 16 bit each. The bus is constructed
using multiplexers.
i. How many select input are there in each multiplexer?
ii. What is the size of multiplexers needed?
iii. How many multiplexers are there in the bus?

I. Each multiplexer must select one of the 8 registers, so each multiplexer needs log2(8) = 3 select inputs. The same 3 select lines are shared by all the multiplexers on the bus.

II. Since each multiplexer chooses one line out of 8, 8 x 1 multiplexers are needed.

III. One multiplexer is needed per bit of the 16-bit bus: bit i of the bus is driven by a multiplexer whose 8 inputs are bit i of each of the 8 registers. Hence the bus requires 16 multiplexers of size 8 x 1.
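As a sketch, the multiplexer sizing for an n-register, k-bit common bus can be computed directly (for n = 8, k = 16, this gives 3 select lines and sixteen 8 x 1 multiplexers):

```python
import math

# Common-bus multiplexer sizing: n registers of k bits each need
# k multiplexers of size n-to-1, each with log2(n) select lines.
registers, width = 8, 16

select_lines = int(math.log2(registers))   # shared select lines
mux_size = f"{registers}x1"                # size of each multiplexer
mux_count = width                          # one multiplexer per bus bit

assert (select_lines, mux_size, mux_count) == (3, "8x1", 16)
```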

2. Explain destination-initiated transfer using handshaking method.

Destination-initiated transfer using handshaking is a method of transferring data between two units in a computer system, such as an I/O interface and a CPU, in which the destination unit initiates the transfer and the flow of data is coordinated by handshaking control signals.

The process can be described as follows:

1. The destination unit asserts its ready-for-data (request) line, indicating that it is ready to receive data.
2. The source unit responds by placing the data on the bus and asserting the data-valid line.
3. The destination unit, seeing data valid, accepts the data from the bus and then disables its ready-for-data line.
4. The source unit, seeing that the data has been accepted, removes the data from the bus and disables data valid, returning both units to their initial state, ready for the next transfer.

Because each step is acknowledged before the next one begins, this handshaking method transfers data reliably between units that may operate at different speeds: the source cannot place new data on the bus until the destination has signalled that it is ready to receive it.

3. Explain 2-bit by 2-bit Array multiplier. Draw the flowchart for divide operation of two numbers in
signed magnitude form.

A 2-bit by 2-bit array multiplier is a combinational circuit that multiplies two 2-bit binary numbers. AND gates generate the partial products (each AND gate multiplies one bit of the multiplicand by one bit of the multiplier), and an array of adders sums the partial products to produce the 4-bit result.

The flowchart for the divide operation of two numbers in signed-magnitude form involves the following steps:

1. Determine the sign of the quotient as the exclusive-OR of the signs of the dividend and the divisor.
2. Take the magnitudes (absolute values) of both numbers.
3. Check for divide overflow: if the divisor is zero, or the portion of the dividend to be divided is greater than or equal to the divisor, signal a divide-overflow error and stop.
4. Otherwise, perform restoring division on the magnitudes: for each quotient bit, shift the partial remainder and quotient left by one position and subtract the divisor from the partial remainder; if the result is negative, restore it by adding the divisor back and set the quotient bit to 0, otherwise set the quotient bit to 1.
5. Repeat step 4 once for every bit of the quotient.
6. Attach the sign computed in step 1 to the quotient; the sign of the remainder is the sign of the dividend.

Note: This is a high-level description of the divide operation and may not reflect the exact implementation details of a particular design.

4. A digital computer has a memory unit of 64K X 16 and a cache memory of 1K words. The cache uses
direct mapping with a block size of four words. I. How many bits are there in the tag, index, block,
and word fields of the address format? II. How many bits are there in each word of cache, and how
they are divided into functions? Include a valid bit. III. How many blocks can the cache
accommodate?

I. Main memory contains 64K = 2^16 words, so the CPU address is 16 bits. The cache holds 1K = 2^10 words, so the index field (the part of the address used to access the cache) is 10 bits, and the tag field is the remaining 16 - 10 = 6 bits. With a block size of four words, the low log2(4) = 2 bits of the index form the word field (the word within a block), and the remaining 10 - 2 = 8 bits form the block field:

Tag = 16 - 10 = 6 bits; Index = 10 bits, divided into Block = 10 - 2 = 8 bits and Word = log2(4) = 2 bits.

II. Each cache entry stores the 16-bit data word together with its tag and a valid bit, so each cache word is 16 + 6 + 1 = 23 bits, divided as follows:

Valid bit (1 bit): indicates whether the entry holds valid data. Tag (6 bits): identifies which block of main memory the data came from. Data (16 bits): the stored word itself.

III. The cache can accommodate a total of 1K / 4 = 256 blocks, as each block is 4 words in size.
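As a sketch, the field widths for a direct-mapped cache with these parameters can be computed mechanically:

```python
import math

# Field widths for a direct-mapped cache:
# 64K-word main memory, 1K-word cache, 4-word blocks.
address_bits = int(math.log2(64 * 1024))   # 16-bit CPU address
index_bits   = int(math.log2(1024))        # 10 bits index the cache
word_bits    = int(math.log2(4))           # 2 bits: word within a block
block_bits   = index_bits - word_bits      # 8 bits: which cache block
tag_bits     = address_bits - index_bits   # 6 bits stored with each word

assert (tag_bits, block_bits, word_bits) == (6, 8, 2)
assert 2 ** block_bits == 256              # blocks the cache can hold
```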

5. Explain with neat diagram, the address selection for control memory.

In a microprogrammed control unit, the control memory stores the microinstructions that generate the control signals for the CPU. The address of the next microinstruction to be fetched is held in the control address register (CAR), and the address selection logic determines what is loaded into the CAR each cycle.

The next address can come from one of several sources, chosen by a multiplexer:

1. Incrementer: for sequential execution, the current content of the CAR is incremented to point to the next microinstruction.
2. Branch address field: the address field of the current microinstruction supplies the target address for conditional or unconditional branches, selected according to the status (condition) bits.
3. Mapping logic: at the start of each machine instruction, the opcode of the fetched instruction is mapped to the address of the microroutine that executes it.
4. Subroutine register (SBR): when a microsubroutine is called, the return address (CAR + 1) is saved in the SBR and is restored from it on return.

In summary, the address selection for control memory multiplexes among the incremented CAR value, a branch address, a mapped opcode address, and a subroutine return address, and loads the selected address into the CAR before each control-memory access.

SECTION C
3.a. A binary floating-point number has seven bits for a biased exponent. The constant used for the bias
is 64. I. List the biased representation of all exponents from -64 to +63. II. Show that after addition of
two biased exponents, it is necessary to subtract 64 in order to have a biased exponent’s sum. III. Show
that after subtraction of two biased exponents, it is necessary to add 64 in order to have a biased
exponent’s difference

I. The biased representation of all exponents from -64 to +63 is as follows:

-64: 0000000, -63: 0000001, -62: 0000010, ..., 0: 1000000, ..., +62: 1111110, +63: 1111111

II. A biased exponent stores E' = E + 64. When two biased exponents are added, the result is (E1 + 64) + (E2 + 64) = (E1 + E2) + 128, which contains the bias twice; subtracting 64 gives (E1 + E2) + 64, the correct biased representation of the sum.

Example: E1 = +1 is stored as 1000001 (65) and E2 = +32 is stored as 1100000 (96). Adding the stored values gives 65 + 96 = 161; subtracting 64 gives 97 = 1100001, which is exactly the biased representation of E1 + E2 = 33.

III. When two biased exponents are subtracted, the result is (E1 + 64) - (E2 + 64) = E1 - E2, which contains no bias at all; adding 64 gives (E1 - E2) + 64, the correct biased representation of the difference.

Example: E1 = +32 is stored as 1100000 (96) and E2 = +1 is stored as 1000001 (65). Subtracting the stored values gives 96 - 65 = 31; adding 64 gives 95 = 1011111, which is exactly the biased representation of E1 - E2 = 31.
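The bias bookkeeping can be verified with a short Python sketch (biased value = true exponent + 64):

```python
# Bias-64 exponent arithmetic: adding biased exponents double-counts
# the bias, subtracting them cancels it entirely.
BIAS = 64

def biased(e):
    """Return the bias-64 stored representation of true exponent e."""
    return e + BIAS

e1, e2 = 1, 32
# After addition, subtract the bias once to get a valid biased sum:
assert biased(e1) + biased(e2) - BIAS == biased(e1 + e2)
# After subtraction, add the bias back to get a valid biased difference:
assert biased(e2) - biased(e1) + BIAS == biased(e2 - e1)
```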

3.b. Show the multiplication process using the Booth algorithm, when the following binary numbers, (+13) x
(-15) are multiplied.

Booth's algorithm multiplies signed binary numbers in 2's-complement form. It examines the pair of bits (Q0, Q-1), where Q0 is the least significant bit of the multiplier register Q and Q-1 is the bit shifted out previously, to decide whether to add the multiplicand to A, subtract it from A, or do nothing; each cycle ends with an arithmetic right shift of the combined register (A, Q, Q-1).

Using 5-bit operands:

Multiplicand M = +13 = 01101, so -M = 10011 (2's complement)
Multiplier Q = -15 = 10001, with A = 00000 and Q-1 = 0 initially; 5 cycles are needed.

Cycle 1: Q0 Q-1 = 10 -> A <- A - M = 10011; after the shift: A = 11001, Q = 11000, Q-1 = 1
Cycle 2: Q0 Q-1 = 01 -> A <- A + M = 00110; after the shift: A = 00011, Q = 01100, Q-1 = 0
Cycle 3: Q0 Q-1 = 00 -> no operation; after the shift: A = 00001, Q = 10110, Q-1 = 0
Cycle 4: Q0 Q-1 = 00 -> no operation; after the shift: A = 00000, Q = 11011, Q-1 = 0
Cycle 5: Q0 Q-1 = 10 -> A <- A - M = 10011; after the shift: A = 11001, Q = 11101, Q-1 = 1

The product is the 10-bit register pair A:Q = 11001 11101 = 1100111101, which in 2's-complement form equals -195, the expected result of (+13) x (-15).
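A small Python implementation of Booth's algorithm (a sketch with illustrative names, using n-bit operands and arithmetic right shifts on the combined A, Q, Q-1 register) reproduces this result:

```python
# Booth multiplier sketch: n-bit operands, 2n-bit signed product.

def booth_multiply(m, q, n=5):
    mask = (1 << n) - 1
    M, Q = m & mask, q & mask        # operands in n-bit 2's complement
    A, Q_1 = 0, 0
    for _ in range(n):
        pair = (Q & 1, Q_1)
        if pair == (1, 0):
            A = (A - M) & mask       # A <- A - M
        elif pair == (0, 1):
            A = (A + M) & mask       # A <- A + M
        # arithmetic right shift of the combined A : Q : Q-1 register
        Q_1 = Q & 1
        Q = ((Q >> 1) | ((A & 1) << (n - 1))) & mask
        A = (A >> 1) | (A & (1 << (n - 1)))   # replicate sign bit of A
    product = (A << n) | Q
    if product & (1 << (2 * n - 1)):          # interpret result as signed
        product -= 1 << (2 * n)
    return product

assert booth_multiply(13, -15) == -195
```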

4.a. Draw a diagram of a Bus system in which it uses 3 state buffers and a decoder instead of the multiplexers

A bus system using 3-state buffers and a decoder is a type of bus architecture where the data is transferred between
different components of a computer. The 3-state buffers are used to control the flow of data on the bus, and the
decoder is used to determine which component should receive the data.

In this architecture, each register or component output is connected to the bus through three-state buffers, and the decoder determines which set of buffers is enabled. A three-state buffer has three possible output states: logic 0, logic 1, and high impedance. When its enable input is inactive, the buffer is in the high-impedance state and is effectively disconnected from the bus, driving no signal onto it; when enabled, it drives its input value (0 or 1) onto the bus.

The decoder takes the select lines as input and activates exactly one of its outputs at a time, each output enabling the buffers of one source; the selected source then drives its data onto the bus. This way, the bus system ensures that only one component can drive the data onto the bus at any given time, preventing any conflicts or collisions.


This type of bus system is often used in computer systems that require high-speed data transfer and efficient use of
the bus resources.

4.b. Explain in detail multiple bus organization with the help of a diagram.

Multiple bus organization is a computer architecture that uses multiple independent buses to transfer data between
the different components of a computer system. This type of architecture is designed to improve the performance
of the computer system by reducing the number of bottlenecks that can limit the overall speed of the system.

Each bus in a multiple bus organization is dedicated to a specific task, such as data transfer, memory access, or
input/output operations. For example, a computer system might have a data bus for transferring data between the
CPU and the main memory, a memory bus for accessing the main memory, and an I/O bus for transferring data
between the CPU and the I/O devices.

The different buses are connected to each other and to the other components of the computer system through a set
of bridges, which act as intermediaries between the different buses. These bridges help to manage the flow of data
between the different buses and ensure that data is transferred efficiently and accurately.


Advantages of multiple bus organization include improved performance and scalability, as well as the ability to
handle different types of data transfer with different bus speeds and widths. However, multiple bus organization can
also be complex and difficult to manage, and can lead to additional latency and overhead compared to a single bus
organization.

In general, multiple bus organization is a useful design choice for computer systems that require high-speed data
transfer and efficient use of system resources. However, careful consideration of the trade-offs involved is necessary
to determine whether multiple bus organization is the best choice for a particular computer system.

5.a. The logical address space in a computer system consists of 128 segments. Each segment can have up to
32 pages of 4K words each. Physical memory consists of 4K blocks of 4K words each. Formulate the logical
and physical address formats.

The logical address space in this computer system consists of 128 segments, each with up to 32 pages of 4K words. The logical address therefore has three fields: a segment number, a page number, and a word (offset) within the page. With 128 = 2^7 segments, 7 bits are needed for the segment number; with 32 = 2^5 pages per segment, 5 bits are needed for the page number; and with 4K = 2^12 words per page, 12 bits are needed for the word field. Therefore, the logical address format can be represented as follows:

Segment number (7 bits) + Page number (5 bits) + Word (12 bits) = 24 bits

The physical address space consists of 4K blocks of 4K words each. The physical address format can be divided
into two parts: block number and word number. With 4K blocks, 12 bits are needed to represent the block number.
And with 4K words per block, 12 bits are needed to represent the word number. Therefore, the physical address
format can be represented as follows:

Block number (12 bits) + Word number (12 bits) = 24 bits
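As a quick check of these field widths, the calculation can be sketched in Python (the helper name `bits_needed` is illustrative; it uses the fact that indexing 2^n items requires n address bits, and includes the 12-bit word-within-page offset needed to address an individual word):

```python
import math

def bits_needed(n_items):
    """Address bits required to index n_items distinct items (n_items a power of 2)."""
    return int(math.log2(n_items))

segment_bits = bits_needed(128)    # 128 segments        -> 7 bits
page_bits    = bits_needed(32)     # 32 pages/segment    -> 5 bits
word_bits    = bits_needed(4096)   # 4K words/page       -> 12 bits
block_bits   = bits_needed(4096)   # 4K physical blocks  -> 12 bits

logical_width  = segment_bits + page_bits + word_bits  # full logical address
physical_width = block_bits + word_bits                # full physical address
```

Both the logical and the physical address come out 24 bits wide.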

5.b. How is the Virtual address mapped into physical address? What are the different methods of writing
into cache?

Virtual address to physical address mapping is achieved through the use of a page table. The page table is a data
structure stored in memory that maps virtual page numbers to physical page frames. The operating system maintains
the page table and updates it whenever pages are swapped in or out of physical memory.
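A minimal sketch of this lookup, assuming a single-level page table and 4K-word pages (the function name `translate` and the dictionary-based table are illustrative, not from the text):

```python
PAGE_SIZE = 4096  # assumed: 4K words per page

def translate(virtual_addr, page_table):
    """Map a virtual address to a physical address through a page table.
    page_table maps virtual page numbers to physical frame numbers."""
    vpn = virtual_addr // PAGE_SIZE      # virtual page number
    offset = virtual_addr % PAGE_SIZE    # offset within the page (unchanged)
    frame = page_table[vpn]              # lookup; a missing entry would be a page fault
    return frame * PAGE_SIZE + offset
```

The page number is replaced by the frame number while the offset passes through untouched.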

There are three main methods of writing into cache:

1. Write-through cache: In this method, when a write operation is performed, the data is written to both the
cache and main memory. This ensures that the contents of the cache are always in sync with main memory.

The disadvantage of this method is that it increases the number of write operations to main memory,
slowing down the overall performance of the system.
2. Write-back cache: In this method, when a write operation is performed, the data is only written to the cache.
The main memory is updated later when the cache block containing the modified data is evicted from the
cache. This method reduces the number of write operations to main memory and improves performance,
but it also increases the complexity of cache management.
3. Write-allocate (on write miss): Strictly speaking, write-allocate is a write-miss policy rather than a third write method. When a write misses the cache, the block containing the target word is first loaded from main memory into the cache, and the write is then performed on the cached copy. Write-allocate is usually paired with write-back, whereas write-through caches typically use no-write-allocate, writing the data directly to main memory without loading the block.

Note: The choice of cache write method depends on various factors, including the hardware and software design of
the system, the workload, and the desired performance characteristics.
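The contrast between the first two policies can be sketched with a toy one-word-per-line cache (the class and method names are assumptions for illustration, not a real cache design):

```python
class ToyCache:
    """Toy cache contrasting the write-through and write-back policies."""

    def __init__(self, policy, memory):
        self.policy = policy   # "write-through" or "write-back"
        self.memory = memory   # backing store: dict of addr -> value
        self.lines = {}        # cached lines: addr -> (value, dirty)

    def write(self, addr, value):
        if self.policy == "write-through":
            self.lines[addr] = (value, False)
            self.memory[addr] = value          # main memory updated at once
        else:  # write-back
            self.lines[addr] = (value, True)   # only the cache is updated for now

    def evict(self, addr):
        value, dirty = self.lines.pop(addr)
        if dirty:                              # write-back: flush on eviction
            self.memory[addr] = value
```

With write-through, memory is in sync immediately after every write; with write-back, memory only catches up when the dirty line is evicted.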

6.a. Explain how the computer buses can be used to communicate with memory and I/O. Also draw the block
diagram for CPU-IOP communication.

Computer buses are used to facilitate communication between different components of the computer, such as the
CPU, memory, and I/O devices. The bus acts as a communication channel that transfers data, control signals, and
addresses between these components.

There are three types of buses in a computer system: data bus, address bus, and control bus. The data bus is used to
transfer data between the CPU and memory or I/O devices. The address bus is used to specify the location of the
data being transferred. The control bus is used to carry control signals, such as read/write signals and interrupt
signals, between the CPU and other components.

For example, when the CPU wants to read data from memory, it sends an address on the address bus to specify the
location of the data. It also sends a read signal on the control bus to indicate that it wants to read the data. The
memory responds by sending the data on the data bus back to the CPU.

Similarly, when the CPU wants to write data to an I/O device, it sends the data on the data bus and the address of
the I/O device on the address bus. It also sends a write signal on the control bus to indicate that it wants to write the
data. The I/O device responds by storing the data.

In summary, the computer buses provide a flexible and efficient way for the CPU to communicate with memory
and I/O devices and to transfer data, addresses, and control signals between these components.
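The read transaction described above can be modelled with a simplified bus record (the class names and the READ/WRITE control codes are assumptions for illustration, not a real bus protocol):

```python
class Bus:
    """Shared lines connecting the CPU to memory and I/O."""
    def __init__(self):
        self.address = None   # address lines
        self.data = None      # data lines
        self.control = None   # control lines: "READ" or "WRITE"

class Memory:
    def __init__(self, contents):
        self.contents = contents  # dict of addr -> value

    def respond(self, bus):
        if bus.control == "READ":
            bus.data = self.contents[bus.address]   # memory drives the data bus
        elif bus.control == "WRITE":
            self.contents[bus.address] = bus.data   # memory latches the data bus

def cpu_read(bus, memory, addr):
    bus.address = addr       # CPU places the address on the address bus
    bus.control = "READ"     # CPU asserts the read control signal
    memory.respond(bus)      # memory answers on the data bus
    return bus.data
```

The same three groups of lines serve a write: the CPU drives both address and data, asserts WRITE, and the addressed device latches the value.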

6.b. What are the different methods of asynchronous data transfer? Explain in detail.

Asynchronous data transfer is a method of transmitting data between two devices that do not share a common clock. Because the receiver cannot know in advance exactly when data will arrive, each transmitted unit carries its own synchronizing information (for example, start and stop bits). The following are common methods used in asynchronous data transfer:

1. Start-stop method: In this method, a start bit and a stop bit are added to the beginning and end of each data
word respectively. The receiver waits for the start bit and then starts to receive the data bits. When the stop
bit is detected, it indicates the end of the data word.
2. Bit stuffing method: In this method, the sender inserts a 0 bit immediately after every run of five consecutive 1s in the data. The receiver recognizes that the bit following five consecutive 1s is a stuffed 0 and removes it before processing the data. This prevents ordinary data from being mistaken for special flag patterns such as 01111110.
3. Byte stuffing method: This method is similar to the bit stuffing method, but instead of bits, extra bytes are
added to the data. The receiver recognizes the extra bytes and removes them before processing the data.
4. Synchronous start-stop method: In this method, a sync word is added to the beginning of each data word,
and the receiver uses the sync word to synchronize the data. This method is useful in situations where the
data is transmitted over noisy channels.

Each of these methods has its advantages and disadvantages, and the method used depends on the requirements of
the system and the type of data being transmitted.
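The stuffing rule is easy to express directly; here is a sketch of the sender and receiver sides of bit stuffing (the function names are illustrative):

```python
def bit_stuff(bits):
    """Sender side: insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            out.append(0)   # stuffed bit
            run = 0
    return out

def bit_unstuff(bits):
    """Receiver side: remove the 0 that follows every run of five 1s."""
    out, run, skip = [], 0, False
    for b in bits:
        if skip:            # this bit is the stuffed 0; drop it
            skip = False
            run = 0
            continue
        out.append(b)
        run = run + 1 if b == 1 else 0
        if run == 5:
            skip = True
    return out
```

Unstuffing the stuffed stream recovers the original data exactly.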

7.a. Write a program to evaluate arithmetic expression using stack organized computer with 0-address
instructions. X = (A-B) * (((C - D * E) / F) / G)

To evaluate X = (A - B) * (((C - D * E) / F) / G) on a stack-organized computer with 0-address instructions, first convert the expression to postfix (reverse Polish) form:

A B - C D E * - F / G / *

The corresponding 0-address program is:

PUSH A
PUSH B
SUB        ; top of stack = A - B
PUSH C
PUSH D
PUSH E
MUL        ; top = D * E
SUB        ; top = C - D * E
PUSH F
DIV        ; top = (C - D * E) / F
PUSH G
DIV        ; top = ((C - D * E) / F) / G
MUL        ; top = (A - B) * (((C - D * E) / F) / G)
POP X      ; store the result in X

Note: The specific 0-address instructions and the order of operations will vary based on the
implementation of the stack organized computer.
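The evaluation above can be checked on a tiny simulated stack machine (the mnemonics PUSH/SUB/MUL/DIV/POP are assumed; an actual 0-address ISA may name them differently):

```python
def run(program, env):
    """Execute a list of 0-address instructions against variable bindings in env."""
    stack, result = [], {}
    for op, *arg in program:
        if op == "PUSH":
            stack.append(env[arg[0]])
        elif op == "POP":
            result[arg[0]] = stack.pop()
        else:  # binary ALU operation on the top two stack elements
            b, a = stack.pop(), stack.pop()
            stack.append({"ADD": a + b, "SUB": a - b,
                          "MUL": a * b, "DIV": a / b}[op])
    return result

# X = (A - B) * (((C - D * E) / F) / G) in postfix: A B - C D E * - F / G / *
program = [("PUSH", "A"), ("PUSH", "B"), ("SUB",),
           ("PUSH", "C"), ("PUSH", "D"), ("PUSH", "E"), ("MUL",), ("SUB",),
           ("PUSH", "F"), ("DIV",), ("PUSH", "G"), ("DIV",),
           ("MUL",), ("POP", "X")]
```

Running the program with sample values (for instance A=10, B=4, C=20, D=3, E=2, F=7, G=2) produces the expected result of the infix expression.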

7.b. List the differences between hardwired and micro programmed control in tabular format. Write the
sequence of control steps for the following instruction for single bus architecture. R1 <- R2 * (R3)

Differences between Hardwired and Microprogrammed control:

Hardwired control                          | Microprogrammed control
-------------------------------------------|---------------------------------------------
Control signals come from fixed logic      | Control signals come from microinstructions
Difficult to modify once built             | Easy to modify (rewrite the microprogram)
Fast execution                             | Slower (each step reads control memory)
Complex, ad hoc hardware design            | Simpler, systematic design
Used in RISC / high-speed processors       | Used in CISC / general-purpose processors

Sequence of control steps for the instruction R1 <- R2 * (R3) in single bus architecture:

Here (R3) denotes register-indirect addressing: the memory operand is at the address held in R3, and the product is left in register R1, so no memory write-back is needed. Assuming the standard single-bus processor organization (registers PC, IR, MAR, MDR, Y, Z), the control steps are:

1. PCout, MARin, Read, Select4, Add, Zin    (send PC to memory, start fetch, compute PC + 4)
2. Zout, PCin, Yin, WMFC                    (update PC, wait for memory)
3. MDRout, IRin                             (load the fetched instruction into IR; decode)
4. R3out, MARin, Read                       (use the contents of R3 as the operand address)
5. R2out, Yin, WMFC                         (place R2 at ALU input Y, wait for the operand)
6. MDRout, SelectY, Multiply, Zin           (multiply the memory operand by R2)
7. Zout, R1in, End                          (store the product in R1)
