Computer organization and architecture
By :
Andargie M.
Unit one
Introduction to Computer Organization and Architecture
Computer Organization and Architecture
• Computer Architecture
– The attributes of the system that are visible to a programmer and that have a direct impact on the execution of a program, for example:
o Instruction sets
o Data representation (number of bits used to represent data)
o Input/Output mechanisms
o Memory addressing techniques
• Computer Organization
– The operational units and their interconnections that realize the architectural specifications, that is, the hardware attributes that are transparent to (hidden from) the programmer, for example:
o Control signals
o Interfaces between the computer and peripherals
o Memory technology
Structure and Function
• Program execution halts only if the machine is turned off, some sort of unrecoverable error
occurs, or a program instruction that halts the computer is encountered.
Instruction Fetch and Execute
• The processor fetches an instruction from memory – program counter (PC) register holds the
address of the instruction to be fetched next
• The processor increments the PC after each instruction fetch so that it will fetch the next
instruction in the sequence – unless told otherwise
• The fetched instruction is loaded into the instruction register (IR) in the processor – the
instruction contains bits that specify the action the processor will take.
• The processor interprets the instruction and performs the required action. In general, these
actions fall into four categories:
• Processor-memory
– Data transferred from the processor to memory or from memory to the processor
• Processor-I/O
– Data transferred to or from a peripheral device by transferring between the processor and an
I/O module
• Data processing
– The processor performs some arithmetic or logic operation on data
• Control
– An instruction may specify that the sequence of execution be altered.
The figure below illustrates a partial program execution, showing
the relevant portions of memory and processor registers.
The program fragment adds the contents of the memory words at addresses 940 and 941
and stores the result at address 941.
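The fetch-execute steps listed above can be traced with a small simulation. This is only a sketch: the 16-bit instruction layout (4-bit opcode, 12-bit address), the opcode values (1 = load AC, 2 = store AC, 5 = add to AC), the program location 300 and the data values 3 and 2 are assumptions chosen for illustration and are not given in these notes. Addresses are written in hexadecimal.

```python
def run(memory, pc, steps):
    """Run `steps` fetch-execute cycles and return the final (AC, PC)."""
    ac = 0
    for _ in range(steps):
        ir = memory[pc]                  # fetch: PC holds the address of the next instruction
        pc += 1                          # PC is incremented after each fetch
        opcode, address = ir >> 12, ir & 0xFFF   # decode the assumed 4 + 12 bit layout
        if opcode == 0x1:                # assumed opcode: load AC from memory
            ac = memory[address]
        elif opcode == 0x2:              # assumed opcode: store AC to memory
            memory[address] = ac
        elif opcode == 0x5:              # assumed opcode: add memory word to AC
            ac = (ac + memory[address]) & 0xFFFF
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")
    return ac, pc


# Hypothetical memory image: three instructions at 300..302, data at 940 and 941.
memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 0x0003, 0x941: 0x0002}
ac, pc = run(memory, 0x300, 3)
print(f"AC={ac:04X} PC={pc:03X} M[941]={memory[0x941]:04X}")
```

With these assumed values, three cycles load 3 into AC, add 2, and store 5 back at address 941.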
Instruction Fetch and Execute
• More detail of instruction cycle
Interconnection Structures
• Multiplexed
o Single shared bus for both addresses and data
o The bus is first used to specify an address
o The bus is then used to transfer the data
• Dedicated
o Separate buses for addresses and data
o The address is placed on the address bus and
remains there while the data is transferred
o The data is transferred on the data bus
PCI (Peripheral Component Interconnect)
• PCI is a 32-bit bus that extends the processor’s own local bus and can be expanded to 64 bits
when the need arises.
• The PCI bus system can support 10 devices, because PCI devices do not electrically load down
the CPU bus.
• The PCI bus can transfer data at a peak rate of about 132 MB per second at 33 MHz
(4 bytes per transfer × 33 million transfers per second).
• The PCI bus is a high-performance connection between the motherboard components and the
expansion boards of a system.
• A bridge chip between the processor and the PCI bus connects the PCI bus to the
processor’s local bus. This allows PCI peripherals to be connected directly to the PCI bus.
• Once a host bridge is included in the system, the processor can access all available PCI
peripherals. This makes the PCI bus standard processor independent: when a new processor is
to be used, only the bridge chip needs to be replaced and the rest of the system remains unchanged.
• The PCI bus employs a 124-pin, Micro Channel style connector (188 pins for a 64-bit system).
• The PCI specification covers two types of connection: a 5 V system and a 3.3 V low-power system.
• The PCI design is able to support future generations of peripherals.
USB – Universal Serial Bus
INFORMATION REPRESENTATION
Computer information representation
• Inside the computer, there are integrated circuits with
thousands of transistors.
• These transistors are made to operate as two-state devices. By this
design, all the input and output voltages are either HIGH or
LOW.
• Low voltage represents binary 0 and high voltage represents
binary 1.
• In a computer, data is represented in binary format (1s and
0s).
• Even though we use characters, decimal numbers, punctuation marks,
symbols and graphs, internally these things are represented in
binary format.
Introduction To Number System
Decimal   Binary   Octal   Hexadecimal
0         0000     0       0
1         0001     1       1
2         0010     2       2
3         0011     3       3
4         0100     4       4
5         0101     5       5
6         0110     6       6
7         0111     7       7
8         1000     10      8
9         1001     11      9
10        1010     12      A
11        1011     13      B
12        1100     14      C
13        1101     15      D
14        1110     16      E
15        1111     17      F
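The correspondence between the number systems can be checked with Python's built-in `format` function. The helper name `number_system_row` is only an illustrative choice.

```python
def number_system_row(n):
    """Return n written in decimal, 4-bit binary, octal and hexadecimal."""
    return (str(n), format(n, "04b"), format(n, "o"), format(n, "X"))


# Print one row per value from 0 to 15.
for n in range(16):
    print("{:>7} {:>6} {:>5} {:>11}".format(*number_system_row(n)))
```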
Floating Point Representation
• A floating-point binary number is represented as a fraction (mantissa) and an exponent, in
the same manner as a decimal floating-point number, except that it uses base 2 for the exponent.
• Example: the binary number +1001.11 is represented with an 8-bit fraction
and a 6-bit exponent as follows:
Fraction    Exponent
01001110    000100
• The fraction has a 0 in the leftmost position to denote positive.
• The exponent has the equivalent binary number +4. The floating-point number is equivalent to
$m \times 2^{e} = +(.1001110)_2 \times 2^{+4}$
• A floating-point number is said to be normalized if the most significant digit of
the mantissa is nonzero.
• Normalized numbers provide the maximum possible precision for the floating-point
number.
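The example above can be verified numerically. The sketch below assumes that both the fraction and the exponent are stored in sign-magnitude form (leftmost bit is the sign), which matches the example but is an assumption about the register format. The function name `decode_float` is hypothetical.

```python
def decode_float(fraction_bits, exponent_bits):
    """Decode a sign-magnitude fraction and exponent into a Python float."""
    sign = -1 if fraction_bits[0] == "1" else 1
    # Bits after the sign have weights 2^-1, 2^-2, ... (binary point at the left).
    mantissa = sum(int(b) * 2 ** -(i + 1)
                   for i, b in enumerate(fraction_bits[1:]))
    exp_sign = -1 if exponent_bits[0] == "1" else 1
    exponent = exp_sign * int(exponent_bits[1:], 2)
    return sign * mantissa * 2 ** exponent


value = decode_float("01001110", "000100")
print(value)   # +1001.11 in binary is 9.75 in decimal
```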
Floating Point Representation
• A floating-point number has the form $m \times r^{e}$, where m is the mantissa, r is the radix
(base) and e is the exponent.
• While storing numbers, the leading digit of the mantissa is always made nonzero by
appropriately shifting it and adjusting the value of the exponent, e.g.
0.004567 → $0.4567 \times 10^{-2}$ → 0.4567E-2
• This shifting of the mantissa to the left until its most significant digit is nonzero is called
normalization.
• Only the mantissa m and the exponent e are physically represented in the register (including
their signs). A floating-point binary number is represented in a similar manner except that it
uses base 2 for the exponent.
Fixed Point Representation
• When two numbers of n digits each are added and the sum occupies n+1 digits, we say
that an overflow occurred.
• An overflow is a problem in digital computers because the width of registers is finite.
• A result that contains n+1 bits cannot be accommodated in a register with a standard
length of n bits.
• For this reason, many computers detect the occurrence of an overflow, and when it
occurs, a corresponding flip-flop is set which can then be checked by the user.
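• An overflow may occur only if the two numbers added are both positive or both negative.
• Example: two signed binary numbers, +70 and +80, are stored in two 8-bit registers.
An 8-bit register holds values from -128 to +127, so neither +150 nor -150 fits.
In each column, the two carries shown are the carry out of the sign bit (left) and the
carry into the sign bit (right); an overflow is indicated when the two differ.
carries:  0 1                carries:  1 0
 +70      0 1000110           -70      1 0111010
 +80      0 1010000           -80      1 0110000
+150      1 0010110          -150      0 1101010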
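The overflow example can be reproduced in a few lines. The sketch assumes 8-bit two's complement registers, as in the example, and detects overflow by comparing the carry into the sign bit with the carry out of it. The function name is illustrative.

```python
def add_signed_8bit(a, b):
    """Add two signed values in 8-bit two's complement.

    Returns (result bits, carry out of sign bit, carry into sign bit, overflow).
    """
    ua, ub = a & 0xFF, b & 0xFF                      # 8-bit register contents
    carry_into_sign = ((ua & 0x7F) + (ub & 0x7F)) >> 7
    carry_out_of_sign = (ua + ub) >> 8
    result = (ua + ub) & 0xFF                        # keep only the 8 register bits
    overflow = carry_into_sign != carry_out_of_sign
    return format(result, "08b"), carry_out_of_sign, carry_into_sign, overflow


print(add_signed_8bit(70, 80))     # both positive
print(add_signed_8bit(-70, -80))   # both negative
print(add_signed_8bit(70, -80))    # mixed signs never overflow
```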
Representation Of Numbers, Operand Of Code And Address
• Gray code: this is an unweighted code, meaning that no specific weights are
assigned to the bit positions. The code is not suitable for arithmetic
operations, but it is very useful for input-output devices, analog-to-digital
conversion, and machines that use shaft encoders as sensors.
• Binary-to-Gray conversion: the leftmost binary digit is copied as the first Gray code
digit. Then, moving from left to right, add each adjacent pair of binary digits to get the
next Gray code digit, neglecting any carry.
• Example: binary 1010 = ? in Gray code
  (1) (2) (3) (4)
   1   0   1   0
• The first Gray digit is (1) = 1. Add (1) + (2): 1 + 0 = 1; add (2) + (3): 0 + 1 = 1;
add (3) + (4): 1 + 0 = 1. The result is the Gray code 1111.
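The conversion rule in the note can be written directly as code. The sketch below follows the same left-to-right procedure (add adjacent digits and discard the carry); the function name `binary_to_gray` is an illustrative choice.

```python
def binary_to_gray(bits):
    """Convert a binary string to its Gray code string."""
    gray = bits[0]                                # leftmost digit is copied unchanged
    for left, right in zip(bits, bits[1:]):
        # Adding two bits and neglecting the carry is addition modulo 2.
        gray += str((int(left) + int(right)) % 2)
    return gray


print(binary_to_gray("1010"))
```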
[Table: comparison of decimal, binary and 4-bit Gray codes]
A 1010 0001 A1
B 1010 0010 A2
C 1010 0011 A3
… … … …
Instruction Formats And Types
• A computer will usually have a variety of instruction code formats. It is the function
of the control unit within the CPU to interpret each instruction code and provide the
necessary control functions needed to process the instruction.
• The bits of the instruction are divided into groups called fields. The most common
fields found in instruction formats are:
– An operation code field that specifies the operation to be performed
– An address field that designates a memory address or a processor register
– A mode field that specifies the way the operand or the effective address is
determined
• Operations specified by computer instructions are executed on some data stored in
memory or processor registers.
• Operands residing in memory are specified by their memory address.
• Operands residing in processor registers are specified with a register address.
• A register address is a binary number of k bits that defines one of $2^k$ registers in the
CPU. Thus a CPU with 16 processor registers R0 through R15 will have a register
address field of 4 bits.
Instruction Formats And Types
• Most computers fall into one of three types of CPU
organization:
– Single accumulator organization
– General register organization
– Stack organization
Single accumulator organization
Immediate addressing
• Algorithm: operand = A
– The operand is contained in the instruction itself
Direct addressing
• Algorithm: EA = A
– The address field of the instruction contains the effective address of the operand
– No calculations are required
– The address is a constant at run time, but the data itself can be changed during program
execution
• Advantage: only one memory access is required to fetch the operand
• Disadvantage: the address range is limited by the width of the field that contains the
address reference
Indirect addressing
• Register indirect: like indirect addressing, but the address field
specifies a register that contains the effective address
• Algorithm: EA = (R)
– Advantage: large address space
– Disadvantage: extra memory reference
Displacement or address relative addressing
• Algorithm: EA = A + (R)
– Two address fields in the instruction are used
» One is an explicit address reference (A)
» The other is a register reference (R)
• Relative addressing:
» A is added to the program counter contents to cause a
branch operation in fetching the next instruction
• Base-register addressing:
» A is a displacement added to the contents of the
referenced “base register” to form the EA
» Used by programmers and the O/S to identify the start of
user areas, segments, etc. and provide accesses within
them
• Indexing:
» Indexing is used within programs for accessing data
structures
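The effective-address rules for these addressing modes can be summarized in one small function. This is a sketch only: the mode names, the register file modeled as a dictionary and the sample values are assumptions made for illustration.

```python
def effective_address(mode, a=0, registers=None, reg=None, pc=0):
    """Return the effective address (EA) for a few addressing modes."""
    registers = registers or {}
    if mode == "direct":              # EA = A
        return a
    if mode == "register_indirect":   # EA = (R)
        return registers[reg]
    if mode == "displacement":        # EA = A + (R), e.g. base-register or indexing
        return a + registers[reg]
    if mode == "relative":            # EA = A + (PC)
        return a + pc
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"R1": 0x200}                  # hypothetical register contents
print(hex(effective_address("direct", a=0x940)))
print(hex(effective_address("register_indirect", registers=regs, reg="R1")))
print(hex(effective_address("displacement", a=0x10, registers=regs, reg="R1")))
print(hex(effective_address("relative", a=0x08, pc=0x301)))
```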
STACK
Timing cycle
• The timing for all registers in the basic computer is
controlled by a master clock generator.
• The clock pulses are applied to all flip-flops and
registers in the system.
• The clock pulses do not change the state of a register
unless the register is enabled by a control signal.
• The control signals are generated in the control unit
and provide control inputs for the multiplexers in the
common bus, control inputs in processor registers
and micro-operations for the accumulator.
Timing cycle
• There are two major types of control
organization:
• Hardwired organization: the control logic is
implemented with gates, flip-flops, decoders
and other digital circuits.
• Microprogrammed organization: the control
information is stored in a control memory. The
control memory is programmed to initiate the
required sequence of micro-operations.
Timing cycle
• The block diagram of the control unit consists of two decoders, a
sequence counter and a number of control logic gates.
• An instruction read from memory is placed in the instruction register (IR).
• IR is divided into three parts: bit 15 (the I bit), the operation code
(bits 12 through 14) and bits 0 through 11.
• The operation code in bits 12 through 14 is decoded with a 3 x 8
decoder.
• The eight outputs of the decoder are designated by the symbols D0
through D7.
• Bit 15 of the instruction is transferred to a flip-flop designated by the
symbol I.
• Bits 0 through 11 are applied to the control logic gates.
• The 4-bit sequence counter can count in binary from 0 through 15.
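The way IR is split into its three parts can be sketched with shifts and masks. The function name `decode_instruction` and the sample instruction word are hypothetical.

```python
def decode_instruction(ir):
    """Split a 16-bit instruction into the I bit, opcode, address and D0..D7."""
    i_bit = (ir >> 15) & 0x1          # bit 15 goes to flip-flop I
    opcode = (ir >> 12) & 0x7         # bits 12 through 14
    address = ir & 0xFFF              # bits 0 through 11
    # 3 x 8 decoder: exactly one of D0..D7 is active for a given opcode.
    d = [1 if k == opcode else 0 for k in range(8)]
    return i_bit, opcode, address, d


i_bit, opcode, address, d = decode_instruction(0xB123)   # sample word
print(i_bit, opcode, hex(address), d)
```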
Timing cycle
• The outputs of the counter are decoded into 16 timing signals T0
through T15.
• The sequence counter SC can be incremented or cleared
synchronously.
• Most of the time, the counter is incremented to provide the
sequence of timing signals out of the 4 x 16 decoder.
• Once in a while, the counter is cleared to 0, causing the next active
timing signal to be T0.
• As an example consider the case where SC is incremented to provide
timing signals T0,T1,T2,T3 and T4 in sequence.
• At time T4, SC is cleared to 0 if decoder output D3 is active. This is
expressed symbolically by the statement
$D_3T_4: SC \leftarrow 0$
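The behaviour of the sequence counter in this example can be simulated as follows. The sketch models only the single clear condition from the example (D3 active at T4); the function name and arguments are illustrative.

```python
def timing_sequence(clock_pulses, d3_active):
    """Return the active timing signal for each clock pulse."""
    sc = 0                                # 4-bit sequence counter, starts at 0
    signals = []
    for _ in range(clock_pulses):
        signals.append(f"T{sc}")          # 4 x 16 decoder output for the current count
        if d3_active and sc == 4:         # D3T4: SC <- 0
            sc = 0
        else:
            sc = (sc + 1) % 16            # otherwise increment (wraps after 15)
    return signals


print(timing_sequence(7, d3_active=True))
```

When D3 is active, the signal after T4 is T0 again; when it is not, the counter continues with T5.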
Unit-three
DIGITAL LOGIC
Logic gates
• In 1854 George Boole invented symbolic logic.
• This is known today as Boolean algebra.
• Each variable in Boolean algebra has either of two values. - True or False.
• The purpose of this two-state algebra was to solve logic problems.
• Because computers operate in terms of binary (high or low), it is appropriate to say
that logic is the core of computers.
• Logic circuits are used to implement this logic.
• Logic circuits are built from gates.
• A Gate is a small circuit with one or more input signals but only one
output signal.
• Gates are digital circuits (two-state).
• There are several gates such as OR, AND, Inverter, etc. Each has symbol to
represent them.
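AND Gate
• $Y = A \cdot B$
OR Gate
• $Y = A + B$
NOT Gate or Inverter
• $Y = \overline{A}$
NAND Gate
• $Y = \overline{A \cdot B}$
NOR Gate
• $Y = \overline{A + B}$
Exclusive OR Gate
• Y = A XOR B
• $Y = A \oplus B$ (⊕ is the Ex-OR sign)
Exclusive NOR Gate or XNOR
• $Y = \overline{A \oplus B}$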
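The gate functions can be checked by printing their truth tables. The sketch below models each two-input gate as a function on the values 0 and 1; the dictionary `GATES` and the helper names are illustrative.

```python
from itertools import product

# Two-input gates modeled on 0/1 values.
GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}


def inverter(a):
    """NOT gate: the output is the complement of the input."""
    return 1 - a


def truth_table(name):
    """Return the list of (A, B, Y) rows for the named gate."""
    return [(a, b, GATES[name](a, b)) for a, b in product((0, 1), repeat=2)]


for name in GATES:
    print(name, truth_table(name))
```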
Boolean Algebra
Memory Organization
MEMORY HIERARCHY
The cache is used for storing segments of programs currently being executed in
the CPU and temporary data frequently needed in the present calculations.
By making programs and data available at a rapid rate, it is possible to increase
the performance rate of the computer.
MAIN MEMORY
For the same size chip, it is possible to have more bits of ROM than of RAM,
because the internal binary cells in ROM occupy less space than the RAM.
The two chip select inputs must be CS1=1 and CS2=0 for the unit to operate.
Otherwise, the data bus is in a high impedance state.
There is no need for a read or write control because the unit can only read.
Thus when the chip is enabled by the two select inputs, the byte selected by
the address lines appears on the data bus.
Memory address map
• The designer of a computer system must calculate the amount of memory required for
the particular application and assign it to either RAM or ROM.
• The interconnection between memory and processor is then established from
knowledge of the size of memory needed and the type of RAM and ROM chips available.
• The addressing of memory can be established by means of a table that specifies the
memory address assigned to each chip.
• The table, called a memory address map, is a pictorial representation of assigned
address space for each chip in the system.
• Example: assume that a computer system needs 512 bytes of RAM and 512 bytes of
ROM.
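A memory address map for this example can be sketched as a table of address ranges. The chip sizes and the placement of the ranges are assumptions made for illustration (four 128-byte RAM chips followed by one 512-byte ROM chip); the notes give only the totals.

```python
# Hypothetical chip list: (name, first address, size in bytes).
CHIPS = [
    ("RAM 1", 0x000, 128),
    ("RAM 2", 0x080, 128),
    ("RAM 3", 0x100, 128),
    ("RAM 4", 0x180, 128),
    ("ROM",   0x200, 512),
]


def select_chip(address):
    """Return the name of the chip whose address range contains `address`."""
    for name, start, size in CHIPS:
        if start <= address < start + size:
            return name
    raise ValueError(f"address {address:#05x} is not mapped")


# Print the address map, one row per chip.
for name, start, size in CHIPS:
    print(f"{name:<6} {start:04X} - {start + size - 1:04X}")
```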
AUXILIARY MEMORY
• The important characteristics of any device are its access mode, access
time, transfer rate, capacity, and cost.
• The average time required to reach a storage location in memory and
obtain its contents is called the access time.
• In electromechanical devices with moving parts such as disks and tapes,
the access time consists of a seek time required to position the read-
write head to a location and a transfer time required to transfer data to
or from the device.
• Auxiliary storage is organized in records or blocks.
• A record is a specified number of characters or words.
• Reading or writing is always done on entire records.
• Magnetic drums and disks consist of high-speed rotating surfaces
coated with a magnetic recording medium.
• The recording surface rotates at uniform speed.
• Bits are recorded as magnetic spots on the surface as it passes a
stationary mechanism called a write head.
AUXILIARY MEMORY
• Stored bits are detected by a change in the magnetic field produced by a recorded spot on
the surface as it passes through a read head.
• The amount of surface available for recording in a disk is greater than in a drum of equal
physical size.
• Therefore, more information can be stored on a disk than on a drum of comparable size
• For this reason, disks have replaced drums in more recent computers.
• Magnetic disks include hard disks and floppy disks.
• The working principle is the same for both hard disks and floppy disks.
• A magnetic disk is a surface device.
• It stores data on its surface.
• Its surface is divided into circular concentric tracks, and each track is divided into sectors.
Disk controller:
When the processor attempts to read a word of memory, a check is made to determine
if the word is in the cache.
If so, the word is delivered to the processor.
If not, a block of main memory, consisting of some fixed number of words, is read into
the cache and then the word is delivered to the processor
Main memory consists of up to $2^n$ addressable words, with each word having a unique
n-bit address.
For mapping purposes, this memory is considered to consist of a number of fixed-length
blocks of K words each.
That is, there are $M = 2^n / K$ blocks.
Memory-Cache-CPU
The cache connects to the processor via data, control, and address lines.
The data and address lines also attach to data and address buffers, which attach to a
system bus from which main memory is reached.
When a cache hit occurs, the data and address buffers are disabled and communication
is only between processor and cache, with no system bus traffic. When a cache miss
occurs, the desired address is loaded onto the system bus and the data are returned
through the data buffer to both the cache and the processor.
Mapping Function
• Cache of 64 KBytes
• Cache block of 4 bytes
– i.e. cache is 16K ($2^{14}$) lines of 4 bytes
• 16 MBytes main memory
• 24-bit address
– ($2^{24}$ = 16M)
• Because there are fewer cache lines than main memory blocks, an algorithm
is needed for mapping main memory blocks into cache lines.
• Further, a means is needed for determining which main memory
block currently occupies a cache line.
• The choice of the mapping function dictates how the cache is
organized. Three techniques can be used: direct, associative, and set
associative.
Direct Mapping
• Each block of main memory maps to only one
cache line
– i.e. if a block is in cache, it must be in one specific
place
• Address is in two parts:
– Least significant w bits identify a unique word
– Most significant s bits specify one memory block
• The MSBs are split into a cache line field of r bits and a
tag of s-r bits (most significant)
Direct Mapping Address Structure
Tag (8 bits) | Line or slot (14 bits) | Word (2 bits)
• 24 bit address
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
– 8 bit tag (=22-14)
– 14 bit slot or line
• No two blocks in the same line have the same Tag field
• Check contents of cache by finding line and checking Tag
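The 8/14/2 address split can be expressed with shifts and masks. The function name `split_address` and the sample address are illustrative; the field widths are the ones listed above.

```python
def split_address(address, tag_bits=8, line_bits=14, word_bits=2):
    """Split a 24-bit address into (tag, line, word) for a direct-mapped cache."""
    word = address & ((1 << word_bits) - 1)                  # least significant bits
    line = (address >> word_bits) & ((1 << line_bits) - 1)   # cache line (slot) number
    tag = (address >> (word_bits + line_bits)) & ((1 << tag_bits) - 1)
    return tag, line, word


tag, line, word = split_address(0x16339C)   # sample 24-bit address
print(f"tag={tag:02X} line={line:04X} word={word}")
```

To check whether the addressed block is in the cache, the line field selects one cache line and the stored tag of that line is compared with the tag field.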
Direct Mapping from Cache to Main Memory
Direct Mapping Cache Line Table
Cache line   Main memory blocks assigned
0            0, m, 2m, …, $2^s - m$
1            1, m+1, 2m+1, …, $2^s - m + 1$
…
m-1          m-1, 2m-1, 3m-1, …, $2^s - 1$
Direct Mapping Cache Organization
Direct Mapping Example
Direct Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = $2^{s+w}$ words or
bytes
• Block size = line size = $2^{w}$ words or bytes
• Number of blocks in main memory = $2^{s+w} / 2^{w} = 2^{s}$
• Number of lines in cache = m = $2^{r}$
• Size of tag = (s – r) bits
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block
– If a program accesses 2 blocks that map to the
same line repeatedly, cache misses are very high
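The drawback of a fixed location can be demonstrated with a small simulation. The sketch assumes that block j maps to line j mod m, which is the assignment implied by the cache line table; the function name and the access patterns are illustrative.

```python
def direct_mapped_misses(block_numbers, num_lines):
    """Count cache misses for a sequence of main memory block accesses."""
    lines = {}                             # line number -> block currently stored
    misses = 0
    for block in block_numbers:
        line = block % num_lines           # each block has exactly one possible line
        if lines.get(line) != block:       # tag mismatch or empty line
            misses += 1
            lines[line] = block            # the new block replaces the old one
    return misses


m = 2 ** 14                                # 16K lines, as in the example cache
print(direct_mapped_misses([0, m] * 4, m))   # blocks 0 and m share line 0
print(direct_mapped_misses([0, 1] * 4, m))   # blocks 0 and 1 use different lines
```

Alternating between two blocks that share a line misses on every access, while two blocks on different lines miss only on their first access.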
Associative Mapping
• A main memory block can load into any line of
cache
• Memory address is interpreted as tag and
word
• Tag uniquely identifies block of memory
• Every line’s tag is examined for a match
• Cache searching gets expensive
Associative Mapping from Cache to Main Memory
Fully Associative Cache Organization
Associative Mapping Example
Associative Mapping Address Structure
Tag (22 bits) | Word (2 bits)
• The three bits of SELA select a source register for the A input of the ALU.
• The three bits of SELB select a register for the B input of the ALU.
• The three bits of SELD select a destination register using the decoder and its seven
load outputs.
• The five bits of OPR select one of the operations in the ALU.
• The 14-bit control word, when applied to the selection inputs, specifies a particular
micro-operation.
• The encoding of the register selections is specified in the following table.
• The 3-bit binary code listed in the first column of the table specifies the binary code
for each of the three fields.
• When SELA or SELB is 000, the corresponding multiplexer selects the external input
data.
• When SELD = 000, no destination register is selected, but the contents of the output
bus are available in the external output.
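The way the four fields combine into one 14-bit control word can be sketched as bit packing. The field order SELA, SELB, SELD, OPR from most to least significant bit is an assumption, and the sample field values are arbitrary; the notes do not list the register or operation codes.

```python
def make_control_word(sela, selb, seld, opr):
    """Pack SELA, SELB, SELD (3 bits each) and OPR (5 bits) into 14 bits."""
    for name, value, width in (("SELA", sela, 3), ("SELB", selb, 3),
                               ("SELD", seld, 3), ("OPR", opr, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Assumed layout: SELA | SELB | SELD | OPR (most to least significant).
    return (sela << 11) | (selb << 8) | (seld << 5) | opr


word = make_control_word(0b010, 0b011, 0b001, 0b00101)   # arbitrary sample values
print(format(word, "014b"))
```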
Control word
OPR field
• The OPR field has five bits and each operation
is designated with a symbolic name.
Examples of Micro operations:
• The most efficient way to generate control words with a large number of
bits is to store them in a memory unit.
• A memory unit that stores control words is referred to as a control
memory.
• This type of control is referred to as microprogrammed control.
Stack Organization
• The following microoperations are executed with the stack when an operation is
entered in a calculator or issued by the control in a computer.
• The two topmost operands in stack are used for the operation
• The stack is popped and the result of the operation replaces the lower operand
• The following numerical example may clarify this procedure.
• Consider the arithmetic expression (3*4) + (5*6). In reverse Polish notation, this is
expressed as 3 4 * 5 6 * +
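The stack procedure described above can be sketched as a short evaluator for reverse Polish notation. Tokens are assumed to be separated by spaces, and only +, - and * are supported in this sketch; the function name is illustrative.

```python
import operator

# Supported operators (an illustrative subset).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate_rpn(expression):
    """Evaluate a space-separated reverse Polish expression using a stack."""
    stack = []
    for token in expression.split():
        if token in OPS:
            right = stack.pop()            # topmost operand
            left = stack.pop()             # operand just below it
            stack.append(OPS[token](left, right))   # result replaces both operands
        else:
            stack.append(int(token))       # operands are pushed onto the stack
    return stack.pop()


print(evaluate_rpn("3 4 * 5 6 * +"))
```

For this expression the stack holds 12 after the first multiplication, then 12 and 30, and finally the sum 42.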
RISC computers use fewer instructions with simple constructs so that they can be
executed much faster within the CPU without having to use memory as often.
The concept of RISC architecture involves an attempt to reduce execution time by
simplifying the instruction set of the computer.
The major characteristics of a RISC processor are:
• Relatively few instructions
• Relatively few addressing modes
• Memory access limited to load and store instructions
• All operations done within the registers of the CPU
• Fixed length, easily decoded instruction format
• Single cycle instruction execution
• Hardwired rather than microprogrammed control.
Other characteristics attributed to RISC architecture are:
• A relatively large number of registers in the processor unit.
• Use of overlapped register windows to speed up procedure call and return.
• Efficient instruction pipeline.
• Compiler support for efficient translation of high-level language programs into
machine language programs.
CISC(Complex Instruction Set Computer)
• The control address register contains the address of the next microinstruction to be read.
• When a microinstruction is read from the control memory, it is transferred to a control buffer
register.
• Thus, reading a microinstruction from the control memory is the same as executing that
microinstruction.
• The third element is a sequencing unit that loads the control address register and issues a read
command.
• The two basic tasks performed by a microprogrammed control unit are:
• Microinstruction sequencing: get the next microinstruction from the control memory
• Microinstruction execution: generate the control signals needed to execute the microinstruction
ALU design
• The five registers are loaded with new data every clock pulse. The effect of each
clock is shown in the table.