
CO Pipelining PDF notes


RISC: Reduced Instruction Set Computer

1. Supports few instructions and addressing modes.
2. Memory access is limited to load and store instructions.
3. Supports fixed-length instructions.
4. Single-cycle instruction execution.
5. Hardwired control unit.
6. Large number of registers in the CPU.
7. Uses overlapped register windows to speed up procedure call
and return.
8. An efficient instruction pipeline can be implemented.
9. Machine code is longer than the equivalent CISC code.
10. Low performance when complex operations are
frequently used.
Ex: Berkeley RISC

CISC: Complex Instruction Set Computer

1. A large number of instructions and addressing modes.
2. Memory can also be accessed by instructions other than
load and store.
3. Supports variable-length instructions.
4. Most instructions require more than one cycle to execute.
5. Microprogrammed control unit.
6. Comparatively few registers.
7. Pipelining is difficult to apply, or less effective.
8. Shorter machine code.

PIPELINING
- Pipelining is an implementation technique in which multiple
instructions are overlapped in execution, thereby
reducing the average execution time per instruction.
- An instruction pipeline is a multifunction,
reconfigurable pipeline designed to speed up a
computer's performance by efficiently overlapping
the processing of instructions.
- An instruction pipeline is normally invisible
to the programmer and is managed automatically by
compilers and by the CPU's internal
program control unit.
- Ex: IBM 7030 (the first to implement it),
80486, Pentium (80x86 family).
- A pipeline works like an assembly line. Ex: automobile
assembly.
- Where pipelining can be used:
1. Instruction execution.
2. Arithmetic computations.
3. Memory access.
In a computer, each stage completes part of an
instruction; each stage is called a pipe stage
or pipe segment.
Rule of the pipe: instructions enter at one end of the
pipe, progress through the stages, and exit at the other
end.
- How many pipe stages are in the pipeline?
1. Depends on the complexity of the instruction
set.
2. Depends on the organization of external memory.
3. Depends on the CPU's data path implementation.
4. The normal range is 3 to 12.
**** The purpose of pipelining is to improve
the performance of a system without a large
investment in additional hardware.
Ex: Adding hardware (for example, duplicating functional units) increases cost,
whereas pipelining gains performance by overlapping the execution of
instructions on the existing hardware.
Clock skew / jitter / setup time:
The minimum clock period of the pipeline must
satisfy the inequality
Tp ≥ tskew + tjitter + tlogic + tsetup
Skew (tskew): maximum delay between the arrival of the
clock signal at different stage latches.
Jitter (tjitter): maximum variation in the arrival time of
the clock signal at the same latch.
Logic circuit delay (tlogic): maximum delay of the slowest
stage in the pipeline.
Setup time (tsetup): minimum time a signal needs to be
stable at the input of a latch before it can be
captured.
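As a rough illustration of the inequality above, the following Python sketch adds the four terms to get the smallest permissible clock period and the matching maximum clock frequency. The function name and the delay values are assumptions chosen for illustration; they are not taken from the notes.

```python
def min_clock_period(t_skew, t_jitter, t_logic, t_setup):
    """Smallest clock period (same time unit as the inputs) that satisfies
    Tp >= t_skew + t_jitter + t_logic + t_setup."""
    return t_skew + t_jitter + t_logic + t_setup


# Hypothetical delays in picoseconds (illustrative values only).
tp_ps = min_clock_period(t_skew=20, t_jitter=10, t_logic=400, t_setup=30)

# 1 / (tp_ps * 1e-12 s), expressed in GHz, simplifies to 1000 / tp_ps.
max_freq_ghz = 1000 / tp_ps

print(f"Tp >= {tp_ps} ps, max clock = {max_freq_ghz:.3f} GHz")
```

Note that the logic delay of the slowest stage usually dominates, which is why balancing the stages matters more than trimming skew or setup time.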

Note:
1. The throughput of an instruction pipeline is determined
by how often an instruction exits the pipeline.
2. The processor cycle is the time required to
move an instruction one step down the pipeline. So the
length of the processor cycle is set by the slowest
pipe stage.
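The two notes above can be sketched in a few lines of Python: the cycle time is the delay of the slowest stage, and the ideal throughput is one instruction per cycle. The stage delays and the optional latch overhead parameter are assumed values for illustration.

```python
def processor_cycle(stage_delays, overhead=0):
    """Cycle time is set by the slowest stage, plus any per-stage latch overhead."""
    return max(stage_delays) + overhead


def throughput(cycle_time):
    """Instructions leaving the pipeline per unit time once it is full (no stalls)."""
    return 1 / cycle_time


# Hypothetical delays in ns for IF, ID, EX, MEM, WB.
stages_ns = [1.0, 0.8, 1.2, 1.0, 0.7]
cycle_ns = processor_cycle(stages_ns)

print(f"cycle = {cycle_ns} ns, throughput = {throughput(cycle_ns):.3f} instr/ns")
```

In this sketch the faster stages sit idle for part of every cycle, since all stages advance together at the pace of the slowest one.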

BASICS OF RISC ARCHITECTURE

Key properties:
1. All operations on data apply to data in registers and
typically change the entire register (32 bits or 64 bits).
2. The only operations that affect memory are load and
store operations.
3. Load and store instructions that transfer less than a full register
(for example, a byte or a half word) are available.
4. Instruction formats are few in number and all of them
are the same size.
Ex: MIPS64, the 64-bit version of MIPS.
The architecture has 32 registers, and Reg 0 always holds the value 0.

Three classes of instructions:

1. ALU instructions: operate on two registers, or on a register
and a sign-extended immediate operand, and store the
result in a third register.
2. Load and store instructions:
Effective address = base register + offset.
3. Branch and jump instructions:
The branch condition is based on condition flags, on a comparison
between two registers, or on a comparison between a register and Reg 0.
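Sign extension and the effective-address calculation used by load and store instructions can be sketched as follows. The 16-bit offset width is an assumption (it matches the MIPS immediate field), and both function names are hypothetical.

```python
def sign_extend(value, bits=16):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1          # keep only the field's bits
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def effective_address(base, offset_field):
    """Effective address = base register contents + sign-extended offset."""
    return base + sign_extend(offset_field)


# A 16-bit field of 0xFFFC encodes -4, so the address moves backwards.
print(hex(effective_address(0x1000, 0xFFFC)))
print(hex(effective_address(0x1000, 0x0008)))
```

Without sign extension, the field 0xFFFC would be treated as a large positive offset instead of -4.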

IMPLEMENTATION OF RISC ARCHITECTURE


Every instruction in the RISC architecture can be
implemented in at most 5 clock cycles/stages.
1. IF: Instruction fetch cycle:
Send the PC to memory.
IR = Mem[PC] (fetch the current instruction from memory)
PC = PC + 4 (each instruction is 4 bytes long)
2. ID: Instruction decode / register fetch cycle:
- Decode the instruction and read the registers from the register file.
- Check for a branch, if any.
- Sign-extend the offset if needed.
- Compute the possible branch target address by adding the
sign-extended offset to the incremented PC:
Target address = [PC] + X (Note: the PC here is the incremented value, PC + 4)
- Decoding is done in parallel with reading registers, since the
register specifiers are at a fixed location in a RISC
architecture. This technique is known as fixed-field
decoding.
- The immediate operand may also be computed (sign-extended) in this cycle.
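3. EX: Execution / effective address cycle:
The ALU performs one of three functions on the operands, depending on the instruction type:
1. Memory reference: E.A = base register + X.
2. Register-register ALU operation.
3. Register-immediate ALU operation.

Note: In a load-store architecture, the effective address and execution cycles
can be combined into a single clock cycle, since no instruction needs to
compute a data address and operate on the data at the same time.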
4. MEM: Memory access cycle:
If load: read the data from memory using the E.A.
If store: write the data to memory using the E.A.
5. WB: Write-back cycle:
- Register-register ALU instruction: write the result into the
destination register in the register file (RF).
- Load instruction: write the loaded data into the destination register.
In this implementation:
1. Branch instructions require 2 cycles.
2. Store instructions require 4 cycles.
3. All remaining instructions require 5 cycles.
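Given these cycle counts, the average CPI of the unpipelined implementation depends on the instruction mix. The sketch below computes it as a weighted sum. The mix (12% branches, 10% stores, 78% other) is an assumed example, not a figure from the notes.

```python
# Cycle counts from the list above.
CYCLES = {"branch": 2, "store": 4, "other": 5}


def average_cpi(mix):
    """Weighted average of cycles per instruction; `mix` maps each
    instruction class to its fraction of all executed instructions."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(fraction * CYCLES[kind] for kind, fraction in mix.items())


# Hypothetical instruction mix, for illustration only.
mix = {"branch": 0.12, "store": 0.10, "other": 0.78}
print(f"average CPI = {average_cpi(mix):.2f}")
```

A mix with more branches or stores lowers the average, since those classes finish in fewer cycles.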

Note: We have to determine what happens in every
clock cycle of the processor and make sure that two
different operations cannot use the same data path resource
in the same cycle.
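One way to picture this cycle-by-cycle bookkeeping is to build the stage occupancy table of an ideal five-stage pipeline and check that no stage is claimed twice in a cycle. This is a minimal sketch that assumes one instruction enters per cycle and nothing stalls; `schedule` and `render` are hypothetical helper names.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def schedule(n_instructions):
    """Return {cycle: {stage: instruction_index}} for an ideal pipeline in
    which instruction i enters IF in cycle i + 1 and never stalls."""
    table = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            cycle = i + s + 1
            row = table.setdefault(cycle, {})
            if stage in row:
                # Two instructions would need the same stage hardware at once.
                raise ValueError(f"{stage} used twice in cycle {cycle}")
            row[stage] = i
    return table


def render(n_instructions):
    """Format the schedule as one text row per instruction, one column per cycle."""
    table = schedule(n_instructions)
    lines = []
    for i in range(n_instructions):
        cells = []
        for cycle in sorted(table):
            stage = next((s for s, k in table[cycle].items() if k == i), "")
            cells.append(f"{stage:<4}")
        lines.append(f"I{i + 1}: " + "".join(cells).rstrip())
    return "\n".join(lines)


print(render(4))
```

Each row is shifted one cycle to the right of the row above it, so every column contains each stage name at most once.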
Basic performance issues in pipelining:
** Pipelining increases the CPU instruction throughput, i.e.
instructions completed per unit time, but it does not reduce the execution
time of an individual instruction.
** In fact, it usually slightly increases the execution time of
each instruction because of the overhead of controlling the
pipeline.
** The increase in instruction throughput means a
program runs faster and has a lower total execution time,
even though no single instruction runs faster.
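The points above can be checked with a small cycle count. The sketch below uses a simplified model (an assumption, not stated in the notes): every instruction passes through all k stages, every stage takes one cycle, and there are no stalls or latch overheads.

```python
def unpipelined_cycles(n, k):
    """Each of the n instructions finishes all k stages before the next starts."""
    return n * k


def pipelined_cycles(n, k):
    """The first instruction needs k cycles; each later one completes
    one cycle after its predecessor."""
    return k + (n - 1)


def ideal_speedup(n, k):
    return unpipelined_cycles(n, k) / pipelined_cycles(n, k)


for n in (1, 10, 1000):
    print(n, pipelined_cycles(n, 5), round(ideal_speedup(n, 5), 3))
```

For a single instruction the speedup is 1, so its latency is unchanged. As n grows, the speedup approaches the number of stages k.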

Major hurdles of pipelining: pipeline hazards / dependencies

Situations called hazards prevent the next instruction
in the instruction stream from executing during its
designated clock cycle. Hazards reduce the
speedup gained from
pipelining.
3 types of hazards:
1. Structural hazards / resource hazards:
They arise from resource conflicts. They occur when the hardware cannot
support a combination of instructions, i.e. when two or more
instructions need to access the same resource in the same cycle.
Ex: memory, ALU.
Solution: replicate the resource, e.g. use an instruction cache for
instructions and a separate data cache for data.
2. Data hazards:
They occur when there is a conflict in the access of an
operand location, i.e. when the
pipeline changes the order of read/write accesses to
operands so that the order differs from the order
seen by sequentially executing instructions on an
unpipelined processor.
Solution: operand forwarding (also called bypassing or
short-circuiting).
3. Control hazards:
The purpose of the instruction fetch unit is to supply the
execution unit with a steady stream of instructions.
When this stream is interrupted, the pipeline stalls, i.e.
no new instructions enter the pipeline, and so
CPI > 1. Ideally, CPI = 1 for a pipeline.
A branch instruction, conditional or unconditional,
may cause the pipeline to stall.
Solution:
1. Hardware: branch prediction buffer / loop buffer.
2. Software: delayed branch, filling the branch delay slot with a useful
instruction or, failing that, a NOP.
* Branch penalty = number of stall cycles created in the
pipeline by a branch instruction.
* Branch penalty = (stage at which the target address is
available - 1)
* Average number of stall cycles per instruction caused by branches
during execution of the program
= branch frequency × branch penalty.
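Of the three hazard types listed above, the data hazard (item 2) is the easiest to illustrate in code. The sketch below scans a short program for read-after-write dependences: a later instruction reads a register that an earlier instruction has not yet written back. The `Instr` tuple, the register names and the window of 2 following instructions are assumptions. The window corresponds to a five-stage pipeline without forwarding, in which the register file is written in the first half of a cycle and read in the second half.

```python
from collections import namedtuple

# Hypothetical instruction record: opcode, destination register, source registers.
Instr = namedtuple("Instr", "op dest srcs")


def raw_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) for every
    read-after-write dependence within `window` following instructions."""
    hazards = []
    for i, producer in enumerate(program):
        if producer.dest in (None, "R0"):   # Reg 0 always reads as 0
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            consumer = program[j]
            if producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
            if consumer.dest == producer.dest:
                break   # register overwritten; later reads see the new value
    return hazards


program = [
    Instr("ADD", "R1", ("R2", "R3")),
    Instr("SUB", "R4", ("R1", "R5")),
    Instr("AND", "R6", ("R1", "R7")),
    Instr("OR",  "R8", ("R1", "R9")),
]
for producer, consumer, reg in raw_hazards(program):
    print(f"I{consumer + 1} reads {reg} before I{producer + 1} has written it back")
```

With operand forwarding, the ALU result of the ADD would be routed directly to the SUB and AND, so these dependences would not force stalls.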
** Speedup of a pipeline with stalls
= Pipeline depth / (1 + Pipeline stall cycles per instruction)
** CPI = Ideal CPI + Pipeline stall cycles per instruction
= 1 + Pipeline stall cycles per instruction.
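The branch penalty, CPI and speedup formulas above chain together, as the following sketch shows. The numbers are assumed for illustration: a 5-stage pipeline, the branch target available in stage 3, and 20% of instructions being branches. Branches are taken as the only source of stalls.

```python
def branch_penalty(target_stage):
    """Stall cycles per branch when the target address becomes available
    in pipeline stage `target_stage` (counted from 1)."""
    return target_stage - 1


def stall_cycles_per_instruction(branch_frequency, penalty):
    return branch_frequency * penalty


def cpi(stalls_per_instr, ideal_cpi=1):
    return ideal_cpi + stalls_per_instr


def speedup(pipeline_depth, stalls_per_instr):
    return pipeline_depth / (1 + stalls_per_instr)


# Hypothetical workload and pipeline parameters.
penalty = branch_penalty(target_stage=3)
stalls = stall_cycles_per_instruction(branch_frequency=0.20, penalty=penalty)

print(f"penalty = {penalty} cycles, stalls/instr = {stalls:.2f}")
print(f"CPI = {cpi(stalls):.2f}, speedup = {speedup(5, stalls):.2f}")
```

With no stalls the speedup equals the pipeline depth. Every stall cycle per instruction adds to the denominator and pulls the speedup below that ideal.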
