
CS104: Computer Organization: 11 March, 2020


L9

11/03/2020

CS104: Computer Organization
11th March, 2020
Manojit Ghose
Indian Institute of Information Technology, Guwahati
Jan-Apr 2020

Computer Organization (IS F242)
Lecture 1 : Introductory Thoughts
Dip Sankar Banerjee
dsb@hyderabad.bits-pilani.ac.in
Department of CS & IS

Five Components of a Computer

• Processor
– Control
– Datapath
• Memory (passive)
– where programs, data live when running
• Devices
– Input: Keyboard, Mouse
– Output: Display, Printer
– Disk (where programs, data live when not running)

Designing the CPU


• How to design a CPU:
– Analyze ISA => datapath requirements
• Meaning of each instruction given by register transfers
(ISA Model => RTL Model)
• Datapath must include storage elements for ISA
registers (possibly more)
• Datapath must support each register transfer
– Select datapath components and establish a
clocking methodology
– Assemble datapath meeting the RTL requirements

Designing the CPU (cont.)


– Analyze implementation of each instruction to
determine the setting of the control points that
effect the register transfers
– Assemble the control logic
– RTL datapath and control design are refined to
track physical design and functional validation
• Changes in timing and errors
• Amount of work varies with capabilities of CAD tools
and degree of optimization for cost/performance.

The CPU
• Processor (CPU): the active part of the
computer that does all the work (data
manipulation and decision-making)
• Datapath: portion of the processor that
contains hardware necessary to perform
operations required by the processor (the
brawn)
• Control: portion of the processor (also in
hardware) that tells the datapath what needs
to be done (the brain)

Stages of the Datapath : Overview


• Problem: a single, atomic block that “executes
an instruction” (performs all necessary
operations beginning with fetching the
instruction) would be too bulky and inefficient
• Solution: break up the process of “executing
an instruction” into stages, and then connect
the stages to create the whole datapath
– smaller stages are easier to design
– easy to optimize (change) one stage without
touching the others

Five Stages of the Datapath

• Stage 1: Instruction Fetch
• Stage 2: Instruction Decode
• Stage 3: ALU (Arithmetic-Logic Unit)
• Stage 4: Memory Access
• Stage 5: Register Write

Stages of the Datapath (1/5)

• There is a wide variety of MIPS instructions: so
what general steps do they have in common?
• Stage 1: Instruction Fetch
– no matter what the instruction, the 32-bit
instruction word must first be fetched from
memory (the cache-memory hierarchy)
– also, this is where we Increment PC
(that is, PC = PC + 4, to point to the next
instruction: byte addressing so + 4)
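
The fetch step can be sketched in Python. This is only an illustration under stated assumptions: the memory is modelled as a plain `bytes` object, the word layout is taken to be big-endian, and the function name `fetch` and the two sample instruction words are made up for this example.

```python
def fetch(memory, pc):
    """Return (instruction_word, next_pc) for a byte-addressed memory."""
    # A MIPS instruction word is 32 bits, so it occupies 4 consecutive bytes.
    word = int.from_bytes(memory[pc:pc + 4], "big")  # big-endian is an assumption
    # Byte addressing: the next instruction starts 4 bytes later.
    return word, pc + 4


# Two sample instruction words stored back to back (illustrative program).
program = bytes.fromhex("00221820" "8c230011")
first, pc = fetch(program, 0)
second, pc = fetch(program, pc)
```

After the two calls, `pc` has advanced by 8 bytes, one step of 4 per fetched word.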

Stages of the Datapath (2/5)

• Stage 2: Instruction Decode


– upon fetching the instruction, we next gather data
from the fields (decode all necessary instruction
data)
– first, read the opcode to determine instruction
type and field lengths
– second, read in data from all necessary registers
• for add, read two registers
• for addi, read one register
• for jal, no reads necessary
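
A minimal sketch of the field extraction done in this stage, assuming the standard 32-bit MIPS field layout (6-bit opcode, 5-bit rs, rt, rd and shamt, 6-bit funct, 16-bit immediate). The function name `decode` and the dictionary it returns are choices made for this example, not part of the slides.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs": (word >> 21) & 0x1F,      # bits 25..21
        "rt": (word >> 16) & 0x1F,      # bits 20..16
        "rd": (word >> 11) & 0x1F,      # bits 15..11 (R-type only)
        "shamt": (word >> 6) & 0x1F,    # bits 10..6  (R-type only)
        "funct": word & 0x3F,           # bits 5..0   (R-type only)
        "imm": word & 0xFFFF,           # bits 15..0  (I-type only)
    }


r_type = decode(0x00221820)  # encodes add $3, $1, $2
i_type = decode(0x8C230011)  # encodes lw $3, 17($1)
```

All fields are extracted unconditionally here; real hardware does the same and lets the opcode decide which of them are meaningful.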

Stages of the Datapath (3/5)


• Stage 3: ALU (Arithmetic-Logic Unit)
– the real work of most instructions is done here:
arithmetic (+, -, *, /), shifting, logic (&, |),
comparisons (slt)

– what about loads and stores?


• lw $t0, 40($t1)
• the address we are accessing in memory = the value in
$t1 PLUS the value 40
• so we do this addition in this stage
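
The address addition for a load can be sketched as below. It assumes the usual MIPS behaviour of sign-extending the 16-bit offset before adding it to the base register. The helper names and the base value `0x1000` for `$t1` are invented for illustration.

```python
def sign_extend16(imm):
    """Interpret a 16-bit field as a signed two's-complement value."""
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, imm16):
    """Address = base register value + sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


# lw $t0, 40($t1) with an assumed $t1 = 0x1000
addr = effective_address(0x1000, 40)
# A negative offset, encoded as 0xFFFC (-4), moves the address downward.
addr_back = effective_address(0x1000, 0xFFFC)
```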

Stages of the Datapath (4/5)

• Stage 4: Memory Access


– actually only the load and store instructions do
anything during this stage; the others remain idle
during this stage or skip it altogether
– since these instructions have a unique step, we
need this extra stage to account for them
– as a result of the cache system, this stage is
expected to be fast

Stages of the Datapath (5/5)

• Stage 5: Register Write


– most instructions write the result of some
computation into a register
– examples: arithmetic, logical, shifts, loads, slt
– what about stores, branches, jumps?
• these don’t write anything into a register at the end
• they remain idle during this fifth stage or skip it
altogether

Generic Steps of Datapath

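[Figure: generic datapath. The PC (incremented by +4) addresses instruction memory; the rs, rt, rd and imm fields feed the registers and the ALU; the ALU output goes to Data memory. Stages, left to right: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Register Write]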

Datapath Walkthroughs (1/3)


• add $r3,$r1,$r2 # r3 = r1+r2
– Stage 1: fetch this instruction, increment PC
– Stage 2: decode to determine it is an add,
then read registers $r1 and $r2
– Stage 3: add the two values retrieved in Stage 2
– Stage 4: idle (nothing to write to memory)
– Stage 5: write result of Stage 3 into register $r3
Example: add Instruction

[Figure: add r3, r1, r2 on the datapath. The registers supply reg[1] and reg[2], the ALU computes reg[1]+reg[2], and the result is written to register 3.]

Datapath Walkthroughs (2/3)


• slti $r3,$r1,17 # if (r1 < 17) r3 = 1 else r3 = 0
– Stage 1: fetch this instruction, increment PC
– Stage 2: decode to determine it is an slti,
then read register $r1
– Stage 3: compare value retrieved in Stage 2
with the integer 17
– Stage 4: idle
– Stage 5: write the result of Stage 3 (1 if reg source was less
than signed immediate, 0 otherwise) into register $r3
Example: slti Instruction

[Figure: slti r3, r1, 17 on the datapath. The registers supply reg[1], the ALU evaluates reg[1] < 17? against imm 17, and the result is written to register 3.]

Datapath Walkthroughs (3/3)


• sw $r3,17($r1) # Mem[r1+17] = r3
– Stage 1: fetch this instruction, increment PC
– Stage 2: decode to determine it is a sw,
then read registers $r1 and $r3
– Stage 3: add 17 to value in register $r1
(retrieved in Stage 2) to compute address
– Stage 4: write value in register $r3 (retrieved in
Stage 2) into memory address computed in
Stage 3
– Stage 5: idle (nothing to write into a register)
Example: sw Instruction

[Figure: SW r3, 17(r1) on the datapath. The registers supply reg[1] and reg[3], the ALU computes reg[1]+17 from imm 17, and Data memory performs MEM[r1+17] <= r3.]

Why Five Stages? (1/2)

• Could we have a different number of stages?


– Yes, and other architectures do
• So why does MIPS have five if instructions
tend to idle for at least one stage?
– Five stages are the union of all the operations
needed by all the instructions.
– One instruction uses all five stages: the load

Why Five Stages? (2/2)


• lw $r3,17($r1) # r3 = Mem[r1+17]
– Stage 1: fetch this instruction, increment PC
– Stage 2: decode to determine it is a lw,
then read register $r1
– Stage 3: add 17 to value in register $r1
(retrieved in Stage 2)
– Stage 4: read value from memory address
computed in Stage 3
– Stage 5: write value read in Stage 4 into
register $r3
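
The four walkthroughs (add, slti, sw, lw) can be replayed with a small Python model that records which stages do real work for each instruction. This is a sketch under assumptions: instructions are given as already-decoded tuples rather than 32-bit words, registers and memory are plain dictionaries, and the stage labels IF, ID, EX, MEM, WB are shorthand for Stages 1 to 5. None of these names come from the slides.

```python
def run(instr, regs, mem):
    """Execute one instruction; return the list of stages that did real work."""
    op = instr[0]
    stages = ["IF", "ID"]  # Stages 1 and 2: always fetch and decode
    if op == "add":        # ("add", rd, rs, rt)
        _, rd, rs, rt = instr
        result = regs[rs] + regs[rt]         # Stage 3: ALU add
        stages.append("EX")
        regs[rd] = result                    # Stage 5 (Stage 4 idle)
        stages.append("WB")
    elif op == "slti":     # ("slti", rt, rs, imm)
        _, rt, rs, imm = instr
        result = 1 if regs[rs] < imm else 0  # Stage 3: ALU compare
        stages.append("EX")
        regs[rt] = result                    # Stage 5 (Stage 4 idle)
        stages.append("WB")
    elif op == "sw":       # ("sw", rt, imm, rs): Mem[rs + imm] = rt
        _, rt, imm, rs = instr
        address = regs[rs] + imm             # Stage 3: address computation
        stages.append("EX")
        mem[address] = regs[rt]              # Stage 4 (Stage 5 idle)
        stages.append("MEM")
    elif op == "lw":       # ("lw", rt, imm, rs): rt = Mem[rs + imm]
        _, rt, imm, rs = instr
        address = regs[rs] + imm             # Stage 3: address computation
        stages.append("EX")
        value = mem[address]                 # Stage 4: memory read
        stages.append("MEM")
        regs[rt] = value                     # Stage 5: register write
        stages.append("WB")
    else:
        raise ValueError(f"unsupported instruction: {op}")
    return stages


regs = {1: 100, 2: 5, 3: 0}  # hypothetical register contents
mem = {}                     # hypothetical data memory, keyed by address
add_stages = run(("add", 3, 1, 2), regs, mem)     # r3 = r1 + r2
sw_stages = run(("sw", 3, 17, 1), regs, mem)      # Mem[r1+17] = r3
lw_stages = run(("lw", 2, 17, 1), regs, mem)      # r2 = Mem[r1+17]
slti_stages = run(("slti", 3, 1, 17), regs, mem)  # r3 = (r1 < 17)
```

Only the load returns all five stage labels, which matches the observation that the load is the one instruction using every stage.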
Example: lw Instruction

[Figure: LW r3, 17(r1) on the datapath. The registers supply reg[1], the ALU computes reg[1]+17 from imm 17, Data memory returns MEM[r1+17], and that value is written to register 3.]

Datapath and Control


• Datapath based on data transfers required to perform
instructions
• Controller causes the right transfers to happen

[Figure: the generic datapath (PC, +4, instruction memory, registers with rs, rt, rd, imm, ALU, Data memory) with a Controller driven by the opcode and funct fields]

What Hardware Is Needed? (1/2)

• PC: a register that keeps track of address of
the next instruction to be fetched
• General Purpose Registers
– Used in Stages 2 (Read) and 5 (Write)
– MIPS has 32 of these
• Memory
– Used in Stages 1 (Fetch) and 4 (R/W)
– Caches make these stages as fast as the others
(on average; otherwise there is a multicycle stall)

What Hardware Is Needed? (2/2)


• ALU
– Used in Stage 3
– Performs all necessary functions: arithmetic, logicals,
etc.
• Miscellaneous Registers
– One stage per clock cycle: Registers inserted between
stages to hold intermediate data and control signals as
they travel from stage to stage
– Note: Register is a general purpose term meaning
something that stores bits. Realize that not all
registers are in the “register file”

CPU Clocking (1/2)

• For each instruction, how do we control the flow of
information through the datapath?
• Single Cycle CPU: All stages of an instruction
completed within one long clock cycle
– Clock cycle sufficiently long to allow each instruction to
complete all stages without interruption within one cycle

[Figure: one long clock cycle spanning all five stages: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Reg. Write]

CPU Clocking (2/2)


• Alternative multiple-cycle CPU: only one stage of instruction
per clock cycle
– Clock is made as long as the slowest stage

[Figure: one clock cycle per stage: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Register Write]

– Several significant advantages over single cycle execution:
unused stages in a particular instruction can be skipped
OR instructions can be pipelined (overlapped)
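
A small worked comparison of the two clocking schemes. The stage latencies below are invented purely for illustration and are not from the slides. The pipelined estimate also assumes an ideal pipeline with no stalls, which real designs do not achieve.

```python
# Hypothetical stage latencies in picoseconds (illustrative values only).
stage_ps = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Stages each instruction actually needs, following the walkthroughs.
stages_used = {
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
    "add": ["IF", "ID", "EX", "WB"],
    "sw": ["IF", "ID", "EX", "MEM"],
}

# Single cycle: one clock must cover every stage back to back.
single_cycle_clock = sum(stage_ps.values())

# Multiple cycle: the clock is as long as the slowest stage.
multi_cycle_clock = max(stage_ps.values())

# Multiple cycle with stage skipping: pay one clock per stage actually used.
multi_cycle_time = {
    op: len(used) * multi_cycle_clock for op, used in stages_used.items()
}


def pipelined_total(n):
    """Ideal pipelined time for n instructions: fill the pipe, then one per clock."""
    return (len(stage_ps) + n - 1) * multi_cycle_clock


single_cycle_total_100 = 100 * single_cycle_clock
pipelined_total_100 = pipelined_total(100)
```

With these made-up numbers, skipping stages alone does not beat the single-cycle design for any one instruction; the large gain comes from overlapping instructions.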

1 bit ALU
• Using a MUX we can add the AND, OR, and
adder operations into a single ALU

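[Figure: 1-bit ALU. The inputs feed an AND gate, an OR gate and a 1-bit Full Adder (with Cin and Cout); a Mux controlled by ALUOp selects the Result.]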

4 bit ALU
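[Figure: 4-bit ALU built from four 1-bit ALUs sharing ALUop. Slice i takes Ai, Bi and CIni and produces Resulti and COuti; COut0 feeds CIn1, COut1 feeds CIn2, COut2 feeds CIn3, and COut3 is the carry out. Drawn as a single block, it has 4-bit inputs A and B, an ALUop input and the COut3 output.]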
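
The 1-bit ALU and its 4-bit ripple arrangement can be modelled in a few lines of Python. The `ALUOp` encoding used here (0 = AND, 1 = OR, 2 = ADD) is an assumption made for the example; the slides do not give the encoding.

```python
def alu1(a, b, cin, op):
    """1-bit ALU: compute AND, OR and the full-adder sum, then mux one out."""
    and_out = a & b
    or_out = a | b
    sum_out = a ^ b ^ cin                    # full-adder sum bit
    cout = (a & b) | (cin & (a ^ b))         # full-adder carry out
    result = (and_out, or_out, sum_out)[op]  # the mux, selected by ALUOp
    return result, cout


def alu4(a, b, op, cin=0):
    """4-bit ALU: four 1-bit slices with the carry rippling upward."""
    result = 0
    carry = cin
    for i in range(4):
        bit, carry = alu1((a >> i) & 1, (b >> i) & 1, carry, op)
        result |= bit << i
    return result, carry


AND, OR, ADD = 0, 1, 2  # assumed ALUOp encoding
sum_small = alu4(0b0101, 0b0011, ADD)  # 5 + 3
sum_wrap = alu4(0b1111, 0b0001, ADD)   # 15 + 1 overflows 4 bits
```

The carry out is only meaningful when the selected operation is ADD; for AND and OR the adder still runs but its carry is ignored.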

Combinational Elements
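• Adder: inputs A and B (32 bits each) and Carry_In; outputs Sum (32 bits) and Carry
• MUX: inputs A and B (32 bits each) and Select; output Y (32 bits)
• ALU: inputs A and B (32 bits each) and OP; outputs Result (32 bits) and Zero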

D Latches
• Modified SR Latch
• Latches value when C is asserted

[Figure: D latch with inputs D and C and outputs Q and its complement]

D Flip Flop
• Uses Master/Slave D Latches

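[Figure: two D latches in series. D drives the first (master) latch, whose Q drives the D input of the second (slave) latch; CLK controls both latches, and the second latch provides Q and its complement.]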
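
A behavioural sketch of the master/slave arrangement. It assumes the master latch is transparent while CLK is high and the slave while CLK is low, so the output changes on the falling edge. The class names are invented for this example, and gate delays are ignored.

```python
class DLatch:
    """Level-sensitive latch: follows D while C is asserted, holds otherwise."""

    def __init__(self):
        self.q = 0

    def update(self, d, c):
        if c:
            self.q = d
        return self.q


class DFlipFlop:
    """Master/slave pair: master open on CLK = 1, slave open on CLK = 0."""

    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d, clk):
        m = self.master.update(d, clk)         # master samples D while CLK is high
        return self.slave.update(m, 1 - clk)   # slave passes it on when CLK is low


ff = DFlipFlop()
# Sequence of (D, CLK) pairs applied one after another.
trace = [ff.update(d, clk) for d, clk in [(1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]]
```

In the trace, Q picks up the 1 only when CLK falls, ignores the change of D while CLK stays low, and picks up the 0 at the next falling edge.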

Storage Element: Register


• Register
– Similar to D Flip Flop
• N bit input (Data In) and output (Data Out)
• Write Enable input
– Write Enable
• 0: Data Out will not change
• 1: Data Out will become Data In
– Data changes only on falling edge!



Storage Element: Reg File


• Register File consists of 32 registers
– Two 32 bit output busses
• busA and busB
– One 32 bit input bus
• busW
– Register 0 hard wired to value 0
– Register selected by
• RA selects register to put on busA
• RB selects register to put on busB
• RW selects register to be written via busW when Write Enable is 1
– Clock input (CLK)
• CLK input is a factor only for write operation
• During read, behaves as combinational logic block
– RA or RB stable => busA or busB valid after “access time”
– Minor simplification of reality
[Figure: block of 32 32-bit Registers with 5-bit selects RW, RA and RB, Write Enable and Clk inputs, 32-bit input busW, and 32-bit outputs busA and busB]
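
The register file's behaviour can be sketched as a small Python class. Reads are modelled as an ordinary function call (combinational), and writes happen only inside a method meant to be called once per clock edge. The class and method names are hypothetical, and access time is not modelled.

```python
class RegisterFile:
    """32 registers of 32 bits; register 0 is hard wired to 0."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: return (busA, busB) for selects RA and RB."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        """On a clock edge, write busW into register RW if Write Enable is 1."""
        if write_enable and rw != 0:            # writes to register 0 are dropped
            self.regs[rw] = bus_w & 0xFFFFFFFF  # keep the value to 32 bits


rf = RegisterFile()
rf.clock_edge(rw=3, bus_w=105, write_enable=1)  # stored
rf.clock_edge(rw=0, bus_w=7, write_enable=1)    # ignored: register 0 stays 0
rf.clock_edge(rw=3, bus_w=9, write_enable=0)    # ignored: Write Enable is 0
bus_a, bus_b = rf.read(ra=3, rb=0)
```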

Storage Element: Memory


• Memory
– One input bus: Data In
– One output bus: Data Out
– Address selection
• Address selects the word to put on Data Out
• To write to address, set Write Enable to 1
– Clock input (CLK)
• CLK input is a factor only for write operation
• During read, behaves as combinational logic block
– Valid Address => Data Out valid after “access time”
– Minor simplification of reality
[Figure: memory block with Address, Write Enable and Clk inputs, 32-bit Data In and 32-bit Data Out]

Some Logic Design…


• All storage elements have same clock
– Edge-triggered clocking
– “Instantaneous” state change (simplification!)
– Timing always works if the clock is slow enough
Cycle Time = Clk-to-Q + Longest Delay + Setup + Clock Skew

[Figure: timing diagram of Clk showing Setup and Hold windows around each clock edge, with Don’t Care regions in between]
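
The cycle-time relation on this slide can be checked with a short calculation. All four delay values below are hypothetical numbers chosen for illustration, and the function name is invented for this example.

```python
def min_cycle_time(clk_to_q, longest_delay, setup, clock_skew):
    """Cycle Time = Clk-to-Q + Longest Delay + Setup + Clock Skew."""
    return clk_to_q + longest_delay + setup + clock_skew


# Hypothetical delays in picoseconds.
cycle_ps = min_cycle_time(clk_to_q=50, longest_delay=800, setup=30, clock_skew=20)

# 1 GHz corresponds to a 1000 ps period, so frequency in GHz is 1000 / period.
max_freq_ghz = 1000 / cycle_ps
```

Any clock slower than this period still works, which is the sense in which timing always works if the clock is slow enough.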

Summary

• CPU design involves Datapath, Control


– 5 Stages for MIPS Instructions
1. Instruction Fetch
2. Instruction Decode & Register Read
3. ALU (Execute)
4. Memory
5. Register Write
• Datapath timing: single long clock cycle or one
short clock cycle per stage
