
Fault Collapsing


Hardware-based speculation

Dominating control dependencies
  Control dependencies are increasingly complex
  Branch prediction may not be enough to keep ILP high (especially with multiple issue)
Precise interrupts
  All previous instructions must be completed
  All instructions after the interrupt must behave as if they never started
Speculation: fetch, issue, and execute the target instructions of a predicted branch
  The branch prediction may be wrong: how should the processor state be updated?
Out-of-order completion
  Post-interrupt/mispredict writebacks change the state!

Speculating with Tomasulo

Modern processors such as the PowerPC 603/604, MIPS R10000, Intel Pentium 4, and Alpha 21264 extend Tomasulo's approach to support speculation
Key ideas:
  Separate execution from completion
  Add a final step, called instruction commit, at which an instruction that is no longer speculative is allowed to make register and memory updates
  Allow instructions to execute out of order but force them to commit in order
  Add a hardware buffer, called the reorder buffer (ROB), with registers to hold the result of an instruction between execution completion and commit

Hardware-Based Speculation
Instead of only fetching and decoding instructions, also execute them based on the branch prediction.
Execute instructions out of order as soon as their operands are available.
But hold the instruction commit operation until the branch is decided.
Reorder instructions after execution and commit them in order
  Using the reorder buffer (ROB)
  Register file is not updated until commit
Do not raise exceptions until the instruction is committed
The ROB holds and provides operands until commit.
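
The rule that exceptions are not raised until commit can be illustrated with a minimal sketch. It assumes a ROB modeled as a queue of dictionaries; the field names (`name`, `done`, `fault`, `mispredicted`) and the `PreciseException` class are hypothetical and chosen only for this illustration. A fault recorded by a wrong-path instruction disappears when the branch ahead of it turns out to be mispredicted.

```python
from collections import deque


class PreciseException(Exception):
    """Hypothetical exception type, raised only when a faulting instruction commits."""


def retire(rob):
    """Commit from the head in order; a fault is raised only when its instruction commits."""
    committed = []
    while rob and rob[0]["done"]:
        head = rob.popleft()
        if head.get("mispredicted"):
            # Wrong-path entries are discarded, together with any faults they recorded.
            rob.clear()
            committed.append(head["name"])
            break
        if head.get("fault"):
            raise PreciseException(head["name"])
        committed.append(head["name"])
    return committed


rob = deque([
    {"name": "ADDD", "done": True},
    {"name": "BNEZ", "done": True, "mispredicted": True},
    {"name": "DIVD", "done": True, "fault": "divide by zero"},  # executed speculatively
])
result = retire(rob)
```

Here `result` contains only `ADDD` and `BNEZ`, and no exception is raised, because the faulting `DIVD` was never committed.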

Hardware support
Reorder Buffer
  Use the reorder buffer slot number instead of the reservation station number (as the result tag) when execution completes
  Supplies operands between execution completion and commit
  The reorder buffer can also be used as an operand source => more registers, like the RS
  Once an instruction commits, its result is put into the register
  As a result, it is easy to undo speculated instructions on mispredicted branches or exceptions

[Figure: FP Op Queue, Reorder Buffer, FP Regs, and reservation stations feeding the FP adders]

ROB Data Structure

ROB entry fields
  Instruction type: branch, store, or register operation (i.e., ALU or load)
  State: indicates whether the instruction has completed execution and the value is ready
  Destination: where the result is to be written
    register number for a register operation (i.e., ALU or load)
    memory address for a store
    a branch has no destination result
  Value: holds the value of the instruction result until it is time to commit
Additional reservation station field
  Destination: corresponding ROB entry number
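
The fields listed above could be written down as a pair of record types. This is only a sketch: the class and attribute names (`ROBEntry`, `ReservationStation`, `dest_rob`, and so on) are assumptions made for illustration, not names used by any particular processor.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class InstrType(Enum):
    BRANCH = "branch"
    STORE = "store"
    REGISTER = "register"  # ALU operation or load


@dataclass
class ROBEntry:
    instr_type: InstrType
    ready: bool = False                      # State: execution finished, value valid
    dest: Optional[Union[str, int]] = None   # register name, memory address, or None for a branch
    value: Optional[float] = None            # result held here until commit


@dataclass
class ReservationStation:
    op: str
    dest_rob: int                            # extra field: ROB entry that receives the result


# Hypothetical entries for illustration
load = ROBEntry(InstrType.REGISTER, dest="F0")
branch = ROBEntry(InstrType.BRANCH)
rs = ReservationStation(op="ADDD", dest_rob=2)
```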

ROB & store buffer are combined into one
Executed instructions are held in the ROB until they are no longer speculative => instruction commit => commit in order

Four Steps of Speculative Tomasulo Algorithm

1. Issue: get instruction from Op Queue
   If a reservation station and a reorder buffer slot are free, issue the instruction and send the operands and the reorder buffer number for the destination (this stage is sometimes called dispatch)
2. Execution: operate on operands (EX)
   When both operands are ready, execute; if not ready, watch the CDB for the result; when both are in the reservation station, execute; checks RAW (sometimes called issue)
3. Write result: finish execution (WB)
   Write on the Common Data Bus to all awaiting FUs and the reorder buffer; mark the reservation station available.
4. Commit: update register with reorder result
   When the instruction is at the head of the reorder buffer and its result is present, update the register with the result (or store to memory) and remove the instruction from the reorder buffer. A mispredicted branch flushes the reorder buffer (sometimes called graduation)
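
The write-result step can be sketched as a broadcast of a (tag, value) pair, where the tag is a ROB number. The sketch below is a simplification under stated assumptions: the `RS` class, the dictionary-based `rob`, and the `write_result` function are hypothetical names, and the register file is deliberately left untouched because it changes only at commit.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RS:
    op: str
    dest: int                   # ROB number that will receive this result
    qj: Optional[int] = None    # ROB number producing operand j (None = value ready)
    qk: Optional[int] = None
    vj: Optional[float] = None
    vk: Optional[float] = None
    busy: bool = True

    def ready(self) -> bool:
        return self.busy and self.qj is None and self.qk is None


def write_result(tag: int, value: float, stations: List[RS],
                 rob: Dict[int, dict], producer: RS) -> None:
    """Broadcast (tag, value) on the CDB: waiting stations and the ROB capture it."""
    for rs in stations:
        if rs.busy and rs.qj == tag:
            rs.vj, rs.qj = value, None
        if rs.busy and rs.qk == tag:
            rs.vk, rs.qk = value, None
    rob[tag]["value"] = value
    rob[tag]["done"] = True
    producer.busy = False       # the reservation station becomes available


# Hypothetical state: a load tagged ROB1 and an ADDD waiting on it.
rob = {1: {"dest": "F0", "value": None, "done": False},
       2: {"dest": "F10", "value": None, "done": False}}
load = RS(op="LD", dest=1)
addd = RS(op="ADDD", dest=2, vj=3.0, qk=1)
stations = [load, addd]

write_result(tag=1, value=1.5, stations=stations, rob=rob, producer=load)
```

After the broadcast, the waiting `ADDD` holds both operands and can start executing, while the result for F0 sits in ROB entry 1 until it reaches the head of the buffer.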

Tomasulo with Speculation: 4-step execution

1) Issue: get instruction from the instruction queue
   Among the instructions there will be branch instructions and target instructions (fetched along the predicted branch path). These target instructions (along with stores) go into the ROB.
   If a reservation station and a reorder buffer slot are free, issue the instruction and send (to the RS) the operands and the reorder buffer # (so the RS can pick up the operand once ROB entry # has produced it)
2) Execute
   When both operands are ready, execute; if not ready, watch the CDB for the result
   When both operands are in the reservation station, execute; checks for RAW
   Cycles taken depend on the operation, e.g., Load = 2 cycles, Store = 1 cycle
3) Write result: finish execution (WB)
   Write on the Common Data Bus to all awaiting RSs and the reorder buffer; mark the reservation station as available (free). (Tags are now ROB #s, not RS #s.)
4) Commit: update register with reorder result
   Three sequences of actions when an instruction reaches the head of the ROB:
     Normal commit: write registers, in-order commit
     Store: update memory
     Branch with incorrect prediction: flush the ROB and the reservation stations, and restart execution at the correct PC
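
The three commit cases can be sketched as a single function acting on the head of the ROB. This is a minimal model under stated assumptions: entries are dictionaries with hypothetical keys (`type`, `dest`, `value`, `done`, `mispredicted`, `correct_pc`), and flushing the reservation stations is left out to keep the example short.

```python
from collections import deque
from typing import Deque, Dict


def commit(rob: Deque[dict], regs: Dict[str, float], mem: Dict[int, float]) -> str:
    """Try to commit the entry at the head of the ROB; return what happened."""
    if not rob or not rob[0]["done"]:
        return "stall"                      # head not finished: nothing younger may commit
    head = rob.popleft()
    if head["type"] == "branch":
        if head["mispredicted"]:
            rob.clear()                     # discard all younger, speculative entries
            return f"flush, restart at {head['correct_pc']}"
        return "branch ok"
    if head["type"] == "store":
        mem[head["dest"]] = head["value"]   # memory changes only at commit
        return "store"
    regs[head["dest"]] = head["value"]      # normal commit: write the register
    return "register"


regs, mem = {"F2": 0.0, "F4": 0.0}, {}
rob = deque([
    {"type": "reg", "dest": "F2", "value": 2.5, "done": True},
    {"type": "branch", "done": True, "mispredicted": True, "correct_pc": 0x40},
    {"type": "reg", "dest": "F4", "value": 9.9, "done": True},   # wrong path
    {"type": "store", "dest": 100, "value": 9.9, "done": True},  # wrong path
])
log = [commit(rob, regs, mem) for _ in range(3)]
```

In this run the first entry updates F2, the mispredicted branch empties the buffer, and the wrong-path write to F4 and the wrong-path store never reach the register file or memory.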

What are the hardware complexities with reorder buffer (ROB)?

[Figure: reorder table with fields Dest Reg, Result, Exceptions?, Valid, and Program Counter, searched through a compare network; surrounding blocks are the FP Op Queue, Reorder Buffer, FP Regs, reservation stations, and FP adders]

How do you find the latest version of a register?
  Need an associative comparison network
  Could use a future file, or just use the register result status buffer to track which specific reorder buffer entry has received the value
Need as many ports on the ROB as on the register file
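
The two ways of finding the latest version of a register can be compared in a short sketch. The function names (`newest_by_search`, `issue`) and the list and dictionary layout are assumptions for illustration; the instructions are a subset of the example used later in these slides, with the ROB numbers taken from that example.

```python
from typing import Dict, List, Optional


def newest_by_search(rob: List[dict], reg: str) -> Optional[int]:
    """Associative lookup: scan from newest to oldest for a matching destination."""
    for entry in reversed(rob):            # rob is ordered oldest -> newest
        if entry["dest"] == reg:
            return entry["rob"]
    return None                            # no in-flight writer: read the register file


def issue(rob: List[dict], status: Dict[str, int], rob_no: int, dest: str) -> None:
    """Allocate a ROB entry and record it as the latest producer of dest."""
    rob.append({"rob": rob_no, "dest": dest})
    status[dest] = rob_no                  # register result status table


rob: List[dict] = []
status: Dict[str, int] = {}
issue(rob, status, 1, "F0")    # LD   F0,10(R2)
issue(rob, status, 2, "F10")   # ADDD F10,F4,F0
issue(rob, status, 6, "F0")    # ADDD F0,F4,F6 (a later writer of F0)
```

Both approaches name ROB entry 6 as the newest producer of F0. The status table answers with one indexed read, whereas the search corresponds to the comparison network that has to examine every ROB entry.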

Notes
If a branch is mispredicted, recovery is done by flushing the ROB of all entries that appear after the mispredicted branch
  entries before the branch are allowed to continue
  restart the fetch at the correct branch successor
When an instruction commits or is flushed from the ROB, the corresponding slots become available for subsequent instructions
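
The recovery rule in these notes can be sketched with a ROB held as a list in program order. The function name `flush_after`, the string labels for the entries, and the capacity of seven entries are assumptions made for this illustration.

```python
from typing import List


def flush_after(rob: List[str], branch: str) -> List[str]:
    """Drop every entry younger than the mispredicted branch; return what was dropped."""
    idx = rob.index(branch)                # rob is ordered oldest -> newest
    squashed = rob[idx + 1:]
    del rob[idx + 1:]                      # older entries and the branch itself remain
    return squashed


ROB_SIZE = 7                               # assumed capacity (ROB1..ROB7)
rob = ["LD F0", "ADDD F10", "DIVD F2", "BNEZ F2", "LD F4", "ADDD F0", "SD F4"]
squashed = flush_after(rob, "BNEZ F2")
free_slots = ROB_SIZE - len(rob)
```

The three entries after the branch are removed and their slots become free, while the entries before the branch stay in the buffer and continue toward commit.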

Reservation station fields
  Op: Operation to perform in the unit (e.g., + or -)
  Qj, Qk: ROB # producing the source registers
  Vj, Vk: Value of source operands (temp regs for renaming)
  Busy: Indicates reservation station and FU is busy
ROB fields
  State: Which state? Issue, EX, WB, Commit
  Destination: Destination register #
  Value: Result of operation
  Busy: ROB entry is occupied
[Figure labels: ROB, Load buffer, FP register status]


Example


Register Renaming!


When Mult1 finishes, it sends the result on the CDB with #3 as the tag


For SUBD, the control logic checks F2 and finds that it was renamed to #2, so it looks at the CDB for #2 (which is being broadcast at this moment) and gets the data Mem[45+Regs[R3]]


Summary: Tomasulo with ROB


Speculative Tomasulo: another example

        LD    F0, 10(R2)
        ADDD  F10, F4, F0
        DIVD  F2, F10, F6
        BNEZ  F2, Exit
        LD    F4, 0(R3)
        ADDD  F0, F4, F9
        SD    F4, 0(R3)
Exit:

Speculative Tomasulo: another example

Reorder buffer (newest at top, oldest at bottom; ROB2 to ROB7 are free):
  Entry  Dest  Value  Instruction       Done?
  ROB1   F0           LD F0,10(R2)
Load buffer (from memory):
  Dest 1: 10+R2
Reservation stations (FP adders, FP multipliers): no entries
Other blocks in the figure: FP Op Queue, Registers, To Memory

Speculative Tomasulo: another example

Reorder buffer (newest at top, oldest at bottom; ROB3 to ROB7 are free):
  Entry  Dest  Value  Instruction       Done?
  ROB2   F10          ADDD F10,F4,F0    N
  ROB1   F0           LD F0,10(R2)      N
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
FP multiplier reservation stations: no entries
Load buffer (from memory):
  Dest 1: 10+R2

Speculative Tomasulo: another example

Reorder buffer (newest at top, oldest at bottom; ROB4 to ROB7 are free):
  Entry  Dest  Value  Instruction       Done?
  ROB3   F2           DIVD F2,F10,F6
  ROB2   F10          ADDD F10,F4,F0
  ROB1   F0           LD F0,10(R2)
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2

Skip some cycles

Speculative Tomasulo: another example

Reorder buffer (newest at top, oldest at bottom; ROB7 is free):
  Entry  Dest  Value  Instruction       Done?
  ROB6   F0           ADDD F0,F4,F6
  ROB5   F4           LD F4,0(R3)
  ROB4   --           BNE F2,<>
  ROB3   F2           DIVD F2,F10,F6
  ROB2   F10          ADDD F10,F4,F0
  ROB1   F0           LD F0,10(R2)
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD ROB5,R(F6)
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2
  Dest 5: 0+R3

Speculative Tomasulo: another example

Reorder buffer (newest at top, oldest at bottom):
  Entry  Dest  Value  Instruction       Done?
  ROB7   --    ROB5   SD 0(R3),F4
  ROB6   F0           ADDD F0,F4,F6
  ROB5   F4           LD F4,0(R3)
  ROB4   --           BNE F2,<>
  ROB3   F2           DIVD F2,F10,F6
  ROB2   F10          ADDD F10,F4,F0
  ROB1   F0           LD F0,10(R2)
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD ROB5,R(F6)
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2
  Dest 5: 0+R3

Speculative Tomasulo: another example

Reorder buffer (newest at top, oldest at bottom):
  Entry  Dest  Value  Instruction       Done?
  ROB7   --    M[10]  SD 0(R3),F4
  ROB6   F0           ADDD F0,F4,F6
  ROB5   F4    M[10]  LD F4,0(R3)
  ROB4   --           BNE F2,<>
  ROB3   F2           DIVD F2,F10,F6
  ROB2   F10          ADDD F10,F4,F0
  ROB1   F0           LD F0,10(R2)
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD M[10],R(F6)
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2

Speculative Tomasulo: another example

Reorder buffer (newest at top, oldest at bottom):
  Entry  Dest  Value  Instruction       Done?
  ROB7   --    M[10]  SD 0(R3),F4
  ROB6   F0    ---    ADDD F0,F4,F6     Ex
  ROB5   F4    M[10]  LD F4,0(R3)
  ROB4   --           BNE F2,<>
  ROB3   F2           DIVD F2,F10,F6
  ROB2   F10          ADDD F10,F4,F0
  ROB1   F0           LD F0,10(R2)
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2

Precise State with ROB

The ROB maintains precise state and allows speculation
  Waits until the precise condition reaches the retire/commit stage (or until a branch is noted as mispredicted)
  Clear the ROB, RS, and register status table (flush)
  Service the exception and then restart from the true branch target
Need to do similar things with memory ops
  Called the Memory Ordering Buffer (MOB)
  Completed stores are written to the MOB, then complete (write to memory) in order (when they reach the head of the buffer)
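
The in-order completion of stores can be sketched as a queue that drains only from its head. The `StoreBuffer` class, its method names, and the addresses used are hypothetical; a real MOB also tracks loads and address conflicts, which this sketch leaves out.

```python
from collections import deque
from typing import Deque, Dict, List


class StoreBuffer:
    """Stores are queued in program order and drain to memory only from the head."""

    def __init__(self) -> None:
        self.queue: Deque[dict] = deque()

    def allocate(self, addr: int) -> dict:
        entry = {"addr": addr, "value": None, "done": False}
        self.queue.append(entry)
        return entry

    def drain(self, mem: Dict[int, float]) -> List[int]:
        """Write completed stores to memory, stopping at the first incomplete one."""
        written = []
        while self.queue and self.queue[0]["done"]:
            st = self.queue.popleft()
            mem[st["addr"]] = st["value"]
            written.append(st["addr"])
        return written


mem: Dict[int, float] = {}
mob = StoreBuffer()
first, second = mob.allocate(0x10), mob.allocate(0x20)
second.update(value=2.0, done=True)     # the younger store finishes first
early = mob.drain(mem)                  # nothing written: the older store is not done
first.update(value=1.0, done=True)
late = mob.drain(mem)                   # both written, oldest first
```

Even though the younger store has its data first, memory is updated only once the older store at the head of the buffer is complete.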

Tomasulo + ROB Summary

Many implementations are very similar
  Pentium 4, PowerPC, etc.
Some limitations
  Too many value copy operations
    Register file => RS => ROB => Register file
  Too many muxes/buses (CDB)
    Values are coming from everywhere to everywhere else!
  Reservation stations mix values (data) and tags (control)
    More complex logic
