Fault Collapsing
Hardware-Based Speculation
Go beyond instruction fetch and decode: also execute instructions along the predicted branch path.
Execute instructions out of order, as soon as their operands are available.
Hold each instruction's commit until the branch outcome is known.
Reorder instructions after execution and commit them in program order, using a reorder buffer (ROB).
The register file is not updated until commit.
Exceptions are not raised until the instruction commits.
The ROB holds results and supplies them as operands until commit.
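The points above can be illustrated with a minimal Python sketch. It is not a model of any particular processor: the class and method names (`ReorderBuffer`, `issue`, `complete`, `commit`) and the numeric values are assumptions made for illustration. It shows only that results may arrive out of order while the register file is updated strictly in program order.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    instr: str                      # instruction text, for display only
    dest: Optional[str]             # destination register, or None
    value: Optional[float] = None   # result, held here until commit
    done: bool = False              # True once execution has completed


class ReorderBuffer:
    """Hypothetical ROB: entries are kept oldest-first."""

    def __init__(self):
        self.entries = deque()
        self.regs = {}              # architectural register file

    def issue(self, instr, dest):
        entry = RobEntry(instr, dest)
        self.entries.append(entry)  # allocated in program order
        return entry

    def complete(self, entry, value):
        entry.value = value         # result is buffered, registers untouched
        entry.done = True

    def commit(self):
        """Retire finished instructions from the head, in program order."""
        committed = []
        while self.entries and self.entries[0].done:
            entry = self.entries.popleft()
            if entry.dest is not None:
                self.regs[entry.dest] = entry.value
            committed.append(entry.instr)
        return committed


rob = ReorderBuffer()
divd = rob.issue("DIVD F2,F10,F6", "F2")   # older, slow instruction
addd = rob.issue("ADDD F0,F4,F6", "F0")    # younger, independent of DIVD here
rob.complete(addd, 3.0)                    # younger one finishes first
first = rob.commit()                       # head not done: nothing commits
rob.complete(divd, 1.5)
second = rob.commit()                      # now both retire, oldest first
```

The younger `ADDD` result sits in the buffer until the older `DIVD` has finished, so the register file never sees results out of program order.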
Hardware support
Reorder Buffer
When execution completes, results are tagged with the reorder buffer slot number instead of the reservation station number.
The reorder buffer supplies operands between execution completion and commit.
The reorder buffer can therefore also serve as an operand source, acting as additional registers, much like the reservation stations (RS).
Once an instruction commits, its result is written to the register file.
As a result, speculated instructions are easy to undo on mispredicted branches or exceptions.
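How an issuing instruction picks its operand sources can be sketched as follows. This is an illustrative sketch only: the helper names (`source_operand`, `issue`) and the two tables (`reg_status`, `rob_results`) are assumptions, and operands are represented as strings such as `R(F4)` (read the register file) or `ROB1` (wait for that ROB slot).

```python
def source_operand(reg, reg_status, rob_results):
    """Decide where an operand comes from at issue time."""
    slot = reg_status.get(reg)
    if slot is None:
        return f"R({reg})"        # no pending writer: read the register file
    result = rob_results.get(slot)
    if result is not None:
        return result             # executed but not committed: ROB supplies it
    return f"ROB{slot}"           # still executing: wait on the ROB slot tag


def issue(dest, sources, slot, reg_status, rob_results):
    """Allocate ROB slot `slot` and rename the destination register to it."""
    operands = [source_operand(r, reg_status, rob_results) for r in sources]
    rob_results[slot] = None      # result not yet available
    if dest is not None:
        reg_status[dest] = slot   # later readers of dest now wait on this slot
    return operands


reg_status, rob_results = {}, {}
# LD F0,10(R2): the integer address operand R2 is ignored in this sketch.
issue("F0", [], 1, reg_status, rob_results)
addd = issue("F10", ["F4", "F0"], 2, reg_status, rob_results)  # ADDD F10,F4,F0
divd = issue("F2", ["F10", "F6"], 3, reg_status, rob_results)  # DIVD F2,F10,F6
```

Under these assumptions `addd` comes out as `R(F4)` and `ROB1`, and `divd` as `ROB2` and `R(F6)`: each source is either a committed register value or the slot number of the instruction that will produce it.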
[Figure: datapath with FP Op Queue, Reorder Buffer, FP Regs, Res Stations, and FP Adders]
ROB
Executed instructions are held in the ROB until they are no longer speculative.
Instructions commit in order.
[Figure: Reorder Table with entry fields Dest Reg, Result, Exceptions?, Valid, and Program Counter, plus a compare network, alongside the FP Op Queue, FP Regs, Res Stations, and FP Adders]
Notes
If a branch is mispredicted, recovery flushes all ROB entries that come after the mispredicted branch.
Entries before the branch are allowed to continue.
Fetch restarts at the correct branch successor.
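The recovery rule above can be sketched in a few lines of Python. The function name `flush_after_branch` and the list-of-tuples representation are assumptions made for illustration. A real design would also have to discard reservation-station and register-status state belonging to the squashed entries, which is omitted here.

```python
def flush_after_branch(rob, branch_slot):
    """Split an oldest-first ROB at a mispredicted branch.

    rob is a list of (slot_number, instruction) pairs in program order.
    Returns (kept, squashed): entries up to and including the branch
    survive, everything younger than the branch is discarded.
    """
    idx = next(i for i, (slot, _) in enumerate(rob) if slot == branch_slot)
    return rob[:idx + 1], rob[idx + 1:]


rob = [
    (1, "LD F0,10(R2)"),
    (2, "ADDD F10,F4,F0"),
    (3, "DIVD F2,F10,F6"),
    (4, "BNE F2,<>"),        # assume this branch turns out to be mispredicted
    (5, "LD F4,0(R3)"),
    (6, "ADDD F0,F4,F6"),
    (7, "SD 0(R3),F4"),
]
kept, squashed = flush_after_branch(rob, branch_slot=4)
```

Because speculative results live only in the ROB, dropping the younger entries is enough to undo them; the register file was never written.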
[Figure: ROB, load buffer, and FP register status tables]
Example
Register Renaming!
Example code sequence:
LD F0,10(R2)
ADDD F10,F4,F0
DIVD F2,F10,F6
BNE F2,Exit
LD F4,0(R3)
ADDD F0,F4,F9
SD 0(R3),F4
Exit:
ROB state after LD F0,10(R2) issues:
Reorder Buffer (ROB1 is the oldest entry):
  ROB1: LD F0,10(R2), Dest F0
Load buffer (from memory):
  Dest 1: 10+R2
ROB state after ADDD F10,F4,F0 issues:
Reorder Buffer (ROB1 is the oldest entry):
  ROB2: ADDD F10,F4,F0, Dest F10, Done? N
  ROB1: LD F0,10(R2), Dest F0, Done? N
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
Load buffer (from memory):
  Dest 1: 10+R2
ROB state after DIVD F2,F10,F6 issues:
Reorder Buffer (ROB1 is the oldest entry):
  ROB3: DIVD F2,F10,F6, Dest F2
  ROB2: ADDD F10,F4,F0, Dest F10
  ROB1: LD F0,10(R2), Dest F0
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2
ROB state after BNE, LD F4,0(R3), and ADDD F0,F4,F6 issue:
Reorder Buffer (ROB1 is the oldest entry):
  ROB6: ADDD F0,F4,F6, Dest F0
  ROB5: LD F4,0(R3), Dest F4
  ROB4: BNE F2,<>, Dest --
  ROB3: DIVD F2,F10,F6, Dest F2
  ROB2: ADDD F10,F4,F0, Dest F10
  ROB1: LD F0,10(R2), Dest F0
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD ROB5,R(F6)
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2
  Dest 5: 0+R3
ROB state after SD 0(R3),F4 issues:
Reorder Buffer (ROB1 is the oldest entry):
  ROB7: SD 0(R3),F4, Dest --, Value ROB5
  ROB6: ADDD F0,F4,F6, Dest F0
  ROB5: LD F4,0(R3), Dest F4
  ROB4: BNE F2,<>, Dest --
  ROB3: DIVD F2,F10,F6, Dest F2
  ROB2: ADDD F10,F4,F0, Dest F10
  ROB1: LD F0,10(R2), Dest F0
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD ROB5,R(F6)
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2
  Dest 5: 0+R3
ROB state after LD F4,0(R3) completes:
Reorder Buffer (ROB1 is the oldest entry):
  ROB7: SD 0(R3),F4, Dest --, Value M[10]
  ROB6: ADDD F0,F4,F6, Dest F0
  ROB5: LD F4,0(R3), Dest F4, Value M[10]
  ROB4: BNE F2,<>, Dest --
  ROB3: DIVD F2,F10,F6, Dest F2
  ROB2: ADDD F10,F4,F0, Dest F10
  ROB1: LD F0,10(R2), Dest F0
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD M[10],R(F6)
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2
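The change in this step, where every pending reference to ROB5 is replaced by the loaded value, can be sketched as a result broadcast. The function name `broadcast` and the dictionary layout are assumptions made for illustration; the strings `ROB5`, `M[10]`, and `R(F6)` are taken from the example.

```python
def broadcast(tag, value, reservation_stations, rob):
    """Deliver a completed result to everything waiting on its ROB tag."""
    # Waiting reservation stations capture the value in place of the tag.
    for station in reservation_stations:
        station["operands"] = [value if op == tag else op
                               for op in station["operands"]]
    # ROB entries waiting on the tag (such as a store's data) capture it too.
    for entry in rob.values():
        if entry["value"] == tag:
            entry["value"] = value
    # The producing entry holds the result until it commits.
    rob[tag]["value"] = value


rob = {
    "ROB5": {"instr": "LD F4,0(R3)", "value": None},
    "ROB6": {"instr": "ADDD F0,F4,F6", "value": None},
    "ROB7": {"instr": "SD 0(R3),F4", "value": "ROB5"},
}
stations = [{"dest": 6, "op": "ADDD", "operands": ["ROB5", "R(F6)"]}]

broadcast("ROB5", "M[10]", stations, rob)
```

After the call, the `ADDD` station holds `M[10]` and `R(F6)` and can begin executing, while register F4 itself stays unchanged until the load commits.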
ROB state while ADDD F0,F4,F6 executes:
Reorder Buffer (ROB1 is the oldest entry):
  ROB7: SD 0(R3),F4, Dest --, Value M[10]
  ROB6: ADDD F0,F4,F6, Dest F0, Value ---, Done? Ex
  ROB5: LD F4,0(R3), Dest F4, Value M[10]
  ROB4: BNE F2,<>, Dest --
  ROB3: DIVD F2,F10,F6, Dest F2
  ROB2: ADDD F10,F4,F0, Dest F10
  ROB1: LD F0,10(R2), Dest F0
FP adder reservation stations:
  Dest 2: ADDD R(F4),ROB1
FP multiplier reservation stations:
  Dest 3: DIVD ROB2,R(F6)
Load buffer (from memory):
  Dest 1: 10+R2