Iterative Data Flow Analysis
r4 = 4            r4 is dead, as it is redefined.
r6 = 8            So is r6. r2, r3, r5 are live.
r6 = r2 + r3
r7 = r4 - r5      r6 = r4 - r5 is useless: it produces a dead value. Eliminate it.
Computation of use[ ] and def[ ]
for each basic block B do
    def[B] = Φ ; use[B] = Φ ;
    for each statement (x := y op z) in sequential order, do
        for each operand y, do
            if (y not in def[B])
                use[B] = use[B] ∪ {y}
        def[B] = def[B] ∪ {x}
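As a minimal sketch of the loop above, the following Python function computes use[B] and def[B] for one basic block. The statement encoding (a target name plus a list of operand names) and the name compute_use_def are assumptions made for illustration; they do not come from the slides.

```python
def compute_use_def(block):
    """Return (use, defs) for one basic block.

    block is a list of (target, operands) pairs, one per statement
    `target := operands[0] op operands[1]` (hypothetical encoding).
    """
    use, defs = set(), set()
    for target, operands in block:
        for y in operands:
            # y is read before any definition of y inside the block
            if y not in defs:
                use.add(y)
        defs.add(target)
    return use, defs


# r6 := r2 + r3 ; r7 := r6 - r5
block = [("r6", ["r2", "r3"]), ("r7", ["r6", "r5"])]
use, defs = compute_use_def(block)
print(sorted(use), sorted(defs))
```

Here r6 does not enter use[B] because it is defined in the block before it is read.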
Before                After
                      u  := 4 * i
t2 := 4 * i           t2 := u
t3 := a[t2]           t3 := a[t2]
t6 := 4 * i           t6 := u
t7 := a[t6]           t7 := a[t6]
Example – Value numbering based
(15) := 4 * i
(18) := a[t2]
t6 := (15)
t7 := (18)
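The value-numbering example can be sketched in Python as follows. This is a simplified local value numbering for a single basic block, assuming there are no stores to the array inside the block; the tuple encoding, the "load" and "copy" operation names, and the function name are all hypothetical.

```python
def local_value_numbering(block):
    """Rewrite repeated expressions in one basic block as copies.

    block is a list of (target, op, args) tuples. Assumes no statement
    in the block stores to memory, so repeated loads are redundant.
    """
    value_of = {}   # name (variable or literal) -> value number
    holders = {}    # (op, arg value numbers) -> (value number, variable)
    counter = 0
    rewritten = []
    for target, op, args in block:
        arg_vns = []
        for a in args:
            if a not in value_of:
                value_of[a] = counter
                counter += 1
            arg_vns.append(value_of[a])
        key = (op, tuple(arg_vns))
        hit = holders.get(key)
        # Reuse only if the holder variable still contains that value.
        if hit is not None and value_of.get(hit[1]) == hit[0]:
            vn, holder = hit
            rewritten.append((target, "copy", [holder]))
        else:
            vn = counter
            counter += 1
            holders[key] = (vn, target)
            rewritten.append((target, op, args))
        value_of[target] = vn
    return rewritten


block = [
    ("t2", "*", ["4", "i"]),
    ("t3", "load", ["a", "t2"]),
    ("t6", "*", ["4", "i"]),
    ("t7", "load", ["a", "t6"]),
]
result = local_value_numbering(block)
for stmt in result:
    print(stmt)
```

In this sketch t6 becomes a copy of t2 and t7 a copy of t3, which mirrors t6 := (15) and t7 := (18) in the example.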
Copy Propagation
• Copies are generated by the elimination of common sub-expressions
• S: x := y
• Determine where this value of ‘x’ is used
• Substitute ‘y’ for ‘x’ in each statement ‘u’ that uses ‘x’
Copy Propagation
• Statement ‘s’ must be the only definition of ‘x’ reaching ‘u’; ud-chains can be used to check this
• On every path from ‘s’ to ‘u’ there must be no assignment to ‘y’; checking this requires a new data-flow analysis problem
Forward Copy Propagation
• Forward propagation of the RHS of assignments or moves.
Before              After
r1 := r2            r1 := r2
.                   .
.                   .
.                   .
r4 := r1 + 1        r4 := r2 + 1
• Reduces the chain of dependencies
• May create dead code (r1 := r2 is dead if ‘r1’ has no other use)
Copy Propagation
• c_out[B] = c_gen[B] ∪ (c_in[B] - c_kill[B])
• c_in[B] = ∩ c_out[P], over all predecessors P of B
• c_in[B1] = Φ, where B1 is the initial block
• These equations have the same form as those for available expressions and are solved by the same iterative algorithm.
Example
B1: x := y
B2: y :=
B3: x := z
B4: := x
B5: := x

c_gen[B1] = {x := y}
c_gen[B3] = {x := z}
c_kill[B2] = {x := y}
c_kill[B1] = {x := z}
c_kill[B3] = {x := y}
All other c_gen and c_kill are Φ
No copy of x := y or x := z reaches B5
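The equations can be solved iteratively, as in this minimal Python sketch run on the example's c_gen and c_kill sets. The edges of the flow graph are not recoverable from the text, so the predecessor lists below (B1 to B2 and B3, B3 to B4, B2 and B4 to B5) are an assumption, as are the function and variable names.

```python
def copy_dataflow(blocks, preds, entry, c_gen, c_kill):
    """Iteratively solve the copy-propagation equations.

    Assumes every block other than `entry` has at least one predecessor.
    """
    universe = set().union(*c_gen.values())
    c_in = {b: set() for b in blocks}
    # Optimistic start for an intersection problem.
    c_out = {b: universe - c_kill[b] for b in blocks}
    c_out[entry] = set(c_gen[entry])      # c_in[entry] is empty
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            c_in[b] = set.intersection(*(c_out[p] for p in preds[b]))
            new_out = c_gen[b] | (c_in[b] - c_kill[b])
            if new_out != c_out[b]:
                c_out[b] = new_out
                changed = True
    return c_in, c_out


blocks = ["B1", "B2", "B3", "B4", "B5"]
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B3"], "B5": ["B2", "B4"]}  # assumed edges
c_gen = {"B1": {"x:=y"}, "B2": set(), "B3": {"x:=z"}, "B4": set(), "B5": set()}
c_kill = {"B1": {"x:=z"}, "B2": {"x:=y"}, "B3": {"x:=y"}, "B4": set(), "B5": set()}

c_in, c_out = copy_dataflow(blocks, preds, "B1", c_gen, c_kill)
print(c_in["B4"], c_in["B5"])
```

Under these assumed edges, c_in[B5] comes out empty, which agrees with the statement that neither copy reaches B5.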
Algorithm – Copy Propagation
• Input: A flow graph with ud-chains giving the definitions reaching each use, c_in[B] for each block B, and du-chains giving the uses of each definition
• Output: A revised flow graph
Algorithm
• For each copy ‘s’: x := y, do the following
• Determine the uses of ‘x’ that are reached by this definition
• Determine whether, for every such use of ‘x’, ‘s’ is in c_in[B], where B is the block containing the use, and no definitions of ‘x’ or ‘y’ occur prior to this use within B
• If ‘s’ meets this condition, remove ‘s’ and replace all these uses of ‘x’ by ‘y’
Constant Propagation
• Forward propagation of moves/assignments of the form
d: rx := L, where L is a literal
• Replace “rx” with “L” wherever possible.
• d must be available at the point of replacement.
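Within a single basic block, availability of d can be tracked simply by forgetting the literal whenever rx is redefined. The sketch below shows only that local case; the statement encoding and the name propagate_constants are hypothetical.

```python
def propagate_constants(block):
    """Forward-propagate literals through one basic block.

    block is a list of (target, op, args); op "const" means
    `target := literal`. Variables are strings, literals are ints.
    """
    known = {}       # variable -> literal it is known to hold here
    result = []
    for target, op, args in block:
        # Replace each operand whose defining literal is still available.
        new_args = [known.get(a, a) if isinstance(a, str) else a for a in args]
        result.append((target, op, new_args))
        # This definition kills any earlier fact about target.
        known.pop(target, None)
        if op == "const":
            known[target] = args[0]
    return result


block = [
    ("r1", "const", [5]),
    ("r2", "+", ["r1", "r3"]),
    ("r1", "+", ["r1", 1]),
    ("r4", "+", ["r1", "r2"]),
]
result = propagate_constants(block)
for stmt in result:
    print(stmt)
```

The last statement keeps r1 as an operand because r1 := 5 is no longer available after r1 is redefined.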
Unreachable Code Elimination
Mark initial BB visited
to_visit = initial BB
while (to_visit not empty)
    current = to_visit.pop()
    for each successor block of current
        if successor is not visited
            Mark successor as visited;
            to_visit += successor
        endif
    endfor
endwhile
Eliminate all unvisited blocks
Figure: CFG with blocks bb1, bb2, bb3, bb4, bb5. Which BB(s) can be deleted?
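A minimal Python sketch of this worklist traversal follows. The edges of the bb1..bb5 figure are not recoverable from the text, so the successor map below is a made-up example, and the function name is hypothetical.

```python
def unreachable_blocks(succs, entry):
    """Return the set of blocks that cannot be reached from entry.

    succs maps every block name to the list of its successors.
    """
    visited = {entry}
    to_visit = [entry]
    while to_visit:
        current = to_visit.pop()
        for s in succs[current]:
            # Checking `visited` keeps the loop finite on cyclic graphs.
            if s not in visited:
                visited.add(s)
                to_visit.append(s)
    return set(succs) - visited


# Hypothetical CFG: bb2 and bb4 have no path from bb1.
succs = {
    "bb1": ["bb3"],
    "bb2": ["bb4"],
    "bb3": ["bb5", "bb1"],
    "bb4": ["bb5"],
    "bb5": [],
}
dead = unreachable_blocks(succs, "bb1")
print(sorted(dead))
```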
Loop invariant computation
• ud-chains can be used to find values that do not change as long as control stays within the loop
• A loop has at least one path back to the header from any block in the loop
• If x := y+z is at a position in the loop where all possible definitions of ‘y’ and ‘z’ are outside the loop, then y+z is loop invariant
Algorithm
• Input: A loop L consisting of a set of basic blocks, each containing a sequence of three-address statements
• Output: The set of three-address statements that compute the same value each time they are executed, from the time control enters the loop L until control leaves L
Algorithm
• Mark ‘invariant’ the statements whose operands are all either constants or have all their reaching definitions outside L
• Repeat the following step until some repetition marks no new statements ‘invariant’
• Mark ‘invariant’ all those statements not previously so marked, all of whose operands are constants, or have all their reaching definitions outside L, or have exactly one reaching definition, and that definition is a statement in L marked ‘invariant’
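The marking procedure can be sketched as a fixed-point loop. The encoding here is an assumption: statements carry ids, reaching definitions are given per (statement, operand) use, and any definition id that is not a loop statement is taken to lie outside L.

```python
def find_invariants(loop_stmts, reaching):
    """Return the loop-invariant statement ids in the order found.

    loop_stmts: dict id -> (target, operands) for statements inside L.
    reaching:   dict (id, operand) -> set of ids of the definitions
                reaching that use; ids not in loop_stmts are outside L.
    Integer operands are treated as constants (hypothetical encoding).
    """
    invariant = []

    def operand_ok(sid, operand):
        if isinstance(operand, int):
            return True
        defs = reaching[(sid, operand)]
        if all(d not in loop_stmts for d in defs):
            return True
        # Exactly one reaching definition, already marked invariant.
        return len(defs) == 1 and next(iter(defs)) in invariant

    changed = True
    while changed:
        changed = False
        for sid, (_target, operands) in loop_stmts.items():
            if sid not in invariant and all(operand_ok(sid, o) for o in operands):
                invariant.append(sid)
                changed = True
    return invariant


loop_stmts = {
    "s1": ("t", ["n", 4]),      # t := n * 4, n defined outside L
    "s2": ("u", ["t", 1]),      # u := t + 1
    "s3": ("i", ["i", 1]),      # i := i + 1
    "s4": ("w", ["u", "i"]),    # w := u + i
}
reaching = {
    ("s1", "n"): {"d0"},
    ("s2", "t"): {"s1"},
    ("s3", "i"): {"d1", "s3"},
    ("s4", "u"): {"s2"},
    ("s4", "i"): {"s3"},
}
inv = find_invariants(loop_stmts, reaching)
print(inv)
```

The returned list preserves the order in which statements were marked, which is the order a later code-motion step would use.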
Performing Code motion
• Applied to statements found to be loop invariant
• Statements are moved to the pre-header of the loop
• Some conditions need to be checked and applied to perform code
motion
Conditions for moving ‘s’: x := y+z
• The block containing ‘s’ must dominate all exit nodes of the loop, where an exit node is a node in the loop with a successor that is not in the loop
• No other statement in the loop assigns to ‘x’
• No use of ‘x’ in the loop is reached by any definition of ‘x’ other than ‘s’
Algorithm
• Input: a loop L with ud-chain information and dominator information
• Output: a revised loop with a pre-header and some statements
moved to the pre-header (if any)
Algorithm
• Identify the loop-invariant statements
• For each such statement ‘s’ defining ‘x’, check
• That it is in a block that dominates all exits of L
• That ‘x’ is not defined elsewhere in L
• That all uses of ‘x’ in L can be reached only by the definition of ‘x’ in ‘s’
Algorithm
• Move, in the order found by loop invariant algorithm, each statement
‘s’ to a newly created pre-header
Illegal code motion
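Before code motion:
B1:  i := 1
B2:  if u < v goto B3
B3:  i := 2
     u := u+1
B4:  v := v-1
     if v <= 20 goto B5
B5:  j := i

After moving i := 2 to the pre-header B6:
B1:  i := 1
B6:  i := 2
B2:  if u < v goto B3
B3:  u := u+1
B4:  v := v-1
     if v <= 20 goto B5
B5:  j := i

The motion is illegal because B3 does not dominate the loop exit B4: when B3 is never executed, the original code leaves j = 1 but the transformed code sets j = 2.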
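The difference can be observed by simulating both versions. The sketch below assumes the usual reading of the flow graph, which the text does not spell out: B2 falls through to B4 when the test fails, and B4 branches back to B2 when v > 20. The function name and the `hoisted` flag are illustrative only.

```python
def run(u, v, hoisted):
    """Simulate the loop; hoisted=True places i := 2 in pre-header B6."""
    i = 1                      # B1
    if hoisted:
        i = 2                  # B6: pre-header after the illegal motion
    while True:
        if u < v:              # B2
            if not hoisted:
                i = 2          # B3: original position of i := 2
            u = u + 1          # B3
        v = v - 1              # B4
        if v <= 20:
            break              # leave the loop for B5
    j = i                      # B5
    return j


# B3 is never executed for these inputs, so the two versions disagree.
print(run(5, 3, hoisted=False), run(5, 3, hoisted=True))
```

When B3 does execute at least once (for example u = 1, v = 30), both versions return the same value, which is why the problem only shows up on paths that bypass B3.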
Loop Transformation and Aliases
Induction variable elimination
• A variable ‘x’ is an induction variable if every time its value changes, it is incremented or decremented by some constant ‘c’
• Look for basic induction variables, whose only assignments in the loop have the form i := i ± c
• Look for derived induction variables ‘j’, which are defined in terms of a basic induction variable ‘i’
Induction variable identification algorithm
• Input: A loop L with reaching definitions and loop-invariant
computation
• Output: a set of induction variables
Algorithm
• Find the basic induction variables using the loop-invariant computation; each basic induction variable i has the triple (i, 1, 0)
• Search for a variable k with a single assignment in L of one of the following forms
• k := j * b, k := b * j, k := j / b, k := j ± b, k := b ± j
• where b is a constant and j is an induction variable
Algorithm
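• For k := j * b with j a basic induction variable, the triple for k is (j, b, 0); the other forms are handled similarly
• Compute the triple and add k to the list of induction variables
• Rewrite the computation of each derived induction variable to use addition / subtraction instead of multiplication / division
• This replacement is called strength reduction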
Strength reduction – induction variables
• Input: A loop L with reaching definition information and induction
variables computed
• Output: A revised loop
Algorithm
• For each basic induction variable i in turn, and for every induction variable j in the family of i with triple (i, c, d), that is, j = c * i + d:
• Create a new variable ‘s’
• Replace the assignment to j by j := s
• Immediately after each assignment i := i + n in L, where n is a constant, append
• s := s + c * n
Algorithm
• Place ‘s’ in the family of ‘i’ with triple (i, c, d)
• Ensure that s is initialized to c * i + d on entry to the loop, by placing at the end of the pre-header:
• s := c * i
• s := s + d
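To see that the rewritten loop computes the same values, the two versions can be compared directly. This is an illustrative check, not a compiler pass; the function names and parameters are assumptions, with i0 the value of i on loop entry and n the constant step.

```python
def original(iterations, i0, n, c, d):
    """Loop body: i := i + n ; j := c * i + d."""
    i, js = i0, []
    for _ in range(iterations):
        i = i + n
        js.append(c * i + d)
    return js


def reduced(iterations, i0, n, c, d):
    """Same loop after strength reduction with the new variable s."""
    i = i0
    s = c * i              # pre-header: s := c * i
    s = s + d              # pre-header: s := s + d
    js = []
    for _ in range(iterations):
        i = i + n
        s = s + c * n      # c * n is a compile-time constant
        js.append(s)       # j := s
    return js


print(original(5, 0, 1, 4, 0))
print(reduced(5, 0, 1, 4, 0))
```

The multiplication inside the loop is replaced by one addition per iteration; the only multiplication left is the one-time initialization in the pre-header.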
Quicksort CFG (Control Flow Graph)
B1:  i := m-1
     j := n
     t1 := 4*n
     v := a[t1]
B2:  i := i+1
     t2 := 4*i
     t3 := a[t2]
     if t3 < v goto B2
B3:  j := j-1
     t4 := 4*j
     t5 := a[t4]
     if t5 > v goto B3
B4:  if i >= j goto B6
B5:  goto B2
B6:
Example
• B2 and B3 are inner loops
• In B3, ‘j’ is the basic induction variable and t4 is in its family with triple (j, 4, 0)
• A new variable s4 is constructed
• t4 := 4 * j is replaced with t4 := s4
• The assignment s4 := s4 - 4 is inserted after j := j-1
Quicksort CFG (Control Flow Graph) after strength reduction
B1:  i := m-1
     j := n
     t1 := 4*n
     v := a[t1]
     s2 := 4 * i
     s4 := 4 * j
B2:  i := i+1
     s2 := s2 + 4
     t2 := s2
     t3 := a[t2]
     if t3 < v goto B2
B3:  j := j-1
     s4 := s4 - 4
     t4 := s4
     t5 := a[t4]
     if t5 > v goto B3
B4:  if i >= j goto B6
B5:  goto B2
B6:
Elimination of induction variable
• Input: A loop L with reaching definition information, loop-invariant
computation and live variable information
• Output: a revised loop
Algorithm
• Take some induction variable ‘j’ in the family of ‘i’ with triple (i, c, d), and modify each test in which ‘i’ appears to use ‘j’ instead. Assume ‘c’ is positive.
• A test of the form if i relop x goto B is replaced by
• r := c*x, r := r+d, if j relop r goto B, where r is a new temporary
Algorithm
• A test if i1 relop i2 goto B on two induction variables is likewise replaced by if j1 relop j2 goto B, provided j1 and j2 have triples (i1, c, d) and (i2, c, d) with the same c and d
• Delete all assignments to the eliminated induction variables from the loop L
• Consider every new statement j := s introduced by strength reduction
• Verify that there is no assignment to ‘s’ between the introduced statement and any use of ‘j’
• Replace all uses of j by uses of ‘s’ and delete j := s
Quicksort CFG (Control Flow Graph) after induction variable elimination
B1:  i := m-1
     j := n
     t1 := 4*n
     v := a[t1]
     s2 := 4 * i
     s4 := 4 * j
B2:  s2 := s2 + 4
     t3 := a[s2]
     if t3 < v goto B2
B3:  s4 := s4 - 4
     t5 := a[s4]
     if t5 > v goto B3
B4:  if s2 >= s4 goto B6
B5:  goto B2
B6:
Dealing with aliases
• If two or more expressions denote the same memory address we say
that the expressions are aliases of one another
• Presence of pointers makes data-flow analysis more complex
• If nothing is known about what a pointer ‘p’ can point to, the safe assumption is that an indirect assignment through a pointer can potentially change any variable
Dealing with aliases
• Consider a language having elementary data types
• If pointer ‘p’ points to a primitive data element, then any arithmetic operation on ‘p’ produces a value that may be an integer, but not a pointer
Dealing with aliases
• If ‘p’ points to an array, addition or subtraction leaves ‘p’ pointing somewhere in the same array
• If pointer arithmetic can move ‘p’ into another array, the optimizing compiler has to deal with the impact
Effects of pointer assignments
• The variables that could possibly be used as pointers are those declared to be pointers, and temporaries that receive a value that is a pointer plus or minus a constant
Pointer can point to
• If there is an assignment s: p := &a, then immediately after s, ‘p’ points only to ‘a’. If ‘a’ is an array, then ‘p’ can point only to ‘a’ after any assignment of the form p := &a ± c, where &a refers to &a[0]
Pointer can point to
• If there is an assignment s: p := q ± c, where c is a non-zero integer and p and q are pointers, then immediately after s, p can point to any array that q could point to before ‘s’
• If there is an assignment s: p := q, then immediately after s, p can point to whatever q could point to before ‘s’
Pointer can point to
• After any other assignment to p, there is no object that p could point to; such an assignment is probably meaningless
• After any assignment to a variable other than p, p points to whatever it did before the assignment
Alias computation
• in[B]: the set of pairs (p, a) such that pointer p could point to variable a at the beginning of B
• trans_B: the transfer function that defines the effect of block B
• It takes a set S of pairs of the form (p, a) and produces another set T
Alias computation
• trans_s is defined for every statement s, and trans_B is the composition of the trans_s for the statements s of B
Rules for computing trans_s
• If s: p := &a, or s: p := &a ± c in the case that ‘a’ is an array, then
• trans_s(S) = (S - {(p,b) | any variable b}) ∪ {(p,a)}
• If s: p := q ± c for pointer q and non-zero integer c, then
• trans_s(S) = (S - {(p,b) | any variable b}) ∪ {(p,b) | (q,b) is in S and b is an array variable}
Alias computation
• If s: p := q, then
• trans_s(S) = (S - {(p,b) | any variable b}) ∪ {(p,b) | (q,b) is in S}
• If s assigns any other expression to pointer p, then
• trans_s(S) = S - {(p,b) | any variable b}
• If s is not an assignment to a pointer, then
• trans_s(S) = S
Data-flow equations
• out[B] = trans_B(in[B])
• in[B] = ∪ out[P], over all predecessors P of B
• trans_B(S) = trans_sk(trans_sk-1( ... (trans_s1(S)) ... )), where s1, s2, ..., sk are the statements of B in order
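The transfer rules and the equations above can be put together in a short iterative solver. Everything about the encoding is an assumption made for this sketch: statements are tagged tuples, ‘a’ is taken to be an array and ‘c’ a scalar, and the block contents and edges are one plausible flow graph (B1 to B2 and B3, both to B4, B4 to B5) chosen for illustration.

```python
ARRAYS = {"a"}   # assumption: a is an array, c is a scalar


def trans_s(stmt, S):
    """Apply one statement's transfer function to a set of (p, a) pairs."""
    kind = stmt[0]
    if kind == "none":                     # not a pointer assignment
        return set(S)
    p = stmt[1]
    kept = {(x, b) for (x, b) in S if x != p}
    if kind == "addr":                     # p := &a  or  p := &a +/- c
        return kept | {(p, stmt[2])}
    if kind == "arith":                    # p := q +/- c, c non-zero
        return kept | {(p, b) for (x, b) in S if x == stmt[2] and b in ARRAYS}
    if kind == "copy":                     # p := q
        return kept | {(p, b) for (x, b) in S if x == stmt[2]}
    return kept                            # any other assignment to p


def solve_aliases(blocks, preds):
    """Iterate in/out to a fixed point; trans_B composes the trans_s."""
    in_ = {b: set() for b in blocks}
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            in_[b] = set().union(*(out[p] for p in preds[b]))
            cur = set(in_[b])
            for stmt in blocks[b]:
                cur = trans_s(stmt, cur)
            if cur != out[b]:
                out[b] = cur
                changed = True
    return in_, out


blocks = {
    "B1": [("addr", "q", "c")],                        # q := &c
    "B2": [("addr", "p", "c"), ("addr", "q", "a")],    # assumed contents
    "B3": [("addr", "p", "a")],                        # assumed contents
    "B4": [("arith", "p", "p")],                       # p := p + 1
    "B5": [("copy", "p", "q")],                        # p := q
}
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"], "B5": ["B4"]}  # assumed edges

in_, out = solve_aliases(blocks, preds)
for b in blocks:
    print(b, sorted(in_[b]), sorted(out[b]))
```

Note that p := p + 1 drops the pair (p, c) in B4 because c is not an array, while p := q in B5 copies every pair of q.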
Example
B1:  q := &c
B4:  p := p+1
B5:  p := q
Example
• B1 has one statement, q := &c, hence out[B1] = trans_B1(Φ) = {(q,c)}
• In B2, p := &c replaces all pairs for p with (p,c)
• The assignment to q in B2 replaces all pairs for q with (q,a)
• out[B2] = trans_B2({(q,c)}) = {(p,c), (q,a)}
Example
Block   in[ ]                           out[ ]                          trans[ ]
B1      Φ                               {(q,c)}                         {(q,c)}
B2      {(q,c)}                         {(p,c), (q,a)}                  {(p,c), (q,a)}
B3      {(q,c)}                         {(p,a), (q,c)}                  {(p,a), (q,c)}
B4      {(p,a), (q,c), (p,c), (q,a)}    {(p,a), (q,c), (q,a)}           {(p,a), (q,c), (q,a)}
B5      {(p,a), (q,c), (q,a)}           {(p,a), (q,c), (p,c), (q,a)}    {(p,a), (q,c), (p,c), (q,a)}
Example – II pass
Block   in[ ]                           out[ ]                          trans[ ]
B1      Φ                               {(p,a), (q,c)}                  {(p,a), (q,c)}
B2      {(p,a), (q,c)}                  {(p,c), (q,a)}                  {(p,c), (q,a)}
B3      {(p,a), (q,c)}                  {(p,a), (q,c)}                  {(p,a), (q,c)}
B4      {(p,a), (q,c), (p,c), (q,a)}    {(p,a), (q,c), (q,a)}           {(p,a), (q,c), (q,a)}
B5      {(p,a), (q,c), (q,a)}           {(p,a), (q,c), (p,c), (q,a)}    {(p,a), (q,c), (p,c), (q,a)}
Usage
• For live variable analysis
• Dead variable analysis
• Reaching definitions
Summary
• Iterative data flow equations for reaching definitions, available
expression and live variable analysis
• Algorithm and examples were discussed
• Algorithms for common sub-expression elimination and copy propagation
• Unreachable code identified and eliminated
• Loop-invariant computations identified and moved by code motion
• Loop optimizations
• Dealing with aliases and its impact