
Unit - 5

Code generation can be considered the final phase of compilation.

After code generation, an optimization process can be applied to the code, but that can be seen as part of the code generation phase itself. The code generated by the compiler is object code in some lower-level programming language, for example, assembly language. We have seen that source code written in a higher-level language is transformed into a lower-level language, resulting in lower-level object code, which should have the following minimum properties:

• It should carry the exact meaning of the source code.

• It should be efficient in terms of CPU usage and memory management.

DAG representation for basic blocks


A DAG for a basic block is a directed acyclic graph with the following labels on its nodes:

1. The leaves of the graph are labeled by unique identifiers, which can be variable names or constants.
2. Interior nodes of the graph are labeled by operator symbols.
3. Nodes may also carry a sequence of identifiers as additional labels, recording the names that hold the computed value.

o DAGs are a data structure used to implement transformations on basic blocks.
o A DAG provides a good way to determine common sub-expressions.
o It gives a pictorial representation of how the value computed by a statement is used in subsequent statements.

Algorithm for construction of DAG


Input: a basic block.
Output: a DAG with the following information:

o Each node carries a label. For leaves, the label is an identifier.
o Each node carries a list of attached identifiers that hold the computed value.

The statements of the block have one of three forms:

1. Case (i) x := y OP z
2. Case (ii) x := OP y
3. Case (iii) x := y

Method:

Step 1:

If node(y) is undefined, create node(y). For case (i), if node(z) is undefined, create node(z).

Step 2:

For case (i), find or create a node n labeled OP whose left child is node(y) and right child is node(z).

For case (ii), find or create a node n labeled OP with the single child node(y).

For case (iii), let n be node(y).

Step 3:

Delete x from the list of attached identifiers of node(x). Append x to the list of attached identifiers of the node n found in Step 2, and finally set node(x) to n.
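
The construction can be sketched in code. The following C program is a minimal illustration of case (i) only: operands are treated as leaves looked up by name, and an operator node is reused when one with the same operator and children already exists, which is exactly how common sub-expressions are detected. The fixed array sizes and linear searches are simplifications for illustration, not part of the algorithm itself.

    #include <stdio.h>
    #include <string.h>

    #define MAXNODES 64
    #define NAMELEN  8

    struct dagnode {
        char op[NAMELEN];          /* operator symbol, "" for a leaf     */
        char leaf[NAMELEN];        /* identifier or constant, for leaves */
        int  left, right;          /* child indices, -1 when absent      */
        char ids[8][NAMELEN];      /* attached identifiers               */
        int  nids;
    };

    static struct dagnode dag[MAXNODES];
    static int nnodes = 0;

    /* Return an existing leaf for this name, or create one. */
    static int leaf_node(const char *name) {
        for (int i = 0; i < nnodes; i++)
            if (dag[i].op[0] == '\0' && strcmp(dag[i].leaf, name) == 0)
                return i;
        struct dagnode *n = &dag[nnodes];
        memset(n, 0, sizeof *n);
        strcpy(n->leaf, name);
        n->left = n->right = -1;
        return nnodes++;
    }

    /* Return an existing node with this operator and children, or create one.
       Reuse is what detects common sub-expressions. */
    static int op_node(const char *op, int l, int r) {
        for (int i = 0; i < nnodes; i++)
            if (strcmp(dag[i].op, op) == 0 && dag[i].left == l && dag[i].right == r)
                return i;
        struct dagnode *n = &dag[nnodes];
        memset(n, 0, sizeof *n);
        strcpy(n->op, op);
        n->left = l;
        n->right = r;
        return nnodes++;
    }

    /* Process a statement x := y OP z (case (i)) and attach x to its node. */
    static void assign(const char *x, const char *y, const char *op, const char *z) {
        int n = op_node(op, leaf_node(y), leaf_node(z));
        strcpy(dag[n].ids[dag[n].nids++], x);
        printf("%s attached to node %d (op %s)\n", x, n, op);
    }

    int main(void) {
        assign("S1", "4", "*", "i");
        assign("S3", "4", "*", "i");   /* reuses the node built for S1 */
        return 0;
    }

Run on the first and third statements of the example below, the sketch attaches both S1 and S3 to the same node, which is the common sub-expression 4 * i.
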
Example:
Consider the following three-address statements:

1. S1 := 4 * i
2. S2 := a[S1]
3. S3 := 4 * i
4. S4 := b[S3]
5. S5 := S2 * S4
6. S6 := prod + S5
7. prod := S6
8. S7 := i + 1
9. i := S7
10. if i <= 20 goto (1)
Stages in DAG Construction: (figure omitted)
Advantages of DAG

DAGs are used for the following purposes-


• To determine the expressions which have been computed more than once (called common sub-expressions).
• To determine the names whose computation has been done outside the block but which are used inside the block.
• To determine the statements of the block whose computed value can be made available outside the block.
• To simplify the list of quadruples by not performing the assignment instructions x := y unless they are necessary, and by eliminating the common sub-expressions.
Sources of optimization
o Machine-independent optimization attempts to improve the intermediate code so as to get better target code. The part of the code that is transformed here does not involve any absolute memory location or any CPU register.
o The process of intermediate code generation introduces many inefficiencies, such as using variables instead of constants, extra copies of variables, and repeated evaluation of expressions. Through code optimization, you can remove such inefficiencies and improve the code.
o Optimization can sometimes change the structure of a program beyond recognition: it may unroll loops, inline functions, or eliminate some programmer-defined variables.

Code optimization can be performed in the following different ways:

(1) Compile Time Evaluation:

(a) z = 5*(45.0/5.0)*r
Evaluate the constant part 5*(45.0/5.0) as 45.0 at compile time.

(b) x = 5.7
    y = x/3.6
Evaluate x/3.6 as 5.7/3.6 at compile time.
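
A small C illustration of both cases; the variables r, x, y, z are hypothetical and only serve to make the example concrete:

    #include <stdio.h>

    int main(void) {
        double r = 2.0;                   /* hypothetical run-time value */
        double x = 5.7, y, z;

        /* As written by the programmer: */
        z = 5 * (45.0 / 5.0) * r;
        y = x / 3.6;

        /* What compile-time evaluation effectively produces:
           5 * (45.0/5.0) folds to 45.0, and since x is known to be 5.7,
           x / 3.6 folds to the constant 5.7 / 3.6. */
        z = 45.0 * r;
        y = 5.7 / 3.6;

        printf("%f %f\n", z, y);
        return 0;
    }
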

(2) Variable Propagation:

Before optimization the code is:

1. c = a * b
2. x = a
3. ...
4. d = x * b + 4

After optimization the code is:

1. c = a * b
2. x = a
3. ...
4. d = a * b + 4

Here, after variable propagation, a*b and x*b are identified as the same common sub-expression.

(3) Dead code elimination:

Before elimination the code is:

1. c = a * b
2. x = b
3. ...
4. d = a * b + 4

After elimination the code is:

1. c = a * b
2. ...
3. d = a * b + 4

Here, x = b is dead code because x is never subsequently used in the program, so we can eliminate this statement.

(4) Code Motion:

o It reduces the evaluation frequency of expressions.
o It brings loop-invariant statements out of the loop.

1. do
2. {
3.    item = 10;
4.    value = value + item;
5. } while (value < 100);

// This code can be further optimized as:

1. item = 10;
2. do
3. {
4.    value = value + item;
5. } while (value < 100);

(5) Induction Variable and Strength Reduction:

o Strength reduction replaces a high-strength (expensive) operator by a low-strength (cheaper) one.
o An induction variable is a variable that is updated in a loop by an assignment of the form i = i + constant.

Before reduction the code is:

1. i = 1;
2. while (i < 10)
3. {
4.    y = i * 4;
5.    i = i + 1;
6. }

After reduction the code is:

1. i = 1;
2. t = 4;
3. while (t < 40)
4. {
5.    y = t;
6.    t = t + 4;
7. }

Loop Optimization
Loop optimization is the most valuable machine-independent optimization because a program's inner loops account for the bulk of its running time.

If we decrease the number of instructions in an inner loop, the running time of a program may be improved even if we increase the amount of code outside that loop.

For loop optimization the following three techniques are important:

1. Code motion
2. Induction-variable elimination
3. Strength reduction

1. Code Motion:
Code motion is used to decrease the amount of code inside a loop. This transformation takes a statement or expression that can be moved outside the loop body without affecting the semantics of the program.


For example:
In the following while statement, the expression limit-2 is loop invariant.

1. while (i <= limit-2)    /* statement does not change limit */

After code motion the result is as follows:

1. a = limit-2;
2. while (i <= a)          /* statement does not change limit or a */

2. Induction-Variable Elimination
Induction-variable elimination is used to eliminate or replace induction variables in an inner loop.

It can reduce the number of additions in a loop. It improves both code space and run-time performance.

In the original figure (not reproduced here), the assignment t4 := 4*j inside the loop is replaced by t4 := t4 - 4. The only problem that arises is that t4 does not have a value when we enter block B2 for the first time, so we place the assignment t4 := 4*j on entry to block B2.
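
Since the figure is not included, the following small C program sketches the same idea on a hypothetical loop: the multiplication t4 = 4*j in the loop body is replaced by initializing t4 once on entry and updating it by 4 on each iteration.

    #include <stdio.h>

    int main(void) {
        int a[41] = {0};           /* hypothetical array indexed by t4 = 4*j */
        int n = 10, j, t4;

        /* Before elimination: t4 := 4*j is recomputed on every iteration. */
        for (j = n; j >= 1; j--) {
            t4 = 4 * j;
            a[t4] += 1;
        }

        /* After elimination: t4 is given its value once on entry (t4 := 4*j)
           and then updated by t4 := t4 - 4 each time j is decremented, so the
           multiplication disappears from the loop body. */
        t4 = 4 * n;
        for (j = n; j >= 1; j--) {
            a[t4] += 1;
            t4 = t4 - 4;
        }

        printf("%d\n", a[4]);      /* both loops update the same elements */
        return 0;
    }
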

3. Reduction in Strength
o Strength reduction is used to replace an expensive operation by a cheaper one on the target machine.
o Addition of a constant is cheaper than a multiplication, so we can replace a multiplication with an addition within the loop.
o Multiplication is cheaper than exponentiation, so we can replace exponentiation with repeated multiplication within the loop.

Example:

1. while (i < 10)
2. {
3.    j = 3*i + 1;
4.    a[j] = a[j] - 2;
5.    i = i + 2;
6. }

After strength reduction the code will be:

1. s = 3*i + 1;
2. while (i < 10)
3. {
4.    j = s;
5.    a[j] = a[j] - 2;
6.    i = i + 2;
7.    s = s + 6;
8. }

In the above code, it is cheaper to compute s = s + 6 than j = 3*i + 1.


Global data flow analysis
o To optimize the code efficiently, the compiler collects all the relevant information about the program and distributes this information to each block of the flow graph. This process is known as global data-flow analysis.
o Certain optimizations can only be achieved by examining the entire program; they cannot be achieved by examining just a portion of the program.
o For this kind of optimization, use-definition (ud) chaining is one particular problem.
o Here, at a use of a variable, we try to find out which definitions of that variable can reach that statement.

Based on local information alone a compiler can already perform some optimizations. For example, consider the following code:

1. x = a + b;
2. x = 6 * 3;
o In this code, the first assignment to x is useless: the value computed for x is never used before x is reassigned.
o At compile time the expression 6*3 will be computed, simplifying the second assignment statement to x = 18.

Some optimizations need more global information. For example, consider the following code:

1. a = 1;
2. b = 2;
3. c = 3;
4. if (....) x = a + 5;
5. else x = b + 4;
6. c = x + 1;

In this code, the assignment c = 3 at line 3 is useless because c is reassigned at line 6 before being used, and since x is 6 on both branches of the if, the expression x + 1 can be simplified to 7.

But it is less obvious how a compiler can discover these facts by looking only at one or two consecutive statements. A more global analysis is required so that the compiler knows the following things at each point in the program:

o Which variables are guaranteed to have constant values
o Which variables will be used before being redefined

Data-flow analysis is used to discover this kind of property. Data-flow analysis is performed on the program's control flow graph (CFG).

The control flow graph of a program is used to determine those parts of a program to which a particular value assigned to a variable might propagate.
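
As an illustration of how such an analysis is computed, the following C sketch runs the standard iterative reaching-definitions equations OUT[b] = GEN[b] ∪ (IN[b] − KILL[b]) and IN[b] = ∪ OUT[p] over predecessors p, on a small hypothetical four-block diamond-shaped flow graph. The GEN/KILL bit vectors are made-up values used purely for illustration.

    #include <stdio.h>

    #define NBLOCKS 4

    /* Hypothetical flow graph: B0 -> B1, B0 -> B2, B1 -> B3, B2 -> B3.
       gen[] and kills[] are bit vectors of definitions (one bit per definition). */
    static unsigned gen[NBLOCKS]   = { 0x03, 0x04, 0x08, 0x10 };
    static unsigned kills[NBLOCKS] = { 0x0C, 0x01, 0x02, 0x00 };
    static int pred[NBLOCKS][NBLOCKS] = {
        {0, 0, 0, 0},   /* B0 has no predecessors          */
        {1, 0, 0, 0},   /* B1's predecessor is B0          */
        {1, 0, 0, 0},   /* B2's predecessor is B0          */
        {0, 1, 1, 0},   /* B3's predecessors are B1 and B2 */
    };

    int main(void) {
        unsigned in[NBLOCKS] = {0}, out[NBLOCKS];
        int changed = 1;

        for (int b = 0; b < NBLOCKS; b++)      /* initialize OUT[b] = GEN[b] */
            out[b] = gen[b];

        while (changed) {                      /* iterate to a fixed point   */
            changed = 0;
            for (int b = 0; b < NBLOCKS; b++) {
                unsigned newin = 0;
                for (int p = 0; p < NBLOCKS; p++)
                    if (pred[b][p]) newin |= out[p];   /* IN[b] = union of preds' OUT */
                in[b] = newin;
                unsigned newout = gen[b] | (in[b] & ~kills[b]);
                if (newout != out[b]) { out[b] = newout; changed = 1; }
            }
        }

        for (int b = 0; b < NBLOCKS; b++)
            printf("B%d: IN = %#04x  OUT = %#04x\n", b, in[b], out[b]);
        return 0;
    }
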

Peephole Optimization
Peephole optimization is a type of code optimization performed on a small part of the code: a very small window of instructions in a segment of code.

The small set of instructions or small part of code on which peephole optimization is performed is known as the peephole or window.

It works on the principle of replacement: a part of the code is replaced by shorter and faster code without changing the output.

Peephole optimization is typically a machine-dependent optimization.
Objectives of Peephole Optimization:
The objectives of peephole optimization are:
1. To improve performance
2. To reduce memory footprint
3. To reduce code size
Peephole Optimization Techniques:
1. Redundant load and store elimination:
In this technique, redundant loads and stores are eliminated.
Initial code:
y = x + 5;
i = y;
z = i;
w = z * 3;

Optimized code:
y = x + 5;
i = y;
w = y * 3;
2. Constant folding:
Expressions whose values can be computed at compile time are replaced by their values.
Initial code:
x = 2 * 3;

Optimized code:
x = 6;
3. Strength Reduction:
Operators that consume more execution time are replaced by operators that consume less execution time.
Initial code:
y = x * 2;

Optimized code:
y = x + x; or y = x << 1;

Initial code:
y = x / 2;

Optimized code:
y = x >> 1;
4. Null sequences:
Useless operations are deleted.
5. Combine operations:
Several operations are replaced by a single equivalent operation.
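
A small C illustration of these last two techniques; the variables are hypothetical:

    #include <stdio.h>

    int main(void) {
        int x = 7, y = 3;

        /* Null sequences: operations with no effect can simply be deleted. */
        x = x + 0;                 /* no effect - the peephole removes it */
        y = y * 1;                 /* no effect - the peephole removes it */

        /* Combine operations: two shifts by constants merge into one. */
        int z = (x << 1) << 2;     /* can be combined into x << 3 */

        printf("%d %d %d\n", x, y, z);
        return 0;
    }
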

Design Issues
In the code generation phase, various issues arise:

1. Input to the code generator


2. Target program
3. Memory management
4. Instruction selection
5. Register allocation
6. Evaluation order

1. Input to the code generator

o The input to the code generator consists of the intermediate representation of the source program together with information from the symbol table. This intermediate representation is produced by the front end.
o There are several choices for the intermediate representation (a simple sketch follows this list):
   a) Postfix notation
   b) Syntax tree
   c) Three-address code
o We assume the front end produces a low-level intermediate representation, i.e. values of names in it can be directly manipulated by the machine instructions.
o The code generation phase requires complete and error-free intermediate code as its input.
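
For instance, for a hypothetical source statement a = b + c * d, the three forms would look roughly as follows:

Postfix notation:    a b c d * + =
Syntax tree:         =(a, +(b, *(c, d)))   (written here in nested prefix form)
Three-address code:  t1 := c * d
                     t2 := b + t1
                     a  := t2
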

2. Target program:
The target program is the output of the code generator. The
output can be:

a) Assembly language: Producing assembly language as output makes the process of code generation somewhat easier.

b) Relocatable machine language: Producing relocatable machine code as output allows subprograms to be compiled separately.

c) Absolute machine language: It can be placed in a fixed
location in memory and can be executed immediately.

3. Memory management
o During the code generation process, the symbol table entries have to be mapped to actual data addresses, and labels have to be mapped to instruction addresses.
o Mapping names in the source program to addresses of data is done cooperatively by the front end and the code generator.
o Local variables are allocated on the stack in the activation record, while global variables are kept in the static area.

4. Instruction selection:
o The instruction set of the target machine should be complete and uniform.
o When considering the efficiency of the target machine, instruction speeds and machine idioms are important factors.
o The quality of the generated code can be determined by its speed and size.

Example:
The three-address code is:

1. a := b + c
2. d := a + e

Inefficient assembly code is:

1. MOV b, R0     R0 ← b
2. ADD c, R0     R0 ← c + R0
3. MOV R0, a     a  ← R0
4. MOV a, R0     R0 ← a
5. ADD e, R0     R0 ← e + R0
6. MOV R0, d     d  ← R0
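
The inefficiency is the pair of instructions 3 and 4: after MOV R0, a the value of a is still in R0, so the reload MOV a, R0 is redundant. A code generator that tracks register contents could instead emit, for example:

1. MOV b, R0     R0 ← b
2. ADD c, R0     R0 ← c + R0
3. MOV R0, a     a  ← R0
4. ADD e, R0     R0 ← e + R0
5. MOV R0, d     d  ← R0

saving one instruction.
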

5. Register allocation
Registers can be accessed faster than memory. Instructions involving register operands are shorter and faster than those involving memory operands.

The following sub-problems arise when we use registers:

Register allocation: In register allocation, we select the set of variables that will reside in registers at each point in the program.

Register assignment: In register assignment, we pick the specific register in which each such variable will reside.

Certain machines require even-odd register pairs for some operands and results.

For example:
Consider a division instruction of the form:

1. D x, y

where

x is the dividend, held in the even register of an even/odd register pair,
y is the divisor.

The even register is used to hold the remainder.
The odd register is used to hold the quotient.

6. Evaluation order
The efficiency of the target code can be affected by the order in which the computations are performed. Some computation orders require fewer registers to hold intermediate results than others.
Code Generator
The code generator produces target code for a sequence of three-address statements. It uses registers to store the operands of the three-address statements.

Example:
Consider the three-address statement x := y + z. It can be translated into the following sequence of code:
MOV y, R0
ADD z, R0
MOV R0, x

Register and Address Descriptors:

o A register descriptor keeps track of what is currently in each register. Initially, the register descriptors show that all registers are empty.
o An address descriptor stores the location (register, memory address, or both) where the current value of a name can be found at run time.

A code-generation algorithm:
The algorithm takes a sequence of three-address statements as input. For each three-address statement of the form x := y op z, perform the following actions:

1. Invoke a function getreg to find the location L where the result of the computation y op z should be stored.
2. Consult the address descriptor for y to determine y', the current location of y. If the value of y is currently in both memory and a register, prefer the register as y'. If the value of y is not already in L, generate the instruction MOV y', L to place a copy of y in L.
3. Generate the instruction OP z', L, where z' is the current location of z; if z is in both a register and memory, prefer the register. Update the address descriptor of x to indicate that x is in location L. If L is a register, update its register descriptor to indicate that it contains x, and remove x from all other descriptors.
4. If the current values of y or z are in registers, have no next uses, and are not live on exit from the block, then alter the register descriptors to indicate that, after execution of x := y op z, those registers no longer contain y or z.
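
The following C program is a much-simplified sketch of the idea behind steps 1-3, assuming a single register R0 and representing the register descriptor as one string naming whose value R0 currently holds; getreg, the address descriptors, and the liveness handling of step 4 are omitted.

    #include <stdio.h>
    #include <string.h>

    static char r0_holds[16] = "";   /* register descriptor for R0 */

    /* Emit code for x := y OP z using only R0. */
    static void gen(const char *x, const char *y, const char *op, const char *z) {
        if (strcmp(r0_holds, y) != 0)      /* load y only if R0 does not hold it */
            printf("MOV %s, R0\n", y);
        printf("%s %s, R0\n", op, z);      /* R0 := R0 op z          */
        printf("MOV R0, %s\n", x);         /* store the result in x  */
        strcpy(r0_holds, x);               /* R0 now holds x's value */
    }

    int main(void) {
        gen("a", "b", "ADD", "c");   /* a := b + c                  */
        gen("d", "a", "ADD", "e");   /* d := a + e, reuses a in R0  */
        return 0;
    }

Run on the earlier example a := b + c; d := a + e, this sketch emits the five-instruction sequence shown above rather than the six-instruction one, because it knows that a is already in R0.
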

Generating Code for Assignment Statements:


The assignment statement d := (a-b) + (a-c) + (a-c) can be translated into the following sequence of three-address code:


1. t := a - b
2. u := a - c
3. v := t + u
4. d := v + u

Code sequence for the example is as follows:

Statement     Code Generated      Register descriptor         Address descriptor
                                  Registers empty
t := a - b    MOV a, R0           R0 contains t               t in R0
              SUB b, R0
u := a - c    MOV a, R1           R0 contains t               t in R0
              SUB c, R1           R1 contains u               u in R1
v := t + u    ADD R1, R0          R0 contains v               u in R1
                                  R1 contains u               v in R0
d := v + u    ADD R1, R0          R0 contains d               d in R0
              MOV R0, d                                       d in R0 and memory
GENERATING CODE FROM DAGs
The advantage of generating code for a basic block from its DAG representation is that from a DAG we can see more easily than from a linear sequence of three-address statements or quadruples how to rearrange the order of the final computation sequence.

Rearranging the order

The order in which computations are done can affect the cost of the resulting object code. For example, consider the following basic block:
t1 : = a + b
t2 : = c + d
t3 : = e - t2
t4 : = t1 - t3

Generated code sequence for basic block:


MOV a , R0
ADD b , R0
MOV c , R1
ADD d , R1
MOV R0 , t1
MOV e , R0
SUB R1 , R0
MOV t1 , R1
SUB R0 , R1
MOV R1 , t4

Rearranged basic block:


Now t1 occurs immediately before t4.
t2 : = c + d
t3 : = e - t2
t1 : = a + b
t4 : = t1 - t3
Revised code sequence:
MOV c , R0
ADD d , R0
MOV e , R1
SUB R0 , R1
MOV a , R0
ADD b , R0
SUB R1 , R0
MOV R0 , t4
In this order, two instructions MOV R0 , t1 and MOV t1 , R1 have been saved.

A Heuristic Ordering for DAGs

The heuristic ordering algorithm attempts to make the evaluation of a node immediately follow the evaluation of its leftmost argument. The algorithm shown below produces the ordering in reverse.

Algorithm:
1) while unlisted interior nodes remain do begin
2)    select an unlisted node n, all of whose parents have been listed;
3)    list n;
4)    while the leftmost child m of n has no unlisted parents and is not a leaf do
      begin
5)       list m;
6)       n := m
      end
   end

Example: Consider the DAG shown below (the figure is not reproduced here).


Initially, the only node with no unlisted parents is 1 so set n=1 at line (2)
and list 1 at line (3). Now, the left argument of 1, which is 2, has its
parents listed, so we list 2 and set n=2 at line (6). Now, at line (4) we find
the leftmost child of 2, which is 6, has an unlisted parent 5. Thus we
select a new n at line (2), and node 3 is the only candidate. We list 3 and
proceed down its left chain, listing 4, 5 and 6. This leaves only 8 among
the interior nodes so we list that. The resulting list is 1234568 and the
order of evaluation is 8654321.
Code sequence:
t8 : = d + e
t6 : = a + b
t5 : = t6 - c
t4 : = t5 * t8
t3 : = t4 - e
t2 : = t6 + t4
t1 : = t2 * t3
This yields optimal code for the DAG on this machine, whatever the number of registers.
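
The listing can also be checked mechanically. The following C sketch encodes the DAG of this example directly from its three-address code (leaf operands are represented by 0) and runs the node-listing heuristic; it reproduces the listing 1 2 3 4 5 6 8 and the evaluation order 8 6 5 4 3 2 1.

    #include <stdio.h>

    #define N 9   /* nodes are numbered 1..8 (7 is unused); 0 stands for a leaf */

    /* DAG encoded from the three-address code of the example:
       1:(*,2,3)  2:(+,6,4)  3:(-,4,e)  4:(*,5,8)  5:(-,6,c)  6:(+,a,b)  8:(+,d,e) */
    static int nkids[N]      = {0, 2, 2, 2, 2, 2, 2, 0, 2};
    static int kids[N][2]    = {{0,0},{2,3},{6,4},{4,0},{5,8},{6,0},{0,0},{0,0},{0,0}};
    static int nparents[N]   = {0, 0, 1, 1, 2, 1, 2, 0, 1};
    static int parents[N][2] = {{0,0},{0,0},{1,0},{1,0},{2,3},{4,0},{2,5},{0,0},{4,0}};
    static int interior[N]   = {0, 1, 1, 1, 1, 1, 1, 0, 1};
    static int listed[N];

    static int parents_listed(int n) {
        for (int i = 0; i < nparents[n]; i++)
            if (!listed[parents[n][i]]) return 0;
        return 1;
    }

    int main(void) {
        int order[N], len = 0, remaining = 7;   /* seven interior nodes */

        while (remaining > 0) {
            int n = 0;
            /* select an unlisted interior node all of whose parents are listed */
            for (int v = 1; v < N; v++)
                if (interior[v] && !listed[v] && parents_listed(v)) { n = v; break; }
            if (n == 0) break;                   /* safety: should not happen here */
            listed[n] = 1; order[len++] = n; remaining--;
            /* follow the chain of leftmost children while it stays legal */
            while (nkids[n] > 0) {
                int m = kids[n][0];
                if (!interior[m] || listed[m] || !parents_listed(m)) break;
                listed[m] = 1; order[len++] = m; remaining--;
                n = m;
            }
        }

        printf("listing:    ");
        for (int i = 0; i < len; i++) printf("%d ", order[i]);
        printf("\nevaluation: ");
        for (int i = len - 1; i >= 0; i--) printf("%d ", order[i]);
        printf("\n");
        return 0;
    }
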
