CD Unit - 4
Storage Organization
When the target program executes, it runs in its own logical address space, in which each program value has a location. The management and organization of this logical address space are shared among the compiler, the operating system, and the target machine. The operating system maps each logical address into a physical address; the physical addresses are usually spread throughout memory.
Run-time storage comes in blocks, where a byte is the smallest unit of addressable memory. Four bytes typically form a machine word. Multibyte objects are stored in consecutive bytes and are addressed by the address of their first byte.
Run-time storage can be subdivided to hold the different components of an
executing program:
1. Generated executable code
2. Static data objects
3. Dynamic data objects - heap
4. Automatic data objects - stack
Static Allocation:
Names are bound to storage at compile time only, so every time a procedure is invoked its names are bound to the same storage locations. Consequently, the values of local names are retained across activations of a procedure. Here the compiler can decide where the activation records go with respect to the target code and can also fill in the addresses in the target code for the data it operates on.
Stack Allocation:
Storage is organized as a stack, and activation records are pushed and popped as activations begin and end, respectively. Locals are contained in activation records, so they are bound to fresh storage in each activation. Stack allocation supports recursion.
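For instance, a minimal C sketch (the function fact is only an illustration) shows why stack allocation is needed for recursion: each activation gets fresh storage for its parameter n.

#include <stdio.h>

/* Each call to fact pushes a new activation record, so the
   parameter n gets fresh storage in every activation. */
int fact(int n)
{
    if (n <= 1)
        return 1;
    return n * fact(n - 1);  /* recursive call pushes a new record */
}

int main()
{
    printf("%d\n", fact(5)); /* prints 120 */
    return 0;
}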
Basic terminology:
Call by Value: In call by value, the calling procedure passes the r-value of each actual parameter, and the compiler places it into the called procedure's activation record. Formal parameters hold the values passed by the calling procedure, so any changes made to the formal parameters do not affect the actual parameters.
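A minimal C sketch of call by value (swap is only an illustrative function):

#include <stdio.h>

/* swap receives copies (r-values), so the caller's variables
   are left unchanged. */
void swap(int x, int y)
{
    int t = x;
    x = y;
    y = t;  /* changes only the copies in swap's activation record */
}

int main()
{
    int a = 1, b = 2;
    swap(a, b);
    printf("%d %d\n", a, b); /* still prints: 1 2 */
    return 0;
}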
Call by Reference: In call by reference, the formal and actual parameters refer to the same memory location. The l-value of each actual parameter is copied into the activation record of the called function; thus, the called function has the address of the actual parameter. If an actual parameter does not have an l-value (e.g., i + 3), it is evaluated in a new temporary location and the address of that location is passed. Any changes made to the formal parameter are reflected in the actual parameter (because the changes are made at that address).
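The same swap written with pointers illustrates call-by-reference behavior in C:

#include <stdio.h>

/* The l-values (addresses) of a and b are passed, so changes made
   through the formal parameters affect the actual parameters. */
void swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

int main()
{
    int a = 1, b = 2;
    swap(&a, &b);
    printf("%d %d\n", a, b); /* prints: 2 1 */
    return 0;
}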
Call by Copy Restore: In call by copy-restore, the compiler copies the values into the formal parameters when the procedure is called and copies them back into the actual parameters when control returns to the calling function. The r-values are passed, and on return the r-values of the formals are copied into the l-values of the actuals.
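C has no built-in copy-restore, but its effect can be sketched by copying the value in on entry and copying it back on return (update is a hypothetical illustration):

#include <stdio.h>

int a = 10;

void update(int *actual)
{
    int formal = *actual; /* copy in: the r-value is passed */
    formal = formal + 5;  /* work only on the local copy */
    *actual = formal;     /* copy back into the actual on return */
}

int main()
{
    update(&a);
    printf("%d\n", a);    /* prints: 15 */
    return 0;
}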
Call by Name: In call by name, the actual parameters are substituted for the formals in all the places the formals occur in the procedure. It is also referred to as lazy evaluation because a parameter is evaluated only when it is needed.
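C has no call by name either, but macro expansion gives a rough feel for the substitution (TWICE is an illustrative macro, not a standard one):

#include <stdio.h>

/* The argument text i + 3 is substituted wherever the formal x
   occurs and is re-evaluated at each use. */
#define TWICE(x) ((x) + (x))

int main()
{
    int i = 2;
    /* expands to ((i + 3) + (i + 3)) */
    printf("%d\n", TWICE(i + 3)); /* prints: 10 */
    return 0;
}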
Activation Record
An activation record typically contains fields for the return value, parameter list, control links, access links, saved machine status, local data, and temporaries.
Temporaries:
The temporary values, such as those arising in the evaluation of expressions, are
stored in the field for temporaries.
Local data:
The field for local data holds data that is local to an execution of a procedure.
Saved Machine Status:
The field for saved machine status holds information about the state of the machine just before the procedure is called. This information includes the value of the program counter and the machine registers that have to be restored when control returns from the procedure.
Access Link:
#include <stdio.h>

int g = 12;

void Geeks()
{
    printf("%d", g);
}

int main()
{
    Geeks();
    return 0;
}
In this example, when Geeks() is called from main(), its task is to print g, but g is not defined within the local scope of Geeks(). In this case, Geeks() uses the access link to reach g in the global scope and then prints its value (g = 12).
The chain of access links traces the static structure (the nesting of scopes) of the program.
Now, let’s take another example to understand the concept of access link in detail –
#include <stdio.h>

/* The definition of geeks() and the start of main() are missing in the
   notes; the reconstruction below is consistent with the stated
   output of 300. */
int geeks(int c) {
    return c + 100;
}

int geek1(int b) {
    return geeks(2 * b);
}

int main() {
    int a = 100; /* assumed initial value, so that geek1(a) yields 300 */
    (void) printf("The answer is %d\n", geek1(a));
    return 0;
}
No errors are detected while compiling the program, and the correct answer, 300, is displayed. Now, let's discuss the nesting paths. Each nested procedure's activation record (AR) carries an access link that lets the procedure reach the AR of the most recent activation of its immediately enclosing procedure. So, in this example, the access link for geeks and the access link for geek1 would each point to the AR of the activation of main.
Each activation record gets a pointer called the access link that facilitates a direct implementation of the normal static scope rule.
Control Links:
A control link refers to the activation record of the caller; because it records the dynamic calling sequence, it is also called a dynamic link. When a function calls another function, the control link in the callee's record points to the activation record of the caller; thus a record contains a control link pointing to the previous record on the stack. The chain of control links traces the dynamic execution of the program.
Example –
#include <stdio.h>

void geeks(int x)
{
    printf("value of x is: %d", x);
}

int main()
{
    geeks(10);
    return 0;
}
Another example:

#include <stdio.h>

int geeks();

int main() {
    int x, y;
    // Calling a function
    geeks();
    return 0;
}

int geeks() {
    /* body missing in the notes; conceptually it reaches x and y
       in main() through the access link */
    return 0;
}

When the function geeks() is called, it uses the access link to reach x and y (statically scoped) in its calling function main().
Parameter List:
The field for the parameter list is used by the calling procedure to supply parameters to the called procedure. We show space for parameters in the activation record, but in practice, parameters are often passed in machine registers for greater efficiency.
Return value:
The field for the return value is used by the called procedure to return a value to
the calling procedure. Again, in practice, this value is often returned in a register
for greater efficiency.
Procedure Calls
Calling sequence:
1. S → call id(Elist)
2. Elist → Elist, E
3. Elist → E
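As an illustration, for a hypothetical source call p(a + b, c), this scheme emits code to evaluate each argument, one param instruction per argument, and finally the call (p, a, b, c are assumed names):

t1 := a + b
param t1
param c
call p, 2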
Displays
An access link is a pointer in each activation record that yields a direct implementation of lexical scope for nested procedures; in other words, access links are used to implement lexically scoped languages. An access link may be required to locate data needed by the called procedure.
An improved scheme for handling static links defined at various lexical levels is to use a data structure called a display. A display is an array of pointers to activation records: Display[0] contains a pointer to the activation record of the most recent activation of a procedure defined at lexical level 0.
The number of elements in the display array is given by the maximum level of
nesting in the input source program. In the display scheme of accessing the non-
local variables defined in the enclosing procedures, each procedure on activation
stores a pointer to its activation record in the display array at its lexical level.
It saves the previous value at that location in the display array and restores it when the procedure exits. The advantage of the display scheme is that the activation record of any enclosing procedure at lexical level n can be fetched directly using Display[n], as opposed to traversing the access links as in the previous scheme.
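A minimal C sketch of how a display might be maintained (all names here are hypothetical, and MAX_NESTING is an assumed bound on lexical nesting):

#define MAX_NESTING 8

void *display[MAX_NESTING]; /* display[L] points to the AR of the most
                               recent activation at lexical level L */

void on_entry(int level, void *ar, void **saved)
{
    *saved = display[level]; /* save the previous pointer at this level */
    display[level] = ar;     /* install the current activation record */
}

void on_exit(int level, void *saved)
{
    display[level] = saved;  /* restore the saved pointer on exit */
}

A non-local defined at lexical level n is then reached directly through display[n].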
There are two types of scope rules for non-local names, as follows:
Lexical Scope
Lexical scope is also known as static scope. Under this rule, the scope of a name is determined from the text of the program alone. PASCAL, C, and ADA are languages that use the static scope rule; these languages are also known as block-structured languages.
Dynamic Scope
Dynamic scope rules are used for non-block-structured languages. Under this rule, a use of a non-local variable refers to the non-local data declared in the most recently called and still active procedure. There are two methods to implement non-local access under dynamic scoping:
Deep Access − The basic idea is to keep a stack of active variables: use control links instead of access links and, to find a variable, search the stack from top to bottom looking for the most recent activation record that contains space for the desired variable. Since the search goes "deep" into the stack, the method is called deep access. With this method, a symbol table must be available at run time.
Shallow Access − The idea is to keep central storage and allot one slot for every variable name. If names are not created at run time, the storage layout can be fixed at compile time.
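A minimal C sketch of shallow access, assuming one hypothetical central slot per variable name:

int slot_x; /* central storage slot for a variable named x */

void proc(void)
{
    int saved = slot_x; /* save the caller's binding of x on entry */
    slot_x = 42;        /* install this activation's x */
    /* ... body: every use of x reads or writes slot_x directly ... */
    slot_x = saved;     /* restore the old binding on exit */
}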
Principal Sources of Optimization
A transformation of a program is called local if it can be performed by looking only at the statements in a basic block; otherwise, it is called global. Many transformations can be performed at both the local and global levels. Local transformations are usually performed first.
Function-Preserving Transformations
A compiler can improve a program without changing the function it computes. Common examples of such function-preserving transformations are:
Common subexpression elimination
Copy propagation
Dead-code elimination
Constant folding
Frequently, a program will include several calculations of the same value. Some of these duplicate calculations cannot be avoided by the programmer because they lie below the level of detail accessible within the source language.
Common Subexpression Elimination:
An occurrence of an expression E is called a common subexpression if E was previously computed and the values of the variables in E have not changed since the previous computation. We can avoid recomputing the expression if we can use the previously computed value.
For example:
t1:=4*i
t2:=a[t1]
t3:=4*j
t4:=4*i
t5:=n
t6:=b[t4]+t5
The above code can be optimized using the common sub-expression elimination as
t1:=4*i
t2:=a[t1]
t3:=4*j
t5:=n
t6:=b[t1]+t5
The common subexpression t4 := 4*i is eliminated, since its value is already available in t1; accordingly, b[t4] becomes b[t1].
Copy Propagation:
Assignments of the form f := g are called copy statements, or copies for short. The idea behind copy propagation is to use g for f wherever possible after the copy statement f := g, that is, to use one variable in place of another. This may not appear to be an improvement, but it gives us an opportunity to eliminate the copy.
For example:
x = Pi;
A = x * r * r;
After copy propagation, the second statement becomes A = Pi * r * r, and the assignment to x can then be eliminated as dead code.
Dead-Code Elimination:
Dead (or useless) code consists of statements that compute values that never get used. While the programmer is unlikely to introduce dead code intentionally, it may appear as the result of previous transformations.
Example:
i = 0;
if (i == 1)
    a = b + 5;
Here, the 'if' statement is dead code because its condition is never satisfied, so the assignment a = b + 5 can be removed.
Constant Folding:
Deducing at compile time that the value of an expression is a constant, and using the constant instead, is known as constant folding. For example, the expression 2 * 3.14 can be replaced by the constant 6.28 at compile time.
Loop Optimizations:
Programs tend to spend the bulk of their time in loops, especially inner loops. The running time of a program may be improved if we decrease the number of instructions in an inner loop, even if we increase the amount of code outside that loop.
Code Motion:
An important transformation that decreases the amount of code in a loop is code motion. This transformation takes an expression that yields the same result independent of the number of times the loop is executed (a loop-invariant computation) and evaluates it before the loop. (The phrase "before the loop" assumes the existence of an entry for the loop.) For example, evaluation of limit - 2 is loop-invariant in while (i <= limit - 2), and code motion transforms the loop into:
t = limit - 2;
while (i <= t) /* statement does not change limit or t */
Induction Variables:
Loops are usually processed inside out. For example, consider a loop around a block B3 in which j and t4 are updated on every iteration. The values of j and t4 remain in lockstep: every time the value of j decreases by 1, the value of t4 decreases by 4, because t4 holds 4*j. Such variables are called induction variables. When there are two or more induction variables in a loop, it may be possible to get rid of all but one by the process of induction-variable elimination; for the inner loop around B3, this is possible for t4 in B3 and j in B4.
Basic Blocks:
A basic block is a straight-line code sequence with no branches in except to its entry and no branches out except at its exit. A basic block is thus a set of statements that always execute one after another, in sequence.
The first task is to partition a sequence of three-address code into basic blocks. A new basic block begins with a leader instruction, and instructions are added until a jump or a label is met; in the absence of a jump, control moves sequentially from one instruction to the next. The idea is standardized in the algorithm below:
Process: Determine which instructions in the intermediate code are leaders.
The following rules are used for finding a leader:
1. The first instruction of the intermediate code is a leader.
2. The target of any conditional or unconditional jump is a leader.
3. The instruction immediately following a jump is a leader.
The basic block of each leader contains the leader itself and all instructions up to, but excluding, the next leader.
Example 1:
t1 := a*a
t2 := a*b
t3 := 2*t2
t4 := t1+t3
t5 := b*b
t6 := t4+t5
This sequence contains no jumps or labels, so it forms a single basic block.
A three address statement x:= y+z is said to define x and to use y and z. A name in
a basic block is said to be live at a given point if its value is used after that point in
the program, perhaps in another basic block.
Example 2:
Intermediate code to set a 10*10 matrix to an identity matrix:
Steps (3)-(6) are used to make elements 0, and step (14) is used to make an element 1. These steps are executed repeatedly via goto statements.
B1) Statement 1
B2) Statement 2
B3) Statements 3-9
B5) Statement 12
Structure-Preserving Transformations:
The structure-preserving transformations on basic blocks include:
1. Dead-code elimination
2. Common subexpression elimination
3. Renaming of temporary variables
4. Interchange of two independent adjacent statements
1. Dead-Code Elimination:
Dead code is that part of the code which never executes during program execution, so for optimization it is eliminated. Eliminating dead code reduces the program's size and the compiler's work, since the dead code need not be translated.
Example: if a name x is never used after the statement x = y + z in a block, the statement can be safely removed without changing the value of the block.
2. Common Subexpression Elimination:
In this technique, subexpressions that occur more than once are calculated only once and reused when needed. A DAG (Directed Acyclic Graph) is used to detect and eliminate common subexpressions. For example, in
a := b + c
b := a - d
c := b + c
d := a - d
the expression a - d in the fourth statement is a common subexpression (a and d are unchanged since the second statement), so the statement can be replaced by d := b.
3. Renaming of Temporary Variables:
A statement t := b + c, where t is a temporary, can be changed to u := b + c, where u is a new temporary, with all uses of this instance of t changed to u, without changing the value of the basic block.
4. Interchange of Two Independent Adjacent Statements:
If a block has two adjacent statements that are independent, they can be interchanged without affecting the value of the basic block.
Example:
t1 = a + b
t2 = c + d
These two independent statements of a block can be interchanged without affecting
the value of the block.
Algebraic Transformation:
1. Constant Folding
2. Copy Propagation
3. Strength Reduction
1. Constant Folding:
Evaluate constant subexpressions at compile time so the compiled program does not need to compute them at run time.
Example:
x = 2 * 3 + y ⇒ x = 6 + y (Optimized code)
2. Copy Propagation:
Variable Propagation:
x = y; z = x + 2 ⇒ z = y + 2
Constant Propagation:
x = 3; z = x + a ⇒ z = 3 + a
3. Strength Reduction:
Replace an expensive operation with a cheaper one.
Example:
x = y * 2 ⇒ x = y + y (Optimized code)
Loop Optimization:
3. Loop Merging/Combining:
If the operations performed by two loops over the same range can be done in a single loop, merge or combine the loops.
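A C sketch (function and array names are assumptions for the example):

void zero_arrays(int a[], int b[], int n)
{
    int i;
    /* Before: two separate loops traverse the same index range */
    for (i = 0; i < n; i++)
        a[i] = 0;
    for (i = 0; i < n; i++)
        b[i] = 0;
}

void zero_arrays_merged(int a[], int b[], int n)
{
    int i;
    /* After merging: one loop does both, halving the loop overhead */
    for (i = 0; i < n; i++) {
        a[i] = 0;
        b[i] = 0;
    }
}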
4. Loop Unrolling:
If the loop body is small, the loop can be replaced by several copies of its body, reducing the number of times the loop test and increment execute.
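A C sketch of unrolling by a factor of 2 (again with assumed names):

void clear(int a[])
{
    int i;
    /* Before: 100 loop tests and increments */
    for (i = 0; i < 100; i++)
        a[i] = 0;
}

void clear_unrolled(int a[])
{
    int i;
    /* After unrolling: 50 loop tests and increments */
    for (i = 0; i < 100; i += 2) {
        a[i] = 0;
        a[i + 1] = 0;
    }
}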
Flow Graphs
A flow graph is a directed graph that contains the flow-of-control information for a set of basic blocks.
A control flow graph is used to depict how program control is passed among the blocks. It is useful in loop optimization.
If we decrease the number of instructions in an inner loop, then the running time of a program may be improved even if we increase the amount of code outside that loop.
The principal loop optimizations are:
1. Code motion
2. Induction-variable elimination
3. Strength reduction
1. Code Motion:
Code motion is used to decrease the amount of code in a loop. This transformation takes a statement or expression that can be moved outside the loop body without affecting the semantics of the program and evaluates it before the loop. For example, as shown earlier, the loop-invariant expression limit - 2 in while (i <= limit - 2) can be computed once before the loop as t = limit - 2.
2. Induction-Variable Elimination
In the flow graph discussed above, we can replace the assignment t4 := 4*j by t4 := t4 - 4. The only problem is that t4 does not have a value when we enter block B2 for the first time, so we place an assignment t4 := 4*j on entry to block B2.
3. Reduction in Strength
Strength reduction replaces an expensive operation (here the multiplication 3*i) with a cheaper one (an addition updated on each iteration).
Example:
Before:
while (i < 10)
{
    j = 3*i + 1;
    a[j] = a[j] - 2;
    i = i + 2;
}
After:
s = 3*i + 1;
while (i < 10)
{
    j = s;
    a[j] = a[j] - 2;
    i = i + 2;
    s = s + 6;
}
To optimize code efficiently, the compiler collects information about the whole program and distributes this information to each block of the flow graph. This process is known as data-flow analysis.
Certain optimizations can be achieved only by examining the entire program; they cannot be achieved by examining just a portion of it. Use-definition (ud-) chaining is one particular problem of this kind: for each use of a variable, we try to find out which definitions of that variable can apply at that statement.
Based on local information alone, a compiler can perform some optimizations. For example, consider the following code:
x = a + b;
x = 6 * 3;
In this code, the first assignment to x is useless: the value computed for x is never used in the program. At compile time the expression 6 * 3 can be computed, simplifying the second assignment to x = 18.
Some optimizations need more global information. For example, consider the following code:
1. a = 1;
2. b = 2;
3. c = 3;
4. if (....) x = a + 5;
5. else x = b + 4;
6. c = x + 1;
In this code, the assignment to c at line 3 is useless, and the expression x + 1 can be simplified to 7, since x equals 6 on both branches.
But it is less obvious how a compiler can discover these facts by looking only at one or two consecutive statements. A more global analysis is required, so that the compiler knows at each point in the program, for example, which definitions of each variable can reach that point and whether the value computed at a statement is used later.
Data-flow analysis is used to discover this kind of property. It can be performed on the program's control flow graph (CFG). The control flow graph of a program is used to determine those parts of the program to which a particular value assigned to a variable might propagate.
Peephole Optimization
Before we dive into the discussion of the peephole, let's understand how a program is executed. When a program runs, the source code first gets compiled to target code. Source code is the code written by the user, generally in a high-level language such as Python or C++, whereas target code is machine code (in the form of 0s and 1s) that the machine can execute directly. The code's performance can be improved by various program transformation techniques, which make the program consume fewer resources and deliver higher speed.
Key characteristics of peephole optimization:
1. It is applied to the target code, after the source code has been converted to the target code.
2. It is a machine-dependent optimization: it occurs after the target code has been generated and is tailored to the target machine architecture. It makes use of CPU registers and may use absolute memory references rather than relative memory references.
3. It is applied to a small piece of code, repeatedly.
Its objectives are to:
1. Improve performance
2. Reduce memory footprint
3. Reduce code size
There are mainly four steps in Peephole Optimization in Compiler Design. The
steps are as follows:
Identification
The first step says that you must identify the code section where you need
the Peephole Optimization.
A peephole is a small, sliding window of instructions; the window size depends on the specific optimization being performed. The compiler examines the instructions that fall within the window.
Optimization
In the next step, you apply the optimization rules pre-defined for the peephole.
The compiler searches for specific patterns of instructions within the window.
There can be many kinds of patterns, such as inefficient code, sequences of loads and stores, or more complex patterns involving branches.
Analysis
After the pattern is identified, the compiler will make the changes in the
instructions.
Now the compiler will cross-check the codes to determine whether the
changes improved the code.
It will check the improvement based on size, speed and memory usage.
Iteration
The above steps repeat in a loop, finding peepholes until no more optimization opportunities are left in the code.
The compiler goes through the instructions one at a time, makes the changes, and re-analyzes the code for the best result.
Redundant Load and Store Elimination
In this optimization, redundant operations are removed. For example, the loading and storing of values in registers can be optimized. Consider:
a = b + c;
d = a + e;
A straightforward translation is:
MOV b, R0
ADD c, R0
MOV R0, a
MOV a, R0
ADD e, R0
MOV R0, d
The third instruction copies the value in register R0 to a, and the fourth immediately loads a back into R0 for the next operation, so the fourth instruction is redundant. The optimized implementation is:
MOV b, R0
ADD c, R0
MOV R0, a
ADD e, R0
MOV R0, d
Strength Reduction
Initial code:
b = a * 2;
Optimized code:
b= a << 1;
//left shifting the bit
Initial code:
b = a / 2;
Optimized code:
b = a >> 1;
// right shifting the bit by one will give the same result
Algebraic Simplification
Algebraic expressions that are useless or written inefficiently are transformed. For example:
a = a + 0
a = a * 1
a = a / 1
a = a - 0
// All of the above expressions cause needless computation
// and can be removed for optimization
Replacing Slower Instructions With Faster Ones
Slower instructions can be replaced with faster ones, and registers play an important role here. For example, a register that supports a unit-increment operation will perform better than adding one to the register; the same can be done for many other operations, like multiplication.
ADD #1
SUB #1
// The above instructions can be replaced with
// INC R
// DEC R
// if the register supports increment and decrement
Here X is loaded onto the stack twice and then multiplied. We can use the dup instruction instead: it copies the value on top of the stack, so X need not be loaded again. This works faster and is preferred over the slower sequence.
load X
load X
mul
// The above instructions can be replaced with
load X
dup
mul
Dead Code Elimination
Dead code can be eliminated to improve the system's performance; resources are freed and less memory is needed.
int dead(void)
{
    int a = 1;
    int b = 5;
    int c = a + b;
    return c;
    // c is returned
    // the remaining code is dead code, never reachable
    int k = 1;
    k = k * 2;
    k = k + b;
    return k;
    // this dead code can be removed for optimization
}
Moreover, null sequences and useless operations can be deleted too.