CD Unit - 4
Storage Organization
When the target program executes, it runs in its own logical address space, in which each program value has a location. The management and organization of this logical address space are shared among the compiler, the operating system, and the target machine. The operating system maps each logical address into a physical address; the physical addresses are usually spread throughout memory.
Run-time storage comes in blocks, where a byte is the smallest unit of addressable memory. Four bytes typically form a machine word. Multibyte objects are stored in consecutive bytes and are addressed by the address of their first byte.
Run-time storage can be subdivided to hold the different components of an
executing program:
1. Generated executable code
2. Static data objects
3. Dynamic data objects - heap
4. Automatic data objects - stack
Static Allocation:
Names are bound to storage at compile time only, so every time a procedure is invoked its names are bound to the same storage locations. Consequently, the values of local names are retained across activations of a procedure. Here the compiler can decide where the activation records go with respect to the target code and can also fill in the addresses in the target code for the data it operates on.
Stack Allocation:
Storage is organized as a stack, and activation records are pushed and popped as activations begin and end, respectively. Locals are contained in activation records, so they are bound to fresh storage in each activation. Stack allocation supports recursion.
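For instance, a minimal C sketch (the function fact is only an illustration) shows why stack allocation is needed for recursion: each activation gets fresh storage for its parameter n.

#include <stdio.h>

/* Each call to fact pushes a new activation record, so the
   parameter n gets fresh storage in every activation. */
int fact(int n)
{
    if (n <= 1)
        return 1;
    return n * fact(n - 1);  /* recursive call pushes a new record */
}

int main()
{
    printf("%d\n", fact(5)); /* prints 120 */
    return 0;
}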
Basic terminology:
Call by Value: In call by value, the calling procedure passes the r-value of each actual parameter, and the compiler places it into the called procedure's activation record. Formal parameters hold the values passed by the calling procedure, so any changes made to the formal parameters do not affect the actual parameters.
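A minimal C sketch of call by value (swap is only an illustrative function):

#include <stdio.h>

/* swap receives copies (r-values), so the caller's variables
   are left unchanged. */
void swap(int x, int y)
{
    int t = x;
    x = y;
    y = t;  /* changes only the copies in swap's activation record */
}

int main()
{
    int a = 1, b = 2;
    swap(a, b);
    printf("%d %d\n", a, b); /* still prints: 1 2 */
    return 0;
}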
Call by Reference: In call by reference, the formal and actual parameters refer to the same memory location. The l-value of each actual parameter is copied into the activation record of the called function; thus, the called function has the address of the actual parameter. If an actual parameter does not have an l-value (e.g., i + 3), it is evaluated in a new temporary location and the address of that location is passed. Any changes made to the formal parameter are reflected in the actual parameter (because the changes are made at that address).
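The same swap written with pointers illustrates call-by-reference behavior in C:

#include <stdio.h>

/* The l-values (addresses) of a and b are passed, so changes made
   through the formal parameters affect the actual parameters. */
void swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

int main()
{
    int a = 1, b = 2;
    swap(&a, &b);
    printf("%d %d\n", a, b); /* prints: 2 1 */
    return 0;
}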
Call by Copy Restore: In call by copy-restore, the compiler copies the values into the formal parameters when the procedure is called and copies them back into the actual parameters when control returns to the calling function. The r-values are passed, and on return the r-values of the formals are copied into the l-values of the actuals.
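C has no built-in copy-restore, but its effect can be sketched by copying the value in on entry and copying it back on return (update is a hypothetical illustration):

#include <stdio.h>

int a = 10;

void update(int *actual)
{
    int formal = *actual; /* copy in: the r-value is passed */
    formal = formal + 5;  /* work only on the local copy */
    *actual = formal;     /* copy back into the actual on return */
}

int main()
{
    update(&a);
    printf("%d\n", a);    /* prints: 15 */
    return 0;
}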
Call by Name: In call by name, the actual parameters are substituted for the formals in all the places the formals occur in the procedure. It is also referred to as lazy evaluation because a parameter is evaluated only when it is needed.
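C has no call by name either, but macro expansion gives a rough feel for the substitution (TWICE is an illustrative macro, not a standard one):

#include <stdio.h>

/* The argument text i + 3 is substituted wherever the formal x
   occurs and is re-evaluated at each use. */
#define TWICE(x) ((x) + (x))

int main()
{
    int i = 2;
    /* expands to ((i + 3) + (i + 3)) */
    printf("%d\n", TWICE(i + 3)); /* prints: 10 */
    return 0;
}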
Activation Record
An activation record typically contains fields for the return value, parameter list, control links, access links, saved machine status, local data, and temporaries.
Temporaries:
The temporary values, such as those arising in the evaluation of expressions, are
stored in the field for temporaries.
Local data:
The field for local data holds data that is local to an execution of a procedure.
Saved Machine Status:
The field for saved machine status holds information about the state of the machine just before the procedure is called. This information includes the value of the program counter and the machine registers that have to be restored when control returns from the procedure.
Access Link:
#include <stdio.h>

int g = 12;

void Geeks()
{
    printf("%d", g);
}

int main()
{
    Geeks();
    return 0;
}
In this example, when Geeks() is called from main(), its task is to print g, but g is not defined within the local scope of Geeks(). In this case, Geeks() uses the access link to reach g in the global scope and then prints its value (g = 12).
The chain of access links traces the static structure (the nesting of scopes) of the program.
Now, let’s take another example to understand the concept of access link in detail –
#include <stdio.h>

/* The definition of geeks() and the start of main() are missing in the
   notes; the reconstruction below is consistent with the stated
   output of 300. */
int geeks(int c) {
    return c + 100;
}

int geek1(int b) {
    return geeks(2 * b);
}

int main() {
    int a = 100; /* assumed initial value, so that geek1(a) yields 300 */
    (void) printf("The answer is %d\n", geek1(a));
    return 0;
}
No errors are detected while compiling the program, and the correct answer, 300, is displayed. Now, let's discuss the nesting paths. Each nested procedure's activation record (AR) carries an access link that lets the procedure reach the AR of the most recent activation of its immediately enclosing procedure. So, in this example, the access link for geeks and the access link for geek1 would each point to the AR of the activation of main.
Each activation record gets a pointer called the access link that facilitates a direct implementation of the normal static scope rule.
Control Links:
A control link refers to the activation record of the caller; because it records the dynamic calling sequence, it is also called a dynamic link. When a function calls another function, the control link in the callee's record points to the activation record of the caller; thus a record contains a control link pointing to the previous record on the stack. The chain of control links traces the dynamic execution of the program.
Example –
#include <stdio.h>

void geeks(int x)
{
    printf("value of x is: %d", x);
}

int main()
{
    geeks(10);
    return 0;
}
Another example:

#include <stdio.h>

int geeks();

int main() {
    int x, y;
    // Calling a function
    geeks();
    return 0;
}

int geeks() {
    /* body missing in the notes; conceptually it reaches x and y
       in main() through the access link */
    return 0;
}

When the function geeks() is called, it uses the access link to reach x and y (statically scoped) in its calling function main().
Parameter List:
The field for the parameter list is used by the calling procedure to supply parameters to the called procedure. We show space for parameters in the activation record, but in practice, parameters are often passed in machine registers for greater efficiency.
Return value:
The field for the return value is used by the called procedure to return a value to
the calling procedure. Again, in practice, this value is often returned in a register
for greater efficiency.
Procedure Calls
Calling sequence:
1. S → call id(Elist)
2. Elist → Elist, E
3. Elist → E
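As an illustration, for a hypothetical source call p(a + b, c), this scheme emits code to evaluate each argument, one param instruction per argument, and finally the call (p, a, b, c are assumed names):

t1 := a + b
param t1
param c
call p, 2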
Displays
An access link is a pointer in each activation record that yields a direct implementation of lexical scope for nested procedures; in other words, access links are used to implement lexically scoped languages. An access link may be required to locate data needed by the called procedure.
An improved scheme for handling static links defined at various lexical levels is to use a data structure called a display. A display is an array of pointers to activation records: Display[0] contains a pointer to the activation record of the most recent activation of a procedure defined at lexical level 0.
The number of elements in the display array is given by the maximum level of
nesting in the input source program. In the display scheme of accessing the non-
local variables defined in the enclosing procedures, each procedure on activation
stores a pointer to its activation record in the display array at its lexical level.
It saves the previous value at that location in the display array and restores it when the procedure exits. The advantage of the display scheme is that the activation record of any enclosing procedure at lexical level n can be fetched directly using Display[n], as opposed to traversing the access links as in the previous scheme.
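A minimal C sketch of how a display might be maintained (all names here are hypothetical, and MAX_NESTING is an assumed bound on lexical nesting):

#define MAX_NESTING 8

void *display[MAX_NESTING]; /* display[L] points to the AR of the most
                               recent activation at lexical level L */

void on_entry(int level, void *ar, void **saved)
{
    *saved = display[level]; /* save the previous pointer at this level */
    display[level] = ar;     /* install the current activation record */
}

void on_exit(int level, void *saved)
{
    display[level] = saved;  /* restore the saved pointer on exit */
}

A non-local defined at lexical level n is then reached directly through display[n].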
There are two types of scope rules for non-local names, as follows:
Lexical Scope
Lexical scope is also known as static scope. Under this rule, the scope of a name is determined from the text of the program alone. PASCAL, C, and ADA are languages that use the static scope rule; these languages are also known as block-structured languages.
Dynamic Scope
Dynamic scope rules are used for non-block-structured languages. Under this rule, a use of a non-local variable refers to the non-local data declared in the most recently called and still active procedure. There are two methods to implement non-local access under dynamic scoping:
Deep Access − The basic idea is to keep a stack of active variables: use control links instead of access links and, to find a variable, search the stack from top to bottom looking for the most recent activation record that contains space for the desired variable. Since the search goes "deep" into the stack, the method is called deep access. With this method, a symbol table must be available at run time.
Shallow Access − The idea is to keep central storage and allot one slot for every variable name. If names are not created at run time, the storage layout can be fixed at compile time.
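A minimal C sketch of shallow access, assuming one hypothetical central slot per variable name:

int slot_x; /* central storage slot for a variable named x */

void proc(void)
{
    int saved = slot_x; /* save the caller's binding of x on entry */
    slot_x = 42;        /* install this activation's x */
    /* ... body: every use of x reads or writes slot_x directly ... */
    slot_x = saved;     /* restore the old binding on exit */
}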
Principal Sources of Optimization
A transformation of a program is called local if it can be performed by looking only at the statements in a basic block; otherwise, it is called global. Many transformations can be performed at both the local and global levels. Local transformations are usually performed first.
Function-Preserving Transformations
A compiler can improve a program without changing the function it computes. Common examples of such function-preserving transformations are:
Common subexpression elimination
Copy propagation
Dead-code elimination
Constant folding
Frequently, a program will include several calculations of the same value. Some of these duplicate calculations cannot be avoided by the programmer because they lie below the level of detail accessible within the source language.
Common Subexpression Elimination:
An occurrence of an expression E is called a common subexpression if E was previously computed and the values of the variables in E have not changed since the previous computation. We can avoid recomputing the expression if we can use the previously computed value.
For example:
t1:=4*i
t2:=a[t1]
t3:=4*j
t4:=4*i
t5:=n
t6:=b[t4]+t5
The above code can be optimized using the common sub-expression elimination as
t1:=4*i
t2:=a[t1]
t3:=4*j
t5:=n
t6:=b[t1]+t5
The common subexpression t4 := 4*i is eliminated, since its value is already available in t1; accordingly, b[t4] becomes b[t1].
Copy Propagation:
Assignments of the form f := g are called copy statements, or copies for short. The idea behind copy propagation is to use g for f wherever possible after the copy statement f := g, that is, to use one variable in place of another. This may not appear to be an improvement, but it gives us an opportunity to eliminate the copy.
For example:
x = Pi;
A = x * r * r;
After copy propagation, the second statement becomes A = Pi * r * r, and the assignment to x can then be eliminated as dead code.
Dead-Code Elimination:
Dead (or useless) code consists of statements that compute values that never get used. While the programmer is unlikely to introduce dead code intentionally, it may appear as the result of previous transformations.
Example:
i = 0;
if (i == 1)
    a = b + 5;
Here, the 'if' statement is dead code because its condition is never satisfied, so the assignment a = b + 5 can be removed.
Constant Folding:
Deducing at compile time that the value of an expression is a constant, and using the constant instead, is known as constant folding. For example, the expression 2 * 3.14 can be replaced by the constant 6.28 at compile time.
Loop Optimizations:
Programs tend to spend the bulk of their time in loops, especially inner loops. The running time of a program may be improved if we decrease the number of instructions in an inner loop, even if we increase the amount of code outside that loop.
Code Motion:
An important transformation that decreases the amount of code in a loop is code motion. This transformation takes an expression that yields the same result independent of the number of times the loop is executed (a loop-invariant computation) and evaluates it before the loop. (The phrase "before the loop" assumes the existence of an entry for the loop.) For example, evaluation of limit - 2 is loop-invariant in while (i <= limit - 2), and code motion transforms the loop into:
t = limit - 2;
while (i <= t) /* statement does not change limit or t */
Induction Variables:
Loops are usually processed inside out. For example, consider a loop around a block B3 in which j and t4 are updated on every iteration. The values of j and t4 remain in lockstep: every time the value of j decreases by 1, the value of t4 decreases by 4, because t4 holds 4*j. Such variables are called induction variables. When there are two or more induction variables in a loop, it may be possible to get rid of all but one by the process of induction-variable elimination; for the inner loop around B3, this is possible for t4 in B3 and j in B4.
Basic Blocks:
A basic block is a straight-line code sequence with no branches in except to its entry and no branches out except at its exit. A basic block is thus a set of statements that always execute one after another, in sequence.
The first task is to partition a sequence of three-address code into basic blocks. A new basic block begins with a leader instruction, and instructions are added until a jump or a label is met; in the absence of a jump, control moves sequentially from one instruction to the next. The idea is standardized in the algorithm below:
Process: Determine which instructions in the intermediate code are leaders.
The following rules are used for finding a leader:
1. The first instruction of the intermediate code is a leader.
2. The target of any conditional or unconditional jump is a leader.
3. The instruction immediately following a jump is a leader.
The basic block of each leader contains the leader itself and all instructions up to, but excluding, the next leader.
Example 1:
t1 := a*a
t2 := a*b
t3 := 2*t2
t4 := t1+t3
t5 := b*b
t6 := t4+t5
This sequence contains no jumps or labels, so it forms a single basic block.
A three address statement x:= y+z is said to define x and to use y and z. A name in
a basic block is said to be live at a given point if its value is used after that point in
the program, perhaps in another basic block.
Example 2:
Intermediate code to set a 10*10 matrix to an identity matrix:
Steps (3)-(6) are used to make elements 0, and step (14) is used to make an element 1. These steps are executed repeatedly via goto statements.
B1) Statement 1
B2) Statement 2
B3) Statements 3-9
B5) Statement 12
Structure-Preserving Transformations:
The structure-preserving transformations on basic blocks include:
1. Dead-code elimination
2. Common subexpression elimination
3. Renaming of temporary variables
4. Interchange of two independent adjacent statements
1. Dead-Code Elimination:
Dead code is that part of the code which never executes during program execution, so for optimization it is eliminated. Eliminating dead code reduces the program's size and the compiler's work, since the dead code need not be translated.
Example: if a name x is never used after the statement x = y + z in a block, the statement can be safely removed without changing the value of the block.
2. Common Subexpression Elimination:
In this technique, subexpressions that occur more than once are calculated only once and reused when needed. A DAG (Directed Acyclic Graph) is used to detect and eliminate common subexpressions. For example, in
a := b + c
b := a - d
c := b + c
d := a - d
the expression a - d in the fourth statement is a common subexpression (a and d are unchanged since the second statement), so the statement can be replaced by d := b.
3. Renaming of Temporary Variables:
A statement t := b + c, where t is a temporary, can be changed to u := b + c, where u is a new temporary, with all uses of this instance of t changed to u, without changing the value of the basic block.
4. Interchange of Two Independent Adjacent Statements:
If a block has two adjacent statements that are independent, they can be interchanged without affecting the value of the basic block.
Example:
t1 = a + b
t2 = c + d
These two independent statements of a block can be interchanged without affecting
the value of the block.
Algebraic Transformation:
1. Constant Folding
2. Copy Propagation
3. Strength Reduction
1. Constant Folding:
Evaluate constant subexpressions at compile time so the compiled program does not need to compute them at run time.
Example:
x = 2 * 3 + y ⇒ x = 6 + y (Optimized code)
2. Copy Propagation:
Variable Propagation:
x = y; z = x + 2 ⇒ z = y + 2
Constant Propagation:
x = 3; z = x + a ⇒ z = 3 + a
3. Strength Reduction:
Replace an expensive operation with a cheaper one.
Example:
x = y * 2 ⇒ x = y + y (Optimized code)
Loop Optimization:
3. Loop Merging/Combining:
If the operations performed by two loops over the same range can be done in a single loop, merge or combine the loops.
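A C sketch (function and array names are assumptions for the example):

void zero_arrays(int a[], int b[], int n)
{
    int i;
    /* Before: two separate loops traverse the same index range */
    for (i = 0; i < n; i++)
        a[i] = 0;
    for (i = 0; i < n; i++)
        b[i] = 0;
}

void zero_arrays_merged(int a[], int b[], int n)
{
    int i;
    /* After merging: one loop does both, halving the loop overhead */
    for (i = 0; i < n; i++) {
        a[i] = 0;
        b[i] = 0;
    }
}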
4. Loop Unrolling:
If the loop body is small, the loop can be replaced by several copies of its body, reducing the number of times the loop test and increment execute.
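A C sketch of unrolling by a factor of 2 (again with assumed names):

void clear(int a[])
{
    int i;
    /* Before: 100 loop tests and increments */
    for (i = 0; i < 100; i++)
        a[i] = 0;
}

void clear_unrolled(int a[])
{
    int i;
    /* After unrolling: 50 loop tests and increments */
    for (i = 0; i < 100; i += 2) {
        a[i] = 0;
        a[i + 1] = 0;
    }
}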
Flow Graphs
A flow graph is a directed graph that contains the flow-of-control information for a set of basic blocks.
A control flow graph is used to depict how program control is passed among the blocks. It is useful in loop optimization.
If we decrease the number of instructions in an inner loop, then the running time of a program may be improved even if we increase the amount of code outside that loop.
The principal loop optimizations are:
1. Code motion
2. Induction-variable elimination
3. Strength reduction
1. Code Motion:
Code motion is used to decrease the amount of code in a loop. This transformation takes a statement or expression that can be moved outside the loop body without affecting the semantics of the program and evaluates it before the loop. For example, as shown earlier, the loop-invariant expression limit - 2 in while (i <= limit - 2) can be computed once before the loop as t = limit - 2.
2. Induction-Variable Elimination
In the flow graph discussed above, we can replace the assignment t4 := 4*j by t4 := t4 - 4. The only problem is that t4 does not have a value when we enter block B2 for the first time, so we place an assignment t4 := 4*j on entry to block B2.
3. Reduction in Strength
Strength reduction replaces an expensive operation (here the multiplication 3*i) with a cheaper one (an addition updated on each iteration).
Example:
Before:
while (i < 10)
{
    j = 3*i + 1;
    a[j] = a[j] - 2;
    i = i + 2;
}
After:
s = 3*i + 1;
while (i < 10)
{
    j = s;
    a[j] = a[j] - 2;
    i = i + 2;
    s = s + 6;
}
To optimize code efficiently, the compiler collects information about the whole program and distributes this information to each block of the flow graph. This process is known as data-flow analysis.
Certain optimizations can be achieved only by examining the entire program; they cannot be achieved by examining just a portion of it. Use-definition (ud-) chaining is one particular problem of this kind: for each use of a variable, we try to find out which definitions of that variable can apply at that statement.
Based on local information alone, a compiler can perform some optimizations. For example, consider the following code:
x = a + b;
x = 6 * 3;
In this code, the first assignment to x is useless: the value computed for x is never used in the program. At compile time the expression 6 * 3 can be computed, simplifying the second assignment to x = 18.
Some optimizations need more global information. For example, consider the following code:
1. a = 1;
2. b = 2;
3. c = 3;
4. if (....) x = a + 5;
5. else x = b + 4;
6. c = x + 1;
In this code, the assignment to c at line 3 is useless, and the expression x + 1 can be simplified to 7, since x equals 6 on both branches.
But it is less obvious how a compiler can discover these facts by looking only at one or two consecutive statements. A more global analysis is required, so that the compiler knows at each point in the program, for example, which definitions of each variable can reach that point and whether the value computed at a statement is used later.
Data-flow analysis is used to discover this kind of property. It can be performed on the program's control flow graph (CFG). The control flow graph of a program is used to determine those parts of the program to which a particular value assigned to a variable might propagate.
Peephole Optimization
Before we dive into the discussion of the peephole, let's understand how a program is executed. When a program runs, the source code first gets compiled to target code. Source code is the code written by the user, generally in a high-level language such as Python or C++, whereas target code is machine code (in the form of 0s and 1s) that the machine can execute directly. The code's performance can be improved by various program transformation techniques, which make the program consume fewer resources and deliver higher speed.
Key characteristics of peephole optimization:
1. It is applied to the target code, after the source code has been converted to the target code.
2. It is a machine-dependent optimization: it occurs after the target code has been generated and is tailored to the target machine architecture. It makes use of CPU registers and may use absolute memory references rather than relative memory references.
3. It is applied to a small piece of code, repeatedly.
Its objectives are to:
1. Improve performance
2. Reduce memory footprint
3. Reduce code size
There are mainly four steps in Peephole Optimization in Compiler Design. The
steps are as follows:
Identification
The first step says that you must identify the code section where you need
the Peephole Optimization.
A peephole is a small, sliding window of instructions; the window size depends on the specific optimization being performed. The compiler examines the instructions that fall within the window.
Optimization
In the next step, you apply the optimization rules pre-defined for the peephole.
The compiler searches for specific patterns of instructions within the window.
There can be many kinds of patterns, such as inefficient code, sequences of loads and stores, or more complex patterns involving branches.
Analysis
After the pattern is identified, the compiler will make the changes in the
instructions.
Now the compiler will cross-check the codes to determine whether the
changes improved the code.
It will check the improvement based on size, speed and memory usage.
Iteration
The above steps repeat in a loop, finding peepholes until no more optimization opportunities are left in the code.
The compiler goes through the instructions one at a time, makes the changes, and re-analyzes the code for the best result.
Redundant Load and Store Elimination
In this optimization, redundant operations are removed. For example, the loading and storing of values in registers can be optimized. Consider:
a = b + c;
d = a + e;
A straightforward translation is:
MOV b, R0
ADD c, R0
MOV R0, a
MOV a, R0
ADD e, R0
MOV R0, d
The third instruction copies the value in register R0 to a, and the fourth immediately loads a back into R0 for the next operation, so the fourth instruction is redundant. The optimized implementation is:
MOV b, R0
ADD c, R0
MOV R0, a
ADD e, R0
MOV R0, d
Strength Reduction
Initial code:
b = a * 2;
Optimized code:
b= a << 1;
//left shifting the bit
Initial code:
b = a / 2;
Optimized code:
b = a >> 1;
// right shifting the bit by one will give the same result
Algebraic Simplification
Algebraic expressions that are useless or written inefficiently are transformed. For example:
a = a + 0
a = a * 1
a = a / 1
a = a - 0
// All of the above expressions cause needless computation
// and can be removed for optimization
Replacing Slower Instructions With Faster Ones
Slower instructions can be replaced with faster ones, and registers play an important role here. For example, a register that supports a unit-increment operation will perform better than adding one to the register; the same can be done for many other operations, like multiplication.
ADD #1
SUB #1
// The above instructions can be replaced with
// INC R
// DEC R
// if the register supports increment and decrement
Here X is loaded onto the stack twice and then multiplied. We can use the dup instruction instead: it copies the value on top of the stack, so X need not be loaded again. This works faster and is preferred over the slower sequence.
load X
load X
mul
// The above instructions can be replaced with
load X
dup
mul
Dead Code Elimination
Dead code can be eliminated to improve the system's performance; resources are freed and less memory is needed.
int dead(void)
{
    int a = 1;
    int b = 5;
    int c = a + b;
    return c;
    // c is returned
    // the remaining code is dead code, never reachable
    int k = 1;
    k = k * 2;
    k = k + b;
    return k;
    // this dead code can be removed for optimization
}
Moreover, null sequences and useless operations can be deleted too.