Compiler Optimization
In computing, optimization is the process of modifying a system to make some aspect of it work more
efficiently or use fewer resources. For instance, a computer program may be optimized so that it executes
more rapidly, is capable of operating within a reduced amount of memory, or draws less battery
power (e.g., on a portable computer). Optimization can occur at a number of levels. At the highest level, the design
may be optimized to make best use of the available resources. The implementation of this design will benefit
from the use of efficient algorithms and the coding of these algorithms will benefit from the writing of good
quality code. Use of an optimizing compiler can help ensure that the executable program is optimized. At the
lowest level, it is possible to bypass the compiler completely and write assembly code by hand. With modern
optimizing compilers and the greater complexity of recent CPUs, it takes great skill to write code that is better
than the compiler can generate and few projects ever have to resort to this ultimate optimization step.
Optimization will generally focus on one or two of execution time, memory usage, disk space, bandwidth, or
some other resource. This usually requires a trade-off, where one resource is optimized at the expense of others. For
example, increasing the size of a cache improves run-time performance but also increases memory consumption.
Other common trade-offs involve code clarity and conciseness.
“The order in which the operations shall be performed in every particular case is a very interesting and
curious question, on which our space does not permit us fully to enter. In almost every computation a great
variety of arrangements for the succession of the processes is possible, and various considerations must
influence the selection amongst them for the purposes of a Calculating Engine. One essential object is to
choose that arrangement which shall tend to reduce to a minimum the time necessary for completing the
calculation.” - Ada Byron’s notes on the analytical engine 1842.
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root
of all evil. Yet we should not pass up our opportunities in that critical 3%.” - Knuth
The optimizer (the program or compiler component that performs optimization) may itself have to be optimized. Compilation with
the optimizer turned on usually takes more time, though this is only a problem when the program is
significantly large. In particular, for just-in-time compilers the performance of the optimizer is a key factor in improving
execution speed. Spending more time usually yields better code, but compile time is itself precious computer time
that we want to save; thus, in practice, tuning requires a trade-off between the time taken for
optimization and the reduction in execution time gained by the optimized code.
Compiler optimization is the process of tuning the output of a compiler to minimize some attribute (or
maximize the efficiency) of an executable program. The most common requirement is to minimize the time taken
to execute a program; a less common one is to minimize the amount of memory occupied, and the growth of
portable computers has also created interest in minimizing the power consumed by a program. It has been shown that some code
optimization problems are NP-complete. In this chapter we discuss the various types of optimization and the targets of
optimization. We are now at the fifth phase of the compiler; the input to this phase is the intermediate code produced by the intermediate code generator:
[Figure: the intermediate code generator feeds intermediate code into the code optimization phase, whose output goes to code generation.]
Techniques in optimization can be broken up among various scopes which affect anything from a single
statement to an entire program. Some examples of scopes include:
[Figure: classification of optimization techniques into loop optimizations (loop inversion, loop interchange, loop nest optimization, loop reversal, loop unrolling, loop splitting, loop unswitching, code motion), data-flow optimizations (copy propagation, common subexpression elimination), SSA-based optimizations, functional language optimizations, and other optimizations (rematerialization, function chunking, code hoisting).]
Peephole optimization is usually performed late in the compilation process, after machine code has been generated. This form of optimization
examines a few adjacent instructions (like looking through a peephole at the code) to see whether they
can be replaced by a single instruction or a shorter sequence of instructions. Some peephole optimizations, such as
replacing a multiplication by a power of two with a shift, are also instances of strength reduction.
Example 7.1:
1. a = b + c;
2. d = a + e;
becomes
MOV b, R0
ADD c, R0
MOV R0, a
MOV a, R0 # redundant load, can be removed
ADD e, R0
MOV R0, d
After the peephole optimizer removes the redundant load, the sequence is:
MOV b, R0
ADD c, R0
MOV R0, a
ADD e, R0
MOV R0, d
[Figure: types of optimizations: peephole optimizations; local or intraprocedural optimizations; interprocedural or whole-program optimization; loop optimizations; machine independent vs. machine dependent optimizations; programming language-independent vs. language-dependent optimizations.]
Local or intraprocedural optimizations only consider information local to a single function definition. This reduces the amount of analysis that needs
to be performed, saving time and reducing storage requirements.
Interprocedural or whole-program optimizations analyse all of a program's source code. The greater quantity of information extracted means that optimizations
can be more effective than when they only have access to local information (i.e., within a single
function). This kind of analysis also enables new techniques, for instance function
inlining, where a call to a function is replaced by a copy of the function body.
Loop optimizations act on the statements which make up a loop, such as a for loop. They can have a significant
impact because many programs spend a large percentage of their time inside loops.
In addition to scoped optimizations there are two further general categories of optimization:
Programming language-independent vs. language-dependent: most high-level languages share common programming constructs and abstractions, such as decision (if, switch,
case), looping (for, while, repeat...until, do...while), and encapsulation (structures, objects). Thus similar optimization
techniques can be used across languages. However, certain language features make some kinds of optimization
possible or difficult. For instance, the existence of pointers in C and C++ makes certain optimizations of
array accesses difficult. Conversely, in some languages functions are not permitted to have "side effects";
therefore, if repeated calls to the same function with the same arguments are made, the compiler can immediately
infer that the result need only be computed once and then reused.
Machine independent vs. machine dependent: many optimizations that operate on abstract programming concepts (loops, objects, structures) are independent
of the machine targeted by the compiler, but many of the most effective optimizations are those that best
exploit special features of the target platform.
1. Avoid redundancy: If something has already been computed, it is generally better to store it and reuse
it later, instead of recomputing it.
2. Less code: There is less work for the CPU, cache, and memory. So, likely to be faster.
3. Straight line code/ fewer jumps: Less complicated code. Jumps interfere with the prefetching of
instructions, thus slowing down code.
4. Code locality: Pieces of code executed close together in time should be placed close together in
memory, which increases spatial locality of reference.
5. Extract more information from code: The more information the compiler has, the better it can
optimize.
6. Avoid memory accesses: Accessing memory, particularly if there is a cache miss, is much more
expensive than accessing registers.
7. Speed: Improving the runtime performance of the generated object code. This is the most common
optimisation.
8. Space: Reducing the size of the generated object code.
9. Safety: Reducing the possibility of data structures becoming corrupted (for example, ensuring that
an illegal array element is not written to).
“Speed” optimizations make the code larger, and many “Space” optimizations make the code slower — this
is known as the space-time tradeoff.
Many of the choices about which optimizations can and should be done depend on the characteristics of the
target machine. Some compilers can be parametrized with a machine description so that a single code base can optimize for different machines; GCC is a compiler which exemplifies this approach.
• Number of CPU registers: To a certain extent, the more registers, the easier it is to optimize for
performance. Local variables can be allocated in the registers and not on the stack. Temporary/
intermediate results can be left in registers without writing to and reading back from memory.
• RISC vs CISC: CISC instruction sets often have variable instruction lengths, often have a larger
number of possible instructions that can be used, and each instruction could take differing amounts
of time. RISC instruction sets attempt to limit the variability in each of these: instruction sets are
usually constant length, with few exceptions, there are usually fewer combinations of registers and
memory operations, and the instruction issue rate is usually constant in cases where memory latency
is not a factor. There may be several ways of carrying out a certain task, with CISC usually offering
more alternatives than RISC.
• Pipelines: A pipeline is essentially an ALU broken up into an assembly line. It allows use of parts of
the ALU for different instructions by breaking up the execution of instructions into various stages:
instruction decode, address decode, memory fetch, register fetch, compute, register store, etc. One
instruction could be in the register store stage, while another could be in the register fetch stage.
Pipeline conflicts occur when an instruction in one stage of the pipeline depends on the result of
another instruction ahead of it in the pipeline but not yet completed. Pipeline conflicts can lead to
pipeline stalls: where the CPU wastes cycles waiting for a conflict to resolve.
• Number of functional units: Some CPUs have several ALUs and FPUs (Floating Point Units). This
allows them to execute multiple instructions simultaneously. There may be restrictions on which
instructions can pair with which other instructions and which functional unit can execute which
instruction. They also have issues similar to pipeline conflicts.
• Cache Size (256 KB–4 MB) & type (direct mapped, 2-/4-/8-/16-way associative, fully associative):
Techniques like inline expansion may increase the size of the generated code and reduce code
locality. The program may slow down drastically if an oft-run piece of code suddenly cannot fit in the
cache. Also, caches which are not fully associative have higher chances of cache collisions even in
an unfilled cache.
• Cache/memory transfer rates: These give the compiler an indication of the penalty for cache
misses. This is used mainly in specialized applications.
A basic block is a straight-line piece of code without jump targets in the middle; jump targets, if any, start a
block, and jumps end a block.
or
A sequence of instructions forms a basic block if the instruction in each position dominates, or always
executes before, all those in later positions, and no other instruction executes between two instructions in the
sequence.
Basic blocks are usually the basic unit to which compiler optimizations are applied. Basic
blocks form the vertices or nodes of a control flow graph. The
blocks to which control may transfer after reaching the end of a block are called that block's successors, while
the blocks from which control may have come when entering a block are called that block's predecessors.
Instructions which begin a new basic block include
• Procedure and function entry points.
• Targets of jumps or branches.
• Instructions following some conditional branches.
• Instructions following ones that throw exceptions.
• Exception handlers.
Instructions that end a basic block include
• Unconditional and conditional branches, both direct and indirect.
• Returns to a calling procedure.
• Instructions which may throw an exception.
• Function calls can be at the end of a basic block if they may not return, such as functions which
throw exceptions or special calls.
A control flow graph (CFG) is a representation, using graph notation, of all paths that might be traversed
through a program during its execution. Each node in the graph represents a basic block and directed edges are
used to represent jumps in the control flow. There are two specially designated blocks: the entry block, through
which control enters into the flow graph, and the exit block, through which all control flow leaves.
[Figure: an example control flow graph whose nodes are basic blocks numbered 1 to 10.]
Terminology: These terms are commonly used when discussing control flow graphs
Entry block: The block through which all control flow enters the graph. (In the figure, 4 is the entry block for 5, 6, 7, 8, 9 and 10.)
Exit block: The block through which all control flow leaves the graph. (8 is the exit block.)
Back edge: An edge that points to an ancestor in a depth-first (DFS) traversal of the graph. (the edge from 10 to 3)
Critical edge: An edge which is neither the only edge leaving its source block, nor the only edge entering
its destination block. These edges must be split (a new block must be created in the middle of the edge) in order
to insert computations on the edge.
Abnormal edge: An edge whose destination is unknown. These edges tend to inhibit optimization. Excep-
tion handling constructs can produce them.
Impossible edge / Fake edge: An edge which has been added to the graph solely to preserve the property
that the exit block postdominates all blocks. It can never be traversed.
Dominator: Block M dominates block N (written M dom N) if every path from the entry node to block N
has to pass through block M, i.e., every possible execution path from entry to N includes M. The entry block
dominates all blocks. The dominance relation satisfies three properties:
(i) Reflexive: every node dominates itself.
(ii) Transitive: if A dom B and B dom C, then A dom C.
(iii) Antisymmetric: if A dom B and B dom A, then A = B.
In the example graph, 4 dominates 7, 8, 9 and 10; 5 and 6 do not dominate 7 because there is an edge from 4 to 7 that bypasses them, so 5 dominates only 5 and, similarly, 6 dominates only 6.
Post-dominator: Block M postdominates block N if every path from N to the exit has to pass through block
M. The exit block postdominates all blocks. ( 7 is Postdominator for 8 and 9 )
Immediate dominator: Block M immediately dominates block N (written M idom N) if M dominates N
and there is no intervening block P such that M dominates P and P dominates N. In other words, M is the last
dominator on any path from entry to N. Each block has a unique immediate dominator, if it has one at all. (7
is the immediate dominator of 9, but not of 10.)
Immediate postdominator: Defined analogously to the immediate dominator, using postdominance.
Dominator tree: An ancillary data structure depicting the dominator relationships: there is an edge from
block M to block N if M is the immediate dominator of N.
Postdominator tree: Similar to the dominator tree, but rooted at the exit block.
Loop header: A loop header dominates all blocks in the loop body; it is sometimes called the entry point
of the loop.
(In the example, 1, 2 and 3 are loop headers.)
[Figure: dominator tree for the example control flow graph, with nodes 1-10.]
Common subexpression elimination (CSE) searches for instances of identical expressions and replaces the recomputation by a previously computed value. For example:
Original Code:
B1: t1 = a
    t2 = b
    t3 = a + b
    t4 = t3 * a
    t6 = a * b
    t7 = a + b
    a = t7
    if a > b goto B2
B2: t7 = a + b
B3: t8 = a * b
B4: t9 = a - b
    t10 = a * t9
B5: t11 = a - b
    t12 = a * b
    t13 = t11 * t12
After local common sub-expression elimination (within B1, the second computation of a + b reuses t3):
B1: t1 = a
    t2 = b
    t3 = a + b
    t4 = t3 * a
    t6 = a * b
    t7 = t3
    a = t7
    if a > b goto B2
(the other blocks are unchanged)
After global common sub-expression elimination (redundant computations are also removed across blocks):
B2: t7 = t3
B5: t11 = a - b
    t12 = t6
    t13 = t11 * t12
(the remaining blocks are as after local elimination)
The common subexpression can be avoided by storing its value in a temporary variable which caches
its result. After applying this common subexpression elimination technique the program becomes:
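The original program for this passage is not reproduced here; a minimal sketch of the transformation it describes, with the variables a, b, c, d and tmp assumed purely for illustration, is:

/* Before: b + c is computed twice. */
a = b + c;
d = 10 * (b + c);

/* After common subexpression elimination: the value is cached in a temporary. */
tmp = b + c;
a = tmp;
d = 10 * tmp;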
Thus in the last statement the recomputation of the expression b + c is avoided. Compiler writers distin-
guish two kinds of CSE:
• Local Common Subexpression Elimination works within a single basic block and is thus a simple
optimization to implement.
• Global Common Subexpression Elimination works on an entire procedure, and relies on dataflow
analysis to determine which expressions are available at which points in a procedure.
Copy propagation
Copy propagation is the process of replacing the occurrences of targets of direct assignments with their
values. A direct assignment is an instruction of the form x = y, which simply assigns the value of y to x.
From the following code:
y=x
z=3+y
Copy propagation would yield:
z=3+x
Copy propagation often makes use of reaching definitions, use-def chains and def-use chains when
computing which occurrences of the target may be safely replaced. If all upward-exposed uses of the target
may be safely modified, the assignment operation may be eliminated. Copy propagation is a useful "clean up"
optimization frequently used after other optimizations have already been run; some optimizations require that
copy propagation be run afterward in order to achieve an increase in efficiency. Copy propagation is also
classified as local and global copy propagation:
• Local copy propagation is applied within an individual basic block.
• Global copy propagation is applied across all of the code's basic blocks.
Dead code elimination
Dead code is code that does not affect the behaviour of the program, for example definitions which
have no uses, or code which can never execute regardless of the input. Removing such instructions is known
as dead code elimination, a size optimization (although it also produces some speed improvement) that
aims to remove logically impossible statements from the generated object code. A technique commonly used in
debugging is to optionally activate blocks of code; using an optimizer with dead code elimination eliminates the
need for a preprocessor to perform the same task.
Example 7.4: Consider the following program:
Original code:
B1: t1 = x
    t2 = y + t1
    t3 = t2
    t4 = z * t3
    if z > y goto B2
B2: t5 = z
    t6 = y
    t7 = t5 + t6
    t8 = x
    t9 = y + t8
    t10 = t9
B3: t11 = t3 + 5
    t12 = t1 * 9
    t13 = t12
After local copy propagation (within each block, copies such as t1, t3, t5 and t6 are substituted into later uses):
B1: t1 = x
    t2 = y + x
    t3 = t2
    t4 = z * t2
    if z > y goto B2
B2: t5 = z
    t6 = y
    t7 = z + y
    t8 = x
    t9 = y + t8
    t10 = t9
B3: t11 = t3 + 5
    t12 = t1 * 9
    t13 = t12
After global copy propagation the same substitutions are made across basic block boundaries as well; copies whose values are then never used become candidates for dead code elimination.
Example 7.5:
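The code for this example is not reproduced above; the following is a minimal C sketch consistent with the discussion below (the function name foo and the constants are assumptions):

int foo(void)
{
    int a = 24;
    int b = 25;   /* b is assigned but its value is never used */
    int c;

    c = a * 4;
    return c;

    b = 24;       /* unreachable: this assignment follows the return */
}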
The variable b is assigned a value after a return statement, which makes it impossible to get to. That is,
since code execution is linear, and there is no conditional expression wrapping the return statement, any code
after the return statement cannot possibly be executed. Furthermore, if we eliminate that assignment, then we
can see that the variable b is never used at all, except for its declaration and initial assignment. Depending on
the aggressiveness of the optimizer, the variable b might be eliminated entirely from the generated code.
Example 7.6:
Original code:
i = 1
j = 2
k = 3
l = 4
if i < j goto B2
After dead code elimination (the value assigned to k is never used):
i = 1
j = 2
l = 4
if i < j goto B2
Code motion moves a computation that produces the same result on every iteration out of the loop. In the code referred to above, a = a + c can be moved out of the for loop.
Likewise, the calculation of maximum - 1 and (4 + array[k]) * pi + 5 can be moved outside the loop and precalculated,
resulting in something similar to:
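The loops themselves are not shown above; a sketch consistent with the expressions mentioned, assuming j, k, maximum, array and pi are defined elsewhere and are not modified inside the loop, is:

/* Before: both maximum - 1 and (4 + array[k]) * pi + 5 are recomputed
   on every iteration even though they never change inside the loop. */
while (j < maximum - 1) {
    j = j + (4 + array[k]) * pi + 5;
}

/* After code motion: the loop-invariant expressions are computed once. */
int maxval  = maximum - 1;
int calcval = (4 + array[k]) * pi + 5;
while (j < maxval) {
    j = j + calcval;
}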
Induction variable elimination: An induction variable is a variable that gets increased or decreased by a fixed
amount on every iteration of a loop. A common compiler optimization is to recognize the existence of induction
variables and replace them with simpler computations.
Example 7.9:
(i) In the following loop (see the sketch below), i and j are induction variables.
(ii) In some cases it is possible to reverse this optimization and remove an induction variable from
the code entirely: the loop in the second function below has two induction variables, i and j, and either one
can be rewritten as a linear function of the other.
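The loops for (i) and (ii) are not reproduced above; a minimal C sketch under those assumptions (all names are hypothetical) is:

/* (i) Both i and j are induction variables: i increases by 1 and
   j by 17 on each iteration. */
for (i = 0; i < 10; i++) {
    j = 17 * i;          /* can be strength-reduced to j = j + 17 */
}

/* (ii) Here j is a linear function of the induction variable i
   (j == 2 * i), so j can be removed entirely. */
int f(int n)
{
    int j = 0;
    for (int i = 0; i < n; i++) {
        j += 2;
    }
    return j;            /* for n >= 0 this is equivalent to: return 2 * n; */
}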
Strength reduction is a compiler optimization in which complex or expensive operations are replaced
with simpler ones. In a procedural programming language this typically applies to an expression involving a loop
variable, and in a declarative language to the argument of a recursive function. One of the most
important uses of strength reduction is computing memory addresses inside a loop. Several peephole
optimizations also fall into this category, such as replacing division by a constant with multiplication by its
reciprocal, converting multiplications into a series of bit-shifts and adds, and replacing large instructions with
equivalent smaller ones that load more quickly.
Example 7.10: Multiplication can be replaced by addition.
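A minimal sketch of this replacement (the array a, its size n and the scale factor 8 are assumptions):

/* Before: the index is recomputed by multiplication on every iteration. */
for (i = 0; i < n; i++) {
    y = i * 8;
    a[y] = 0;
}

/* After strength reduction: the multiplication is replaced by an addition
   to a value carried over from the previous iteration. */
for (i = 0, y = 0; i < n; i++, y += 8) {
    a[y] = 0;
}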
Function chunking: Function chunking is a compiler optimization for improving code locality. Profiling infor-
mation is used to move rarely executed code outside of the main function body. This allows for memory pages
with rarely executed code to be swapped out.
As compiler technologies have improved, good compilers can often generate better code than human programmers,
and good post-pass optimizers can improve highly hand-optimized code even further. Compiler optimization
is the key to obtaining efficient code, because instruction sets are so compact that it is hard for a human
to manually schedule or combine small instructions to get efficient results. However, optimizing compilers are
by no means perfect. There is no way that a compiler can guarantee that, for all program source code, the fastest
(or smallest) possible equivalent compiled program is output. Additionally, there are a number of other, more
practical issues with optimizing compiler technology:
• Usually, an optimizing compiler only performs low-level, localized changes to small sets of opera-
tions. In other words, high-level inefficiency in the source program (such as an inefficient algorithm)
remains unchanged.
• Modern third-party compilers usually have to support several objectives. In so doing, these compil-
ers are a ‘jack of all trades’ yet master of none.
• A compiler typically only deals with a small part of an entire program at a time, at most a module at a
time and usually only a procedure; the result is that it is unable to consider at least some important
contextual information.
• The overhead of compiler optimization: any extra work takes time, and whole-program optimization
(interprocedural optimization) is very costly.
• The interaction of compiler optimization phases: what combination of optimization phases are opti-
mal, in what order and how many times?
Work to improve optimization technology continues. One approach is the use of so-called “post pass”
optimizers. These tools take the executable output by an “optimizing” compiler and optimize it even further. As
opposed to compilers which optimize intermediate representations of programs, post pass optimizers work on
the assembly language level.
Data flow analysis is a technique for gathering information about the possible set of values calculated at
various points in a computer program. A program’s control flow graph is used to determine those parts of
a program to which a particular value assigned to a variable might propagate. The information gathered is
often used by compilers when optimizing a program. A canonical example of a data flow analysis is
reaching definitions.
A simple way to perform data flow analysis of programs is to set up data flow equations for each node of the
control flow graph and solve them by repeatedly calculating the output from the input locally at each node until
the whole system stabilizes, i.e., it reaches a fixpoint. This general approach was developed by Gary Kildall
while teaching at the Naval Postgraduate School. Data flow analysis can be partitioned into two parts:
• Local data flow analysis: data flow analysis applied to only one basic block.
• Global data flow analysis: data flow analysis applied to an entire function at a time.
Data flow analysis is inherently flow-sensitive and typically path-insensitive. The relevant notions of sensitivity are:
• A flow-sensitive analysis takes into account the order of statements in a program. For example, a
flow-insensitive pointer alias analysis may determine “variables x and y may refer to the same
location”, while a flow-sensitive analysis may determine “after statement 20, variables x and y may
refer to the same location”.
• A path-sensitive analysis only considers valid paths through the program. For example, if two
operations at different parts of a function are guarded by equivalent predicates, the analysis must
only consider paths where both operations execute or neither operation executes. Path-sensitive
analyses are necessarily flow-sensitive.
• A context-sensitive analysis is an interprocedural analysis that takes the calling context into account
when analyzing the target of a function call. For example, consider a function that accepts a file
handle and a boolean parameter that determines whether the file handle should be closed before the
function returns. A context-sensitive analysis of any callers of the function should take into account
the value of the boolean parameter to determine whether the file handle will be closed when the
function returns.
Data flow analysis of programs is to set up data flow equations for each node of the control flow graph. General
form of data flow equation is :
OUT [ S ] = GEN [ S ] U ( IN [ S ] – KILL [ S ] )
We define the GEN and KILL sets as follows:
GEN[d : y = f(x1, ..., xn)] = {d}
KILL[d : y = f(x1, ..., xn)] = DEFS[y] – {d}
Where DEFS[y] is the set of all definitions that assign to the variable y. Here d is a unique label attached to
the assigning instruction.
The efficiency of iteratively solving data flow equations is influenced by the order in which the nodes are
visited, and by whether the data flow equations are used for forward or backward data flow analysis over the CFG.
In the following, a few iteration orders for solving data flow equations are discussed.
• Random order: This iteration order is not aware whether the data flow equations solve a forward or
backward data-flow problem. Therefore, the performance is relatively poor compared to specialized
iteration orders.
• Post order: This is a typical iteration order for backward data flow problems. In postorder iteration a
node is visited after all its successor nodes have been visited. Typically, the postorder iteration is
implemented with the depth-first strategy.
• Reverse post order: This is a typical iteration order for forward data flow problems. In reverse-
postorder iteration a node is visited before all its successor nodes have been visited, except when
the successor is reached by a back edge.
A definition of a variable 'x' is a statement that assigns, or may assign, a value to 'x'. The most common forms of
definition are assignments to 'x' and statements that read a value from an input device and store it in 'x'. These
statements certainly define a value for 'x' and are referred to as unambiguous definitions of 'x'. A definition 'd' of 'x'
reaches a point 'p' if there is a path from the point immediately following 'd' to 'p' such that 'd' is not killed along that path.
The most common forms of ambiguous definition of 'x' are:
• A call of a procedure with 'x' as a parameter, or of a procedure that can access 'x'.
• An assignment through a pointer that could refer to 'x'.
A definition of a variable is said to reach a given point in a function if there is an execution path from the
definition to that point along which the variable is not redefined. Reaching definitions can be computed in classic form as an 'iterative forward bit vector'
problem: 'iterative' because we construct a collection of data flow equations to represent the information flow
and solve them by iteration from an appropriate set of initial values; 'forward' because information flows in the
direction of execution along the control flow edges of the program; 'bit vector' because each
definition can be represented by a 1 or a 0.
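To make the iteration concrete, here is a small, self-contained C sketch of the iterative bit-vector algorithm on a hand-coded four-block CFG; the block structure and the GEN/KILL bit patterns are invented for illustration only, and a real compiler would derive them from its intermediate code.

#include <stdio.h>

#define NBLOCKS 4

/* preds[b] lists the predecessors of block b; -1 terminates each list. */
static const int preds[NBLOCKS][NBLOCKS] = {
    { -1 },          /* B0: entry                  */
    { 0, 3, -1 },    /* B1: loop header            */
    { 1, -1 },       /* B2: loop body              */
    { 2, -1 },       /* B3: latch, branches to B1  */
};

/* GEN and KILL as bit vectors over eight hypothetical definitions d0..d7. */
static const unsigned gen[NBLOCKS]  = { 0x03, 0x04, 0x18, 0x20 };
static const unsigned kill[NBLOCKS] = { 0x18, 0x20, 0x03, 0x04 };

int main(void)
{
    unsigned in[NBLOCKS] = { 0 }, out[NBLOCKS] = { 0 };
    int changed = 1;

    /* Iterate OUT[b] = GEN[b] U (IN[b] - KILL[b]) until a fixpoint is reached. */
    while (changed) {
        changed = 0;
        for (int b = 0; b < NBLOCKS; b++) {
            unsigned newin = 0;
            for (int i = 0; preds[b][i] != -1; i++)
                newin |= out[preds[b][i]];           /* IN[b] = union of OUT[p] */

            unsigned newout = gen[b] | (newin & ~kill[b]);
            if (newin != in[b] || newout != out[b]) {
                in[b] = newin;
                out[b] = newout;
                changed = 1;
            }
        }
    }

    for (int b = 0; b < NBLOCKS; b++)
        printf("B%d: IN = %#04x  OUT = %#04x\n", b, in[b], out[b]);
    return 0;
}

Visiting the blocks in reverse postorder, as discussed above, makes this forward problem converge in fewer passes.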
For this purpose we use a small grammar for structured programs, together with the control flow graphs of its
constructs, from which the data flow equations are calculated:
1. S → id := E | S ; S | if E then S else S | do S while E
2. E → id + id | id
Some optimization techniques primarily designed to operate on loops include:
• Code Motion
• Induction variable analysis
• Loop fission or loop distribution
• Loop fusion or loop combining
• Loop inversion
• Loop interchange
• Loop nest optimization
• Loop unrolling
• Loop splitting
• Loop unswitching
Loop fission (or loop distribution) is a technique that attempts to break a loop into multiple loops over the same index range, each
taking only a part of the original loop's body. The goal is to break a large loop body into smaller ones to achieve
better data locality. It is the reverse of loop fusion.
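A minimal sketch of loop fission (the arrays a and b and the bound n are assumptions):

/* Before fission: one loop walks two unrelated arrays. */
for (i = 0; i < n; i++) {
    a[i] = a[i] + 1;
    b[i] = b[i] * 2;
}

/* After fission: each loop streams through a single array, which can
   improve data locality. */
for (i = 0; i < n; i++)
    a[i] = a[i] + 1;
for (i = 0; i < n; i++)
    b[i] = b[i] * 2;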
Loop fusion (or loop combining) is a loop transformation which replaces multiple loops with a single one. It does not always improve
run-time performance: on some architectures two loops may actually perform better than one, for example because of
increased data locality within each loop. In those cases, a single loop may instead be transformed into two.
Loop inversion is a loop transformation which replaces a while loop by an if block containing a do...while loop.
At first glance, this seems like a bad idea: there is more code, so it probably takes longer to execute.
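The loop being transformed is not shown above; a minimal C sketch, assuming a simple counted loop over an array a of 100 elements, is:

/* Before: a while loop, with the test at the top. */
i = 0;
while (i < 100) {
    a[i] = 0;
    i++;
}

/* After loop inversion: an if block containing a do...while loop.
   The body is unchanged, but the test now sits at the bottom. */
i = 0;
if (i < 100) {
    do {
        a[i] = 0;
        i++;
    } while (i < 100);
}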
However, most modern CPUs use a pipeline for executing instructions, and by its nature any jump in the code can cause
a pipeline stall. Let us watch what happens in an assembly-like three-address code version of the above code:
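The three-address listing itself is not reproduced here; the following sketch, written as C with explicit gotos, shows the jump structure the two forms lower to (the array a and the bound 100 are carried over from the sketch above):

/* while-loop version, lowered to explicit tests and jumps */
void zero_while(int a[100])
{
    int i = 0;
L1: if (i >= 100) goto done;   /* test at the top of every iteration        */
    a[i] = 0;
    i = i + 1;
    goto L1;                   /* unconditional jump back on every iteration */
done: ;
}

/* inverted (if + do...while) version */
void zero_inverted(int a[100])
{
    int i = 0;
    if (i >= 100) goto done;   /* guard executed only once                   */
L2: a[i] = 0;
    i = i + 1;
    if (i < 100) goto L2;      /* a single conditional jump per iteration    */
done: ;
}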
Consider first the while-loop version. On the iteration in which i is assigned the value 99, the body executes, the
unconditional jump back to the test is taken, and the test is evaluated once more for i equal to 100 before the loop
is finally exited. In the inverted version no cycles are wasted for i equal to 100: the bottom test simply fails and
execution falls out of the loop, and on the iteration in which i was assigned 99 only that single conditional jump is
executed. As you can see, two gotos (and thus two potential pipeline stalls) have been eliminated from the execution.
Loop interchange is the process of exchanging the order of two iteration variables. When the loop variables
index into an array then loop interchange can improve locality of reference, depending on the array’s layout.
One major purpose of loop interchange is to improve the cache performance for accessing array elements.
Cache misses occur if the contiguously accessed array elements within the loop come from a different cache
line. Loop interchange can help prevent this. The effectiveness of loop interchange depends on and must be
considered in light of the cache model used by the underlying hardware and the array model used by the
compiler. It is not always safe to exchange the iteration variables due to dependencies between statements for
the order in which they must execute. In order to determine whether a compiler can safely interchange loops,
dependence analysis is required.
Example 7.11:
do i = 1, 10000
do j = 1, 1000
a(i) = a(i) + b(i,j) * c(i)
end do
end do
Loop interchange on this example can improve the cache performance of accessing b(i,j), but it will ruin the
reuse of a(i) and c(i) in the inner loop, as it introduces two extra loads (for a(i) and for c(i)) and one extra store
(for a(i)) during each iteration. As a result, the overall performance may be degraded after loop interchange.
Loop nest optimization: ever-shrinking fabrication processes have made it possible to put very fast, fully pipelined floating-point units onto commodity
CPUs. But delivering that performance is also crucially dependent on compiler transformations that reduce
the need for a high-bandwidth memory system.
Example 7.12: Matrix Multiply
Many large mathematical operations on computers end up spending much of their time doing matrix
multiplication. Examining this loop nest can be quite instructive. The operation is:
C = A*B
where A, B, and C are NxN arrays. Subscripts, for the following description, are in the form C[row][column].
The basic loop is:
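The loop nest referred to is not printed above; the usual starting point, assuming row-major NxN arrays A, B and C declared elsewhere, is the naive triple loop:

for (i = 0; i < N; i++)
    for (j = 0; j < N; j++) {
        C[i][j] = 0;
        for (k = 0; k < N; k++)
            C[i][j] = C[i][j] + A[i][k] * B[k][j];
    }

Loop nest optimization then reorders and tiles (blocks) these loops so that the sub-blocks of A, B and C currently being worked on fit in the cache.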
Loop unwinding, also known as loop unrolling, is an optimization technique whose idea is to save time by
reducing the number of overhead instructions that the computer has to execute in a loop, thus improving the
cache hit rate and reducing branching. To achieve this, the instructions that are executed in several consecutive iterations of
the loop are combined into a single iteration. This will speed up the program if the overhead instructions of the
loop impair performance significantly.
The major side effects of loop unrolling are:
(a) the increased register usage in a single iteration to store temporary variables, which may hurt performance.
(b) the code size expansion after the unrolling, which is undesirable for embedded applications.
Example 7.13:
A procedure in a computer program needs to delete 100 items from a collection. This is accomplished by
means of a for-loop which calls the function. If this part of the program is to be optimized, and the overhead of
the loop requires significant resources, loop unwinding can be used to speed it up.
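The loop in question is not shown above; a sketch assuming a hypothetical function delete_item() and an index x is:

/* Original loop: 100 iterations, one test and branch per item. */
for (x = 0; x < 100; x++)
    delete_item(x);

/* Unrolled by a factor of 5: only 20 iterations remain, so there are
   1/5 as many compares and branches, at the cost of a larger body. */
for (x = 0; x < 100; x += 5) {
    delete_item(x);
    delete_item(x + 1);
    delete_item(x + 2);
    delete_item(x + 3);
    delete_item(x + 4);
}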
As a result of this optimization, the new program has to make only 20 loops, instead of 100. There are now
1/5 as many jumps and conditional branches that need to be taken, which over many iterations would be a great
improvement in the loop administration time while the loop unrolling makes the code size grow from 3 lines to
7 lines and the compiler has to allocate more registers to store variables in the expanded loop iteration.
Loop splitting attempts to simplify a loop, or eliminate dependencies, by breaking it into
multiple loops which have the same bodies but iterate over different contiguous portions of the index range. A
useful special case is loop peeling, which simplifies a loop with a problematic first iteration by performing
that iteration separately before entering the loop.
Example 7.14:
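The code for this example is not reproduced above; a minimal C sketch of loop peeling (the arrays x and y and the bounds are assumptions) is:

/* Before: the first iteration is a special case, because p starts at 10
   and thereafter always holds the previous value of i. */
p = 10;
for (i = 0; i < 10; ++i) {
    y[i] = x[i] + x[p];
    p = i;
}

/* After peeling the first iteration: in the remaining loop p is simply
   i - 1, so the dependence on the previous iteration disappears. */
y[0] = x[0] + x[10];
for (i = 1; i < 10; ++i)
    y[i] = x[i] + x[i - 1];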
Loop unswitching moves a conditional inside a loop outside of it by duplicating the loop’s body, and placing
a version of it inside each of the if and else clauses of the conditional. This can improve the parallelization of the
loop. Since modern processors can operate fast on vectors this increases the speed.
Example 7.15:
Suppose we want to add the two arrays x and y (vectors) and also do something depending on the variable
w. The conditional inside this loop makes it hard to safely parallelize this loop. After unswitching this becomes:
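The loop is not printed above; a sketch consistent with the description (arrays x and y, the bound 1000 and the flag w are assumptions) is:

/* Before: the test on w sits inside the loop. */
for (i = 0; i < 1000; i++) {
    x[i] = x[i] + y[i];
    if (w)
        y[i] = 0;
}

/* After unswitching: the loop body is duplicated and the test on w is
   performed once, outside; each copy is a simple, easily vectorized loop. */
if (w) {
    for (i = 0; i < 1000; i++) {
        x[i] = x[i] + y[i];
        y[i] = 0;
    }
} else {
    for (i = 0; i < 1000; i++)
        x[i] = x[i] + y[i];
}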
Each of these new loops can be separately optimized. Note that loop unswitching will double the amount
of code generated.
Data flow optimizations, based on Data flow analysis, primarily depend on how certain properties of data are
propagated by control edges in the control flow graph. Some of these include:
• Common Subexpression elimination
• Constant folding and propagation
• Aliasing.
Constant folding and constant propagation are related optimization techniques used by many modern compil-
ers. A more advanced form of constant propagation known as sparse conditional constant propagation may be
utilized to simultaneously remove dead code and more accurately propagate constants.
Constant folding is the process of simplifying constant expressions at compile time. Terms in constant expressions
are typically simple literals (such as integer values), variables whose values are never modified, or variables
explicitly marked as constant. Constant folding can be done in a compiler's front end on the intermediate representation
that represents the high-level source language, before it is translated into three-address code, or in the back
end. Consider the statement:
i = 320 * 200 * 32;
Most modern compilers would not actually generate two multiply instructions and a store for this statement.
Instead, they identify constructs such as these, and substitute the computed values at compile time (in this
case, 2,048,000), usually in the intermediate representation.
Constant propagation is the process of substituting the values of known constants in expressions at
compile time. A typical compiler might apply constant folding now, to simplify the resulting expressions, before
attempting further propagation.
Example 7.16:
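The starting code for this example is not shown above; a plausible sketch, chosen so that it simplifies to the fragment given below (all names and constants are assumptions), is:

/* Original code */
int a = 30;
int b = 9 - (a / 5);
int c;

c = b * 4;
if (c > 10) {
    c = c - 10;
}
return c * (60 / a);

/* After constant propagation and constant folding
   (a -> 30, b -> 3, b * 4 -> 12, c - 10 -> 2, 60 / a -> 2): */
int a = 30;
int b = 3;
int c;

c = 12;
if (12 > 10) {
    c = 2;
}
return c * 2;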
As a and b have been simplified to constants and their values substituted everywhere they occurred, the
compiler now applies dead code elimination to discard them, reducing the code further:
int c;
c = 12;
if (12 > 10) {
c = 2;
}
return c * 2;
Aliasing is a term that generally means that one variable or memory reference, when changed, has an indirect
effect on some other data. For example, in the presence of pointers it is difficult to make many optimizations at all,
since potentially any variable may have been changed when a memory location is assigned to.
Example 7.17:
(i) Array bounds checking: C programming language does not perform array bounds checking. If an
array is created on the stack, with a variable laid out in memory directly beside that array, one could
index outside that array and then directly change that variable by changing the relevant array
element. For example, if we have an int array of size ten (for this example's sake, calling it vector), next
to another int variable (call it i), vector[10] would be aliased to i if they are adjacent in memory.
This is possible in some implementations of C because an array is in reality a pointer to some location
in memory, and array elements are merely offsets off that memory location. Since C has no bounds
checking, indexing and addressing outside of the array is possible. Note that the aforementioned
aliasing behaviour is implementation specific. Some implementations may leave space between ar-
rays and variables on the stack, for instance, to minimize possible aliasing effects. C programming
language specifications do not specify how data is to be laid out in memory.
(ii) Aliased pointers: Another variety of aliasing can occur in any language that can refer to one location
in memory with more than one name. Consider, for example, the XOR swap algorithm written as a C function:
it assumes the two pointers passed to it are distinct, but if they are in fact equal (or aliases of each
other), the function fails. This is a common problem with functions that accept pointer arguments,
other), the function fails. This is a common problem with functions that accept pointer arguments,
and their tolerance (or the lack thereof) for aliasing must be carefully documented, particularly for
functions that perform complex manipulations on memory areas passed to them.
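A minimal C sketch of the XOR swap problem mentioned in (ii):

/* xor_swap assumes ia and ib point to distinct ints.  If the two pointers
   alias (ia == ib), the first statement XORs the value with itself and
   stores 0, so the final result is 0 rather than a swap. */
void xor_swap(int *ia, int *ib)
{
    *ia ^= *ib;
    *ib ^= *ia;
    *ia ^= *ib;
}

/* xor_swap(&x, &y) swaps x and y, but xor_swap(&x, &x) zeroes x. */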
Specified Aliasing
Controlled aliasing behaviour may be desirable in some cases. It is common practice in FORTRAN. The
Perl programming language specifies aliasing behaviour in some constructs, such as in foreach loops, which
allows certain data structures to be modified directly with less code. For example, a foreach loop over the list
(1, 2, 3) that increments its loop variable modifies the list in place and will print out "2 3 4" as a result. If one
wants to bypass the aliasing effect, one can copy the contents of the index variable into another variable and change the copy.
Conflicts With Optimization
(i) Many times optimizers have to make conservative assumptions about variables in the presence of
pointers. For example, a constant propagation process which knows that the value of variable x is 5
would not be able to keep using this information after an assignment to another variable (for example,
*y = 10) because it could be that *y is an alias of x. This could be the case after an assignment like y
= &x. As an effect of the assignment to *y, the value of x would be changed as well, so propagating
the information that x is 5 to the statements following *y = 10 would be potentially wrong. However,
if we have information about pointers, the constant propagation process could make a query like:
can x be an alias of *y? If the answer is no, x = 5 can be propagated safely (see the sketch after this list).
(ii) Another optimisation that is impacted by aliasing is code reordering; if the compiler decides that x is
not an alias of *y, then code that uses or changes the value of x can be moved before the assignment
*y = 10, if this would improve scheduling or enable more loop optimizations to be carried out. In order
to enable such optimisations to be carried out in a predictable manner, the C99 edition of the C
programming language specifies that it is illegal (with some exceptions) for pointers of different
types to reference the same memory location. This rule, known as “strict aliasing”, allows impressive
increases in performance, but has been known to break some legacy code.
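A minimal sketch of point (i) above (the function and variable names are hypothetical):

/* Whether the compiler may keep using the fact that x is 5 across the
   store *y = 10 depends entirely on whether y can point to x. */
int example(int *y)
{
    int x = 5;
    *y = 10;        /* if y == &x, then x is now 10, not 5                    */
    return x + 1;   /* may be folded to 6 only if y provably does not alias x */
}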
These optimizations are intended to be done after transforming the program into a special form called static
single assignment (SSA). Although some of them work without SSA, they are most effective with it. Compiler optimization
algorithms which are either enabled or strongly enhanced by the use of SSA include:
• Constant propagation
• Sparse conditional constant propagation
• Dead code elimination
• Global value numbering
• Partial redundancy elimination
• Strength reduction
SSA based optimization includes :
• Global value numbering
• Sparse conditional constant propagation
Global value numbering (GVN) is based on the SSA intermediate representation so that false variable-name to
value-number mappings are not created. It sometimes helps eliminate redundant code that common subexpression
elimination (CSE) does not, and it is often found in modern compilers. Global value numbering is distinct from
local value numbering in that the value-number mappings hold across basic block boundaries as well, and
different algorithms are used to compute the mappings.
Global value numbering works by assigning a value number to variables and expressions. To those
variables and expressions which are provably equivalent, the same value number is assigned.
Example 7.18:
The reason that GVN is sometimes more powerful than CSE comes from the fact that CSE matches lexically
identical expressions whereas the GVN tries to determine an underlying equivalence. For instance, in the code:
a := c × d
e := c
f := e × d
CSE would not eliminate the recomputation assigned to f, but even a poor GVN algorithm should discover
and eliminate this redundancy.
Sparse conditional constant propagation is an optimization frequently applied after conversion to static single
assignment form (SSA). It simultaneously removes dead code and propagates constants throughout a program,
and it is strictly more powerful than applying dead code elimination and constant propagation separately, in any order and any number of times.
Recursion is often expensive, as a function call consumes stack space and involves some overhead related to
parameter passing and flushing the instruction cache. Tail recursive algorithms can be converted to iteration,
which does not have call overhead and uses a constant amount of stack space, through a process called tail
recursion elimination.
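A minimal C sketch of tail recursion elimination (the function names are assumptions):

/* Tail-recursive form: the recursive call is the last thing the function does. */
int fact_acc(int n, int acc)
{
    if (n <= 1)
        return acc;
    return fact_acc(n - 1, n * acc);   /* tail call */
}

/* After tail recursion elimination the call becomes a jump, i.e. a loop
   that uses a constant amount of stack space. */
int fact_iter(int n)
{
    int acc = 1;
    while (n > 1) {
        acc = acc * n;
        n = n - 1;
    }
    return acc;
}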
Because of the high level nature by which data structures are specified in functional languages such as Haskell,
it is possible to combine several recursive functions which produce and consume some temporary data struc-
ture so that the data is passed directly without wasting time constructing the data structure.
Partial redundancy elimination (PRE) eliminates expressions that are redundant on some but not necessarily all
paths through a program. PRE is a form of common subexpression elimination.
An expression is called partially redundant if the value computed by the expression is already available on
some but not all paths through a program to that expression. An expression is fully redundant if the value
computed by the expression is available on all paths through the program to that expression. PRE can eliminate
partially redundant expressions by inserting the partially redundant expression on the paths that do not
already compute it, thereby making the partially redundant expression fully redundant.
Example 7.19:
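The code for this example is not reproduced above; a sketch consistent with the description that follows (x, y, z, t and some_condition are assumptions) is:

/* Before: x + 4 is computed on the some_condition path and again after
   the if, so the second computation is partially redundant. */
if (some_condition) {
    /* ... */
    y = x + 4;
}
z = x + 4;

/* After PRE: the expression is inserted on the path that did not compute
   it, making it fully redundant, and both uses read the temporary t. */
if (some_condition) {
    /* ... */
    t = x + 4;
    y = t;
} else {
    t = x + 4;
}
z = t;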
The expression x + 4 assigned to z is partially redundant because it is computed twice if some_condition is
true. PRE performs code motion on the expression to yield the optimized code shown above.
Rematerialization saves time by recomputing a value instead of loading it from memory. It is typically tightly
integrated with register allocation, where it is used as an alternative to spilling registers to memory. It was
conceived by Preston Briggs, Keith D. Cooper, and Linda Torczon in 1992. Rematerialization decreases register
pressure by increasing the amount of CPU computation. To avoid adding more computation time than necessary,
rematerialization is performed only when the compiler can be confident that it will be of benefit, i.e., when a register
spill to memory would otherwise occur.
Rematerialization works by keeping track of the expression used to compute each variable, using the
concept of available expressions. Sometimes the variables used to compute a value are modified, and so can no
longer be used to rematerialize that value. The expression is then said to no longer be available. Other criteria
must also be fulfilled, for example a maximum complexity on the expression used to rematerialize the value; it
would do no good to rematerialize a value using a complex computation that takes more time than a load.
Code hoisting finds expressions that are evaluated on every execution path following some point in a program,
regardless of which path is taken, and moves them up so that they are computed only once.
This primarily reduces the space occupied by the program.
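A minimal sketch of code hoisting (names are assumptions): the expression x * y is evaluated on every path through the if, so it can be hoisted above it and computed once, shrinking the code.

/* Before: x * y appears in both branches. */
if (a > 0)
    r = x * y + 1;
else
    r = x * y - 1;

/* After hoisting the expression that is evaluated on every path: */
t = x * y;
if (a > 0)
    r = t + 1;
else
    r = t - 1;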
EX.: Write three address code for the following code and then perform optimization technique
for(i=0;i<=n;i++)
{
for(j=0;j<n;j++)
{
c[i,j]=0 ;
}
}
for(i=0;i<=n;i++)
{
for(j=0;j<=n;j++)
{
for(k=0;k<=n;k++)
{
c[i,j]=c[i,j]+a[i,k]*b[k,j] ;
}
}
}
SOL.: First we give the three-address code for the program above; then we apply optimization.
Three Address Code
(1) i=0
(2) if i<n goto 4 (L)
(3) goto 18 (L)
(4) j=0 (L)
(5) if j<n goto 7 (L)
(6) goto 15 (L)
(7) t1=i*k1 (L)
(8) t1=t1+j
(9) t2=Q
(10) t3=4*t1
(11) t4=t2[t3]
(12) t5=j+1
(13) j=t5
(14) goto 5
(15) t6=i+1 (L)
(16) i=t6
(17) goto 2
(18) i=0 (L)
(19) if i<=n goto 21 (L)
(20) goto next (L)
(21) j=0
(22) if j<=n goto 24 (L)
(23) goto 52 (L)
(24) k=0 (L)
(25) if k<=n goto 27 (L)
(26) goto 49 (L)
(27) t7 =i*k1 (L)
(28) t7=t7+j
(29) t8=Q
(30) t9=4*t7
(31) t10=t8[t9]
(32) t11=i*k1
(33) t11=t11+k
(34) t12=Q
(35) t13=t11*4
(36) t14=t12[t13]
(37) t15=k+k1
(38) t15=t15+j
(39) t16=Q
(40) t17=4*t15
(41) t18=t16[t17]
(42) t19=t14*t18
(43) t20=t10+t19
(44) t10=t20
(45) t21=k+1
(46) k=t21
(47) goto 25
(48) t22=j+1 (L)
(49) j=t22
(50) goto 22
(51) t23=i+1 (L)
(52) i=t23
(53) goto 19
(next) ——————
Now we construct basic blocks for the given code:
[Figure: control-flow graph of the three-address code above, partitioned into basic blocks B1-B15. Blocks B1-B6 form the first loop nest (the initialization of c[i,j]) and blocks B7-B15 form the second loop nest (the multiplication), with the innermost k loop at the bottom of the graph.]
Now we apply optimization. Only blocks B8, B10 and B13 change: the first statement of B13 (t7 = i * k1) is
loop-invariant in the innermost loop and is moved up into block B8, the second statement (t7 = t7 + j) is moved into
block B10, and one redundant statement is deleted. After that the blocks look as follows:
[Figure: the basic-block graph after optimization. The loop-invariant computations t7 = i * k1 and t7 = t7 + j have been hoisted out of the innermost loop into B8 and B10 respectively, the redundant statement has been removed, and the remaining blocks are unchanged.]