
Unit-1 Introduction


Design and Analysis of Algorithms

Chapter-I: Introduction

Dr. Dwiti Krishna Bebarta

Text Books:
1. Ellis Horowitz, Sartaj Sahni, Sanguthevar Rajasekaran, “Fundamentals of
Computer Algorithms”, 2nd Edition, University Press.
2. Thomas H. Cormen et al., “Introduction to Algorithms”, PHI Learning.
UNIT I: Introduction to
Problem Solving Concepts
1. What is an Algorithm?
2. Algorithm specification
3. Performance analysis
4. Performance Measurement (Time and Space
Complexity)
5. Amortized complexity
6. Asymptotic notation
7. Practical Complexities
What is an Algorithm?

An algorithm is a procedure for solving a problem or performing a
computation: an exact list of instructions that carries out the
required actions step by step on a computer.
Problem Definition
What is the task to be accomplished?
Calculate the average of the grades for a given student
Understand the speeches given by politicians and
translate them into Chinese
What are the time / space / speed / performance
requirements?
Algorithm Definition
Algorithm: Finite set of instructions that, if followed,
accomplishes a particular task.
Describe: in natural language / pseudo-code / diagrams /
etc.
Criteria to follow:
Input: Zero or more quantities (externally produced)
Output: One or more quantities
Definiteness: Clarity, precision of each instruction
Finiteness: The algorithm has to stop after a finite (possibly
very large) number of steps
Effectiveness: Each instruction has to be basic enough
and feasible

Algorithm Specification

An algorithm can be specified in three ways:
Natural language, such as English
Graphical representations, such as flowcharts
Pseudo-code method
Pseudo-code Method
We present most of our algorithms using a pseudo-code that
resembles C and Pascal.
Comments begin with // and continue until the end of line.
Blocks are indicated with matching braces: { and }
A compound statement(i.e., a collection of simple
statements) can be represented as a block.
An identifier begins with a letter. The data types of variables
are not explicitly declared. The types will be clear from the
context. Whether a variable is global or local to a procedure
will also be evident from the context. We assume simple
data types such as integer, float, char, boolean, and so on.
Compound data types can be formed with records. Here is
an example:
node = record
{
    datatype_1 data_1;
    ...
    datatype_n data_n;
    node *link;
}
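For readers who prefer C, the record above corresponds roughly to a C
struct. A minimal sketch, assuming int placeholders for the unspecified
datatype_1 ... datatype_n:

/* Hypothetical C counterpart of the pseudo-code record. */
struct node {
    int data_1;           /* placeholder for datatype_1 */
    int data_n;           /* placeholder for datatype_n */
    struct node *link;    /* pointer to another node */
};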
Pseudo-code Method Cont.
Assignment of values to variables is done using the
assignment statement (variable):=(expression);
There are two boolean values true and false. In order to
produce these values, the logical operators and, or, and not
and the relational operators <, >, etc. are provided.
Elements of multidimensional arrays are accessed by index: for
example, if A is a two-dimensional array, the (i, j)th element of
the array is denoted A[i, j]. Array indices start at zero.
The following looping statements are employed: for, while,
and repeat-until. The while loop takes the following form:
while (condition) do
{
    (statement 1)
    ...
    (statement n)
}
Pseudo-code Method Cont.
for variable := value-1 to value-2 increment/decrement do
{
    (statement-1)
    ...
    (statement-n)
}
Here value-1, value-2, and the increment/decrement are arithmetic
expressions.

A repeat-until statement is constructed as follows:
repeat
{
    (statement-1)
    ...
    (statement-n)
} until (condition)
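As an illustration (not part of the pseudo-code notation itself), the three
looping statements map naturally onto C's while, for, and do-while loops.
A minimal sketch; the bodies just accumulate a sum as a stand-in for
(statement-1) ... (statement-n):

#include <stdio.h>

int main(void) {
    int sum = 0, i;

    /* while (condition) do { ... } */
    i = 1;
    while (i <= 5) {
        sum += i;
        i++;
    }

    /* for variable := value-1 to value-2 do { ... } */
    for (i = 1; i <= 5; i++)
        sum += i;

    /* repeat { ... } until (condition); "until (c)" becomes "while (!c)" */
    i = 1;
    do {
        sum += i;
        i++;
    } while (!(i > 5));

    printf("%d\n", sum);  /* prints 45 */
    return 0;
}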
Pseudo-code Method Cont.
A conditional statement has the following forms:
if (condition) then (statement)
if (condition) then (statement-1) else (statement-2)
We also employ the following case statement:
case
{
    :(condition-1): (statement-1)
    ...
    :(condition-n): (statement-n)
    :else: (statement-n+1)
}
Pseudo-code Method Cont.
As an example, the following algorithm finds and returns the
maximum of n given numbers:
Algorithm Max(A, n)
{
    // A is an array of size n.
    Result := A[1];
    for i := 2 to n do
        if A[i] > Result then
            Result := A[i];
    return Result;
}
In this algorithm (named Max), A and n are procedure
parameters; Result and i are local variables.
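A direct C rendering of Max may be helpful. This is a minimal sketch, not
from the textbook; it keeps the pseudo-code's 1-based indexing by leaving
A[0] unused:

#include <stdio.h>

/* Scans A[1..n] and returns the largest value, mirroring Algorithm Max. */
int max_element(const int A[], int n) {
    int result = A[1];            /* Result := A[1] */
    for (int i = 2; i <= n; i++)  /* for i := 2 to n do */
        if (A[i] > result)
            result = A[i];
    return result;
}

int main(void) {
    int A[] = {0, 7, 3, 9, 2};          /* A[0] unused; elements in A[1..4] */
    printf("%d\n", max_element(A, 4));  /* prints 9 */
    return 0;
}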
Performance Analysis
There are many criteria upon which we can judge an algorithm.
For instance:
Does it do what we want it to do?
Does it work correctly according to the original specifications of
the task?
Is there documentation that describes how to use it and how it
works?
Are procedures created in such a way that they perform logical
sub-functions?
Is the code readable?
Performance analysis can be loosely divided into two major phases:
(1) a priori estimates
(2) a posteriori testing.
We refer to these as performance analysis and performance
measurement, respectively.
3. Algorithm Analysis

Space complexity
How much space is required?
Time complexity
How much time does it take to run the algorithm?
Often, we deal with estimates!
Space Complexity

Space complexity = the amount of memory required
by an algorithm to run to completion.
[Core dumps: the most often encountered cause is a
“memory leak” – the amount of memory required grows larger
than the memory available on a given system.]
Some algorithms may be more efficient if the data are
completely loaded into memory.
We also need to look at system limitations.
E.g., classify 2 GB of text into various categories [politics,
tourism, sport, natural disasters, etc.] – can I afford to load
the entire collection?
Space Complexity (cont’d)
1. Fixed part: the space required to store certain
data/variables that is independent of the size of the
problem.
2. Variable part: the space needed by variables whose size
depends on the size of the problem:
- e.g., the actual text
- loading 2 GB of text vs. loading 1 MB of text
Space Complexity (cont’d)
S(P) = c + S(instance characteristics)
c = constant
Example:
int square(int a) { return a*a; }
In the above piece of code, 2 bytes of memory are required to
store the variable 'a' and another 2 bytes are used for the
return value.
That means it requires a total of 4 bytes of memory to
complete its execution, and these 4 bytes are fixed for any
input value of 'a'. This space complexity is said to be
Constant Space Complexity.
Example: 1
Algorithm abc(a, b, c)
{
    return a + b + b*c + (a + b - c)/(a + b) + 4.0;
}
The problem instance is characterized by the specific values
of a, b, and c.
The space needed for a, b, and c is independent of their
values: one word each.
Therefore, S(instance characteristics) = 0 and c = 3.
S(P) = c + S(instance characteristics) = 3 + 0 = 3
Space Complexity (cont’d)
S(P) = c + S(instance characteristics)
c = constant
Example: 2
int sum(int A[ ], int n)
{
    int sum = 0, i;
    for (i = 0; i < n; i++)
        sum = sum + A[i];
    return sum;
}
In the above piece of code, it requires:
2n bytes of memory to store the array parameter A[ ] (n elements, 2 bytes each)
2 bytes of memory for the integer parameter 'n'
4 bytes of memory for the local integer variables 'sum' and 'i' (2 bytes each)
2 bytes of memory for the return value.
That means it requires a total of 2n + 8 bytes of memory to complete its
execution. Here, the total amount of memory required depends on the
value of 'n'. As the value of 'n' increases, the space required also increases
proportionately. This type of space complexity is said to be Linear Space
Complexity.
Example: 3
Algorithm R-sum(A, n)
{
    if (n ≤ 0) then return 0;
    else return R-sum(A, n-1) + A[n];
}
The problem instances are characterized by n. The recursion
stack space includes space for the formal parameters, the
local variables, and the return address.
Each call to R-sum requires at least three words, including space
for the value of n, the return address, and a pointer to A[ ].
Since the depth of recursion is n + 1, the recursion stack space
needed is ≥ 3(n + 1).
S(R-sum) ≥ 3(n + 1)
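A C version of R-sum makes the recursion-stack cost concrete: each active
call keeps its own copy of n, the return address, and the pointer to A on
the stack, so the stack grows linearly with n. A minimal sketch, not from
the textbook:

#include <stdio.h>

/* Recursive sum of A[1..n], mirroring Algorithm R-sum. */
int r_sum(const int A[], int n) {
    if (n <= 0)
        return 0;
    return r_sum(A, n - 1) + A[n];  /* one stack frame per value of n */
}

int main(void) {
    int A[] = {0, 1, 2, 3, 4};    /* A[0] unused; elements in A[1..4] */
    printf("%d\n", r_sum(A, 4));  /* prints 10 */
    return 0;
}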
Time Complexity
Most algorithms transform input objects into output
objects.
The running time of an algorithm typically grows with the
input size.
Average case time is often difficult to determine.
We focus on the worst case running time.
[Figure: running time versus input size (1000–4000), showing best-case,
average-case, and worst-case curves.]
Use a Theoretical Approach
Based on a high-level description of the algorithm, rather
than on language-dependent implementations
Makes possible an evaluation of the algorithms that is
independent of the hardware and software environments
Pseudo-Code = a description of an algorithm that is
more structured than usual prose but
less formal than a programming language
Example: find the maximum element of an array.
Algorithm arrayMax(A, n):
Input: An array A storing n integers.
Output: The maximum element in A.
currentMax ← A[0]
for i ← 1 to n-1 do
    if currentMax < A[i] then currentMax ← A[i]
return currentMax
Counting Primitive Operations
Example: find the maximum element of an array.
Algorithm arrayMax(A, n)                          No. of Operations
//Input: An array A storing n integers.                  0
//Output: The maximum element in A.                      0
currentMax ← A[0]                                        1 time
for i ← 1 to n-1 do                                      n times
    if currentMax < A[i] then                            (n-1) times
        currentMax ← A[i]                                (n-1) times
return currentMax                                        1 time
                                                  Total: 3n times
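The 3n tally can be checked empirically by instrumenting a C version of
arrayMax with an operation counter. A minimal sketch; the ops counter and
the charging scheme (the assignment is charged every iteration, as in the
worst case) are illustrative additions:

#include <stdio.h>

/* arrayMax with a counter reproducing the worst-case tally:
   1 + n + (n-1) + (n-1) + 1 = 3n primitive operations. */
int array_max(const int A[], int n, long *ops) {
    int currentMax = A[0];
    *ops += 1;                      /* currentMax <- A[0] */
    for (int i = 1; i <= n - 1; i++) {
        *ops += 1;                  /* loop test that succeeds: n-1 times */
        *ops += 1;                  /* comparison currentMax < A[i] */
        if (currentMax < A[i])
            currentMax = A[i];
        *ops += 1;                  /* assignment, charged each iteration (worst case) */
    }
    *ops += 2;                      /* final failing loop test + return */
    return currentMax;
}

int main(void) {
    int A[] = {3, 1, 4, 1, 5};
    long ops = 0;
    array_max(A, 5, &ops);
    printf("n=5, ops=%ld\n", ops);  /* prints ops=15, i.e. 3n */
    return 0;
}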
Estimating Running Time

Algorithm arrayMax executes 3n primitive operations in
the worst case. Define:
a = time taken by the fastest primitive operation
b = time taken by the slowest primitive operation
Let T(n) be the worst-case time of arrayMax. Then
a · 3n ≤ T(n) ≤ b · 3n
Hence, the running time T(n) is bounded by two linear
functions.
Growth rate of running time: the linear growth rate of
the running time T(n) is an intrinsic property of algorithm
arrayMax.
Seven functions that often appear in algorithm analysis:
Constant ≈ 1
Logarithmic ≈ log n
Linear ≈ n
N-Log-N ≈ n log n
Quadratic ≈ n²
Cubic ≈ n³
Exponential ≈ 2ⁿ
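A short program makes the gap between these growth rates tangible. A
minimal sketch (not from the slides) that tabulates the seven functions
for a few input sizes; compile with -lm:

#include <stdio.h>
#include <math.h>

int main(void) {
    int sizes[] = {10, 100, 1000};
    printf("%6s %8s %8s %10s %10s %12s %10s\n",
           "n", "log n", "n", "n log n", "n^2", "n^3", "2^n");
    for (int k = 0; k < 3; k++) {
        double n = sizes[k];
        printf("%6.0f %8.1f %8.0f %10.1f %10.0f %12.0f %10.3g\n",
               n, log2(n), n, n * log2(n), n * n, n * n * n, pow(2.0, n));
    }
    return 0;
}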
Asymptotic Complexity
Running time of an algorithm as a function of input size n,
for large n.
Expressed using only the highest-order term in the
expression for the exact running time.
Written using the asymptotic notations Θ, O, Ω, o, ω.
The notations describe different rate-of-growth relations
between the defining function and the defined set of
functions.
Theta (Θ)-notation
For a function g(n), we define Θ(g(n)),
big-Theta of g(n), as the set:

Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0
such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 }

Here f(n) and g(n) are nonnegative for large n.

Example:
10n² − 3n = Θ(n²)
What constants n0, c1, and c2 will work?
Make c1 a little smaller than the leading coefficient, and c2 a
little bigger.
To compare orders of growth, look at the leading term.
Solution: we need c1·n² ≤ 10n² − 3n ≤ c2·n²,
i.e., c1 ≤ 10 − 3/n ≤ c2.
For n0 = 1, c1 ≤ 7 and c2 ≥ 10 work.
Exercise: prove that n²/2 − 3n = Θ(n²).
Here c1·n² ≤ n²/2 − 3n ≤ c2·n² gives c1 ≤ 1/2 − 3/n ≤ c2;
for n0 = 7, c1 ≤ 1/14 and c2 ≥ 1/2 work.
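These constants can be sanity-checked numerically. A minimal sketch (an
illustrative addition) verifying c1·n² ≤ 10n² − 3n ≤ c2·n² with c1 = 7,
c2 = 10 for n ≥ 1:

#include <stdio.h>

int main(void) {
    const double c1 = 7.0, c2 = 10.0;
    int ok = 1;
    for (long n = 1; n <= 1000000; n++) {
        double f = 10.0 * n * n - 3.0 * n;  /* f(n) = 10n^2 - 3n */
        double g = (double)n * n;           /* g(n) = n^2 */
        if (!(c1 * g <= f && f <= c2 * g)) { ok = 0; break; }
    }
    printf(ok ? "bounds hold for all n tested\n" : "bounds violated\n");
    return 0;
}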
Big-Oh (O)-notation
For a function having only an
asymptotic upper bound, the big-Oh
'O' notation is used. For a given
function g(n), O(g(n)) is the set
of functions f(n) defined as
O(g(n)) = { f(n) : there exist
positive constants c and n0 such
that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 }

g(n) is an asymptotic upper bound for f(n).

Show that 10n² + 4n + 2 = O(n²):
We need 10n² + 4n + 2 ≤ c·n² for some c and all n ≥ n0.
Since 4n + 2 ≤ n² for all n ≥ 5,
10n² + 4n + 2 ≤ 10n² + n² = 11n² for all n ≥ 5.
(Check: for n = 1, 16 ≤ 11 is not true; for n = 5, 272 ≤ 275 is true.)
So c = 11 and n0 = 5 give 10n² + 4n + 2 = O(n²).

Show that 6·2ⁿ + n² = O(2ⁿ):
We need 6·2ⁿ + n² ≤ c·2ⁿ for some c and all n ≥ n0.
Since n² ≤ 2ⁿ for all n ≥ 4,
6·2ⁿ + n² ≤ 6·2ⁿ + 2ⁿ = 7·2ⁿ for all n ≥ 4.
(Check: for n = 3, 57 ≤ 56 is not true; for n = 4, 112 ≤ 112 is true.)
So c = 7 and n0 = 4 give 6·2ⁿ + n² = O(2ⁿ).
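Since the inequality 6·2ⁿ + n² ≤ 7·2ⁿ holds at n = 2 but fails at n = 3,
checking small n explicitly is worthwhile; a minimal sketch:

#include <stdio.h>

/* Tabulate 6*2^n + n^2 against 7*2^n for small n:
   holds at n = 1 and 2, fails at n = 3, holds from n = 4 on. */
int main(void) {
    for (int n = 1; n <= 10; n++) {
        long p = 1L << n;               /* 2^n */
        long f = 6 * p + (long)n * n;   /* f(n) = 6*2^n + n^2 */
        printf("n=%2d: %5ld <= %5ld ? %s\n",
               n, f, 7 * p, f <= 7 * p ? "yes" : "no");
    }
    return 0;
}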
Big Omega (Ω)-Notation
For a function having only an
asymptotic lower bound, the Ω
notation is used. For a given
function g(n), Ω(g(n)) is the set
of functions f(n) defined as
Ω(g(n)) = { f(n) : there exist
positive constants c and n0 such
that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0 }
g(n) is an asymptotic lower bound for f(n).
Example: f(n) = 3n² + n.
f(n) ≥ c·g(n): 3n² + n ≥ 3n², so f(n) = Ω(n²)
with c = 3 and n0 = 1 (the bound holds for all n ≥ 1).


Definition (Little-o, o()): Let f(n) and g(n) be functions that map
positive integers to positive real numbers. We say that f(n) is o(g(n))
(or f(n) ∈ o(g(n))) if for every real constant c > 0, there exists an integer
constant n0 ≥ 1 such that f(n) < c·g(n) for every integer n ≥ n0.

Note: little-o notation is used to describe an upper bound that is not
tight; in other words, a loose upper bound on f(n).

Example: if f(n) = n² and g(n) = n³, check whether f(n) = o(g(n)):
for any c > 0, n² < c·n³ whenever n > 1/c, so taking n0 > 1/c shows
that f(n) = o(g(n)).
Definition (Little-omega, ω()): Let f(n) and g(n) be functions that
map positive integers to positive real numbers. We say that f(n) is
ω(g(n)) (or f(n) ∈ ω(g(n))) if for every real constant c > 0, there exists an
integer constant n0 ≥ 1 such that f(n) > c·g(n) for every integer n ≥ n0.

Examples
• 2 ≠ ω(1)
• 4x + 2 ≠ ω(x)
• 4x + 2 = ω(1)
• 3x² + 4x + 2 ≠ ω(x²)
• 3x² + 4x + 2 = ω(x)
Amortized complexity
• Amortized Analysis is used for algorithms where an occasional
operation is very slow, but most of the other operations are faster.
• In Amortized Analysis, we analyze a sequence of operations and
guarantee a worst-case average time that is lower than the worst-
case time of a particularly expensive operation.
• The example data structures whose operations are analyzed using
Amortized Analysis are Hash Tables, Disjoint Sets, and Splay
Trees.
• Amortized analysis is a technique used in computer science to
analyze the average-case time complexity of algorithms that perform
a sequence of operations, where some operations may be more
expensive than others. The idea is to spread the cost of these
expensive operations over multiple operations, so that the average
cost of each operation is constant or less.
Amortized analysis is useful for designing efficient algorithms for data
structures such as dynamic arrays, priority queues, and disjoint-set data
structures. It guarantees the average-case cost per operation over a
sequence, even if some individual operations are expensive.

Let us consider the example of simple hash-table insertions. There is a
trade-off between space and time: if we make the hash table large, the
search time becomes low, but the space required becomes high.
The solution to this trade-off is to use a dynamic table (or array).
The idea is to increase the size of the table whenever it becomes full.
The following steps are taken when the table becomes full:
1) Allocate memory for a larger table, typically twice the size of the old one.
2) Copy the contents of the old table to the new table.
3) Free the old table.
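A minimal C sketch of such a dynamic table with the doubling strategy above
(the names DynTable and dyn_push are illustrative, not from the text). The
occasional O(n) copy is spread over the n cheap appends that preceded it,
giving O(1) amortized cost per insertion:

#include <stdlib.h>
#include <stdio.h>

typedef struct {
    int *data;
    int size;       /* elements stored */
    int capacity;   /* slots allocated */
} DynTable;

void dyn_push(DynTable *t, int value) {
    if (t->size == t->capacity) {                        /* table is full */
        int newcap = t->capacity ? 2 * t->capacity : 1;  /* 1) twice the old size */
        int *bigger = malloc(newcap * sizeof(int));
        for (int i = 0; i < t->size; i++)                /* 2) copy old contents */
            bigger[i] = t->data[i];
        free(t->data);                                   /* 3) free the old table */
        t->data = bigger;
        t->capacity = newcap;
    }
    t->data[t->size++] = value;
}

int main(void) {
    DynTable t = {NULL, 0, 0};
    for (int i = 0; i < 100; i++)
        dyn_push(&t, i);
    printf("size=%d capacity=%d\n", t.size, t.capacity);  /* size=100 capacity=128 */
    free(t.data);
    return 0;
}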
Practical Complexities
Complexity can take many forms: constant, logarithmic,
linear, n log n, quadratic, cubic, exponential, etc.
Constant complexity: it imposes a complexity of O(1).
Logarithmic complexity: it imposes a complexity of O(log n).
Linear complexity: it imposes a complexity of O(n).
N-log-N complexity: it imposes a run time of O(n log n).
Quadratic complexity: it imposes a complexity of O(n²).
Cubic complexity: it imposes a complexity of O(n³).
Exponential complexity: it imposes a complexity of O(2ⁿ), O(n!),
O(kⁿ), .... For n elements, the algorithm executes a number of
operations that grows exponentially with the input size.
For example, if n = 10, the exponential function 2ⁿ results in
1024. Similarly, if n = 20, it results in 1,048,576, and if n = 100, it
results in a number with more than 30 digits. The function n! grows
even faster: for example, n = 5 results in 120. Likewise, n = 10
results in 3,628,800, and so on.
Step-count table for Algorithm R-sum:

Statement                             Frequency       Total steps
                                      n=0   n>0       n=0   n>0
if (n ≤ 0) then                        1     1         1     1
    return 0                           1     0         1     0
else return R-sum(A, n-1) + A[n]       0     1         0     1 + T(n-1)
Total                                                  2     2 + T(n-1)
