DAA Unit-1: Introduction To Algorithms
Introduction to Algorithms:
What is an Algorithm? Algorithm Basics
The word Algorithm means "a process or set of rules to be followed in
calculations or other problem-solving operations". Therefore, Algorithm refers
to a set of rules/instructions that define, step by step, how a task is to be
executed in order to obtain the expected results.
Output:
30 is present at index 2
Worst Case Analysis (Usually Done):
In worst case analysis, we calculate an upper bound on the running time of an algorithm. We
must know the case that causes the maximum number of operations to be executed.
For linear search, the worst case happens when the element to be searched (x in the above
code) is not present in the array.
When x is not present, the search() function compares it with all the elements of arr[] one by
one. Therefore, the worst case time complexity of linear search is Θ(n).
Average Case Analysis (Sometimes done):
In average case analysis, we take all possible inputs and calculate the computing time for
each of them.
We then sum all the calculated values and divide the sum by the total number of inputs. We
must know (or predict) the distribution of cases.
For the linear search problem, let us assume that all cases are uniformly distributed (including
the case of x not being present in the array). So we sum the costs of all the cases and divide
the sum by (n+1). Following is the value of the average case time complexity.
Average Case Time = [ sum over i = 1 to (n+1) of θ(i) ] / (n+1)
                  = θ( (n+1)(n+2)/2 ) / (n+1)
                  = θ(n)
Best Case Analysis (Bogus):
In best case analysis, we calculate a lower bound on the running time of an algorithm. We must
know the case that causes the minimum number of operations to be executed. In the linear search
problem, the best case occurs when x is present at the first location.
The number of operations in the best case is constant (not dependent on n), so the time
complexity in the best case is Θ(1).
Most of the time, we do worst case analysis to analyze algorithms. In worst case analysis, we
guarantee an upper bound on the running time of an algorithm, which is useful information.
Average case analysis is not easy to do in most practical cases and is rarely done, because
in average case analysis we must know (or predict) the mathematical distribution of all
possible inputs.
Best case analysis is bogus. Guaranteeing a lower bound on an algorithm doesn't provide
any useful information, as in the worst case the algorithm may still take years to run.
For some algorithms, all the cases are asymptotically the same, i.e., there are no separate worst
and best cases. For example, Merge Sort does Θ(nLogn) operations in all cases.
Most other sorting algorithms have distinct worst and best cases. For example, in the typical
implementation of Quick Sort (where the pivot is chosen as a corner element), the worst case occurs
when the input array is already sorted, and the best case occurs when the pivot always
divides the array into two halves. For Insertion Sort, the worst case occurs when the array is reverse
sorted, and the best case occurs when the array is already sorted in the same order as the output.
Performance measurement of algorithms:
Performance analysis of an algorithm depends upon two factors: the amount of memory used and
the amount of compute time consumed on any CPU. Formally, these are expressed as complexities in
terms of:
Space Complexity.
Time Complexity.
Space Complexity of an algorithm is the amount of memory it needs to run to completion, i.e., from
the start of execution to its termination. The space needed by any algorithm is the sum of the following
components:
1. Fixed Component: This is independent of the characteristics of the inputs and outputs. This
part includes the instruction space, the space for simple variables, fixed-size component variables,
and constants.
2. Variable Component: This consists of the space needed by component variables whose size is
dependent on the particular problem instances (inputs/outputs) being solved, the space
needed by referenced variables, and the recursion stack space, which is one of the most prominent
components. This also includes data structure components like linked lists, heaps, trees,
graphs, etc.
Therefore, the total space requirement of any algorithm 'A' is given as:
Space(A) = Fixed Components(A) + Variable Components(A)
Of the fixed and variable components, it is the variable part that must be determined
accurately, so that the actual space requirement can be identified for an algorithm 'A'. To
identify the space complexity of any algorithm, the following steps can be followed:
1. Determine the variables which are instantiated with some default values.
2. Determine which instance characteristics should be used to measure the space
requirement; this will be problem specific.
3. Generally the choices are limited to quantities related to the number and magnitudes
of the inputs to and outputs from the algorithm.
4. Sometimes more complex measures of the interrelationships among the data items can
be used.
Example: Space Complexity
Algorithm Sum(number, size) // procedure will produce the sum of all numbers provided in
                            // the 'number' list
{
    result = 0.0;
    for count = 1 to size do  // will repeat 1, 2, 3, 4, ..., size times
        result = result + number[count];
    return result;
}
In the above example, when calculating the space complexity we look for both fixed
and variable components. Here we have:
Fixed components: the 'result', 'count' and 'size' variables, therefore the fixed space required is
three (3) words.
Variable component: this is characterized by the value stored in the 'size' variable (suppose the
value stored in 'size' is 'n'), because this decides the size of the 'number' list and also
drives the for loop. Therefore the total space required by the 'number' variable is 'n' words
(the value stored in 'size').
Therefore the space complexity can be written as: Space(Sum) = 3 + n.
Time Complexity:
The time complexity of an algorithm (basically, when converted to a program) is the amount of
computer time it needs to run to completion.
The time taken by a program is the sum of the compile time and the run/execution time.
The compile time is independent of the instance (problem-specific) characteristics. The following
factors affect the time complexity:
1. Characteristics of the compiler used to compile the program.
2. The computer machine on which the program is executed and physically clocked.
3. Multiuser execution system.
4. Number of program steps.
Therefore the time complexity again consists of two components, fixed (factor 1 only) and
variable/instance (factors 2, 3 & 4), so for any algorithm 'A' it is given as:
Time(A) = Fixed Time(A) + Instance Time(A)
Here the number of steps is the most prominent instance characteristic, and the number of
steps assigned to any program statement depends on the kind of statement, for example:
comments count as zero steps,
an assignment statement which does not involve any calls to other algorithms is
counted as one step,
for iterative statements we consider the step count only for the control part of the
statement, etc.
Therefore, to calculate the total number of program steps we use the following procedure:
we build a table in which we list the total number of steps contributed by each
statement. This is often arrived at by first determining the number of steps per execution of
the statement and the frequency with which each statement is executed. This procedure is explained
using an example.
Example: Time Complexity (step table for Algorithm Sum above)

Statement                             Steps/execution   Frequency    Total steps
--------------------------------------------------------------------------------
Algorithm Sum(number, size)                  0              -             0
{                                            0              -             0
  result = 0.0;                              1              1             1
  for count = 1 to size do                   1           size + 1      size + 1
    result = result + number[count];         1           size          size
  return result;                             1              1             1
}                                            0              -             0
--------------------------------------------------------------------------------
Total                                                                 2size + 3
In the above example, if you analyze carefully, the frequency of "for count = 1 to size do" is 'size + 1'.
This is because the statement is executed one extra time, due to the condition check that
evaluates to false and terminates the loop. Once the total steps are calculated, they
represent the instance characteristics in the time complexity of the algorithm. Also, the
compile time of an algorithm will be the same every time we compile the same set of
instructions, so we can treat this time as a constant 'C'. Therefore the time complexity can
be expressed as: Time(Sum) = C + (2size + 3).
In this way both the space complexity and the time complexity can be calculated.
The combination of both complexities comprises the performance analysis of an algorithm, and
neither should be used independently. Both complexities also help in defining the parameters on
the basis of which we optimize algorithms.
Output:
5
Time Complexity: O(2^N)
Auxiliary Space: O(1)
Explanation: The time complexity of the above implementation is exponential due to repeated
calculation of the same subproblems. The auxiliary space used is minimal. But our
goal is to reduce the time complexity of the approach, even if it requires extra space. The
optimized approach is discussed below.
Efficient Approach: To optimize the above approach, the idea is to use Dynamic Programming to
reduce the complexity by memoizing the overlapping subproblems, so that each subproblem is
computed only once and its result is reused.
Output:
5
Time Complexity: O(N)
Auxiliary Space: O(N)
Explanation: The time complexity of the above implementation is linear, achieved by using auxiliary
space to store the states of the overlapping subproblems so they can be reused when required.
Analysis of recursive algorithms through recurrence
relations:
In the previous section, we discussed the analysis of loops. Many algorithms are recursive in nature. When
we analyze them, we get a recurrence relation for the time complexity: we express the running time on an input
of size n as a function of n and of the running time on inputs of smaller sizes. For example, in Merge
Sort, to sort a given array, we divide it into two halves, recursively repeat the process for the two
halves, and finally merge the results. The time complexity of Merge Sort can be written as T(n) = 2T(n/2)
+ cn. Similar recurrences arise for many other algorithms, like Binary Search, Tower of Hanoi, etc.
There are mainly three ways of solving recurrences.
1) Substitution Method: We make a guess for the solution and then use mathematical induction
to prove whether the guess is correct.
For example, consider the recurrence T(n) = 2T(n/2) + n.
We guess that T(n) <= cnLogn (Log taken to base 2), and assume it is true
for values smaller than n. Then:
T(n) = 2T(n/2) + n
    <= 2(c(n/2)Log(n/2)) + n
     = cnLogn - cnLog2 + n
     = cnLogn - cn + n
    <= cnLogn           (for c >= 1)
2) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate the time taken
by every level of the tree. Finally, we sum the work done at all levels. To draw the recurrence tree, we
start from the given recurrence and keep expanding until we find a pattern among the levels. The pattern is
typically an arithmetic or geometric series.
For example consider the recurrence relation
T(n) = T(n/4) + T(n/2) + cn^2

                 cn^2
               /      \
         T(n/4)        T(n/2)

                 cn^2
               /      \
       c(n^2)/16      c(n^2)/4
        /     \        /     \
   T(n/16)  T(n/8)  T(n/8)  T(n/4)

Breaking it down further gives us the following:

                      cn^2
                   /        \
          c(n^2)/16          c(n^2)/4
          /       \           /      \
  c(n^2)/256  c(n^2)/64  c(n^2)/64  c(n^2)/16
    /   \       /   \      /   \      /   \
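The notes break off here; for completeness, summing the levels of the tree above gives a geometric series (this concluding step is the standard continuation of this example, sketched here rather than taken from the original excerpt). Level 0 costs cn^2, level 1 costs cn^2/16 + cn^2/4 = (5/16)cn^2, and each further level shrinks by the same 5/16 factor:

```latex
T(n) = cn^2 \left( 1 + \tfrac{5}{16} + \left(\tfrac{5}{16}\right)^2 + \cdots \right)
     \le cn^2 \cdot \frac{1}{1 - \tfrac{5}{16}}
     = \tfrac{16}{11}\, c n^2
     = O(n^2)
```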