DAA Unit - 1 Brief Notes
UNIT-1
Syllabus
Introduction: Algorithms, Analyzing Algorithms, Complexity of Algorithms, Growth of Functions, Performance
Measurements, Sorting and Order Statistics - Shell Sort, Quick Sort, Merge Sort, Heap Sort, Comparison of Sorting
Algorithms, Sorting in Linear Time.
ALGORITHMS
Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values,
as input and produces some value, or set of values, as output. An algorithm is thus a sequence of
computational steps that transform the input into the output.
Input: there are zero or more quantities, which are externally supplied;
Output: at least one quantity is produced;
Definiteness: each instruction must be clear and unambiguous;
Finiteness: if we trace out the instructions of an algorithm, then for all cases the algorithm will
terminate after a finite number of steps;
Effectiveness: every instruction must be sufficiently basic that it can in principle be carried out by a
person using only pencil and paper. It is not enough that each operation be definite; it must also be
feasible.
Algorithms as a technology
Suppose computers were infinitely fast and computer memory was free. Would you have any reason to study
algorithms? The answer is yes, if for no other reason than that you would still like to demonstrate that your
solution method terminates and does so with the correct answer.
If computers were infinitely fast, any correct method for solving a problem would do. You would probably
want your implementation to be within the bounds of good software engineering practice (i.e., well designed
and documented), but you would most often use whichever method was the easiest to implement.
Of course, computers may be fast, but they are not infinitely fast. And memory may be cheap, but it is not
free. Computing time is therefore a bounded resource, and so is space in memory. These resources should be
used wisely, and algorithms that are efficient in terms of time or space will help you do so.
ALGORITHM ANALYSIS
Analysis of an algorithm is the process of evaluating the problem-solving capability of the algorithm in
terms of the time and space it requires (the amount of memory needed for storage during execution).
However, the main concern of algorithm analysis is the required time, or performance. Generally, we
perform the following types of analysis:
Time Complexity:
The time needed by an algorithm expressed as a function of the size of a problem is called the time
complexity of the algorithm. The time complexity of a program is the amount of computer time it needs to run
to completion.
Space Complexity:
The space complexity of a program is the amount of memory it needs to run to completion. The space needed
by a program has the following components:
Instruction space: Instruction space is the space needed to store the compiled version of the program
instructions.
Data space: Data space is the space needed to store all constant and variable values. Data space has
two components:
Space needed by constants and simple variables in the program.
Space needed by dynamically allocated objects such as arrays and class instances.
Environment stack space: The environment stack is used to save information needed to resume
execution of partially completed functions.
Execution time of an algorithm depends on the instruction set, processor speed, disk I/O speed, etc. Hence, we
estimate the efficiency of an algorithm asymptotically. The time function of an algorithm is represented by
T(n), where n is the input size.
Different types of asymptotic notations are used to represent the complexity of an algorithm. Following
asymptotic notations are used to calculate the running time complexity of an algorithm.
a) O: Big Oh
b) Ω: Big Omega
c) Θ: Big Theta
'O' (Big Oh) is the most commonly used notation. A function f(n) can be represented as being of the order of
g(n), that is O(g(n)), if there exist a positive integer n0 and a positive constant c such that:
f(n) ≤ c · g(n) for all n ≥ n0
Hence, function g(n) is an upper bound for function f(n), as g(n) grows at least as fast as f(n).
We say that f(n) = Ω(g(n)) when there exists a positive constant c such that f(n) ≥ c · g(n) for all sufficiently
large values of n. Here n is a positive integer. It means function g is a lower bound for function f; after a
certain value of n, f will never go below g.
We say that f(n) = Θ(g(n)) when there exist positive constants c1 and c2 such that c1 · g(n) ≤ f(n) ≤ c2 · g(n)
for all sufficiently large values of n.
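As an illustrative example (not from the original notes), take f(n) = 3n² + 2n with g(n) = n²:
3n² + 2n ≤ 5n² for all n ≥ 1, so f(n) = O(n²) with c = 5 and n0 = 1;
3n² + 2n ≥ 3n² for all n ≥ 1, so f(n) = Ω(n²) with c = 3;
since both bounds hold, f(n) = Θ(n²) with c1 = 3 and c2 = 5.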
Some Well-Known Complexities Arranged in Ascending Order, with Their Description
Complexity   Description
1            Most instructions of most programs are executed once or at most only a few times.
log n        When the running time of a program is logarithmic, the program gets slightly slower as n grows.
             This running time commonly occurs in programs that solve a big problem by transforming it
             into a smaller problem, cutting the size by some constant fraction.
n            When the running time of a program is linear, it is generally the case that a small amount of
             processing is done on each input element. This is the optimal situation for an algorithm that
             must process n inputs.
n log n      This running time arises for algorithms that solve a problem by breaking it up into smaller
             subproblems, solving them independently, and then combining the solutions.
n²           When the running time of an algorithm is quadratic, it is practical for use only on relatively
             small problems.
n³           Similarly, an algorithm that processes triples of data items (perhaps in a triple-nested loop) has
             a cubic running time and is practical for use only on small problems.
2ⁿ           Few algorithms with exponential running time are likely to be appropriate for practical use;
             such algorithms arise naturally as "brute-force" solutions to problems.
Let us form an algorithm for insertion sort (which sorts a sequence of numbers). The pseudocode for the
algorithm is given below.
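The following is the standard CLRS-style pseudocode for the procedure (a reconstruction; the line
numbering is the one assumed by the analysis that follows):

INSERTION-SORT(A)
1 for j ← 2 to length[A]
2     do key ← A[j]
3        ▹ Insert A[j] into the sorted sequence A[1 . . j − 1]
4        i ← j − 1
5        while i > 0 and A[i] > key
6            do A[i + 1] ← A[i]
7               i ← i − 1
8        A[i + 1] ← key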
We start by presenting the INSERTION-SORT procedure with the time “cost” of each statement and the
number of times each statement is executed. For each j = 2, 3, . . . , n, where n = length[A], we let tj be the
number of times the while loop test in line 5 is executed for that value of j. When a for or while loop exits in
the usual way (i.e., due to the test in the loop header), the test is executed one time more than the loop body.
We assume that comments are not executable statements, and so they take no time.
Running time of the algorithm: The running time of the algorithm is the sum of running times for each
statement executed; a statement with cost ci that is executed n times contributes ci · n to the total.
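Assuming a cost ci for line i of the pseudocode above (the comment in line 3 takes no time), the standard
CLRS tally gives:

T(n) = c1·n + c2(n − 1) + c4(n − 1) + c5·Σ(j=2 to n) tj + c6·Σ(j=2 to n) (tj − 1)
       + c7·Σ(j=2 to n) (tj − 1) + c8(n − 1)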
Best case:
It occurs when the array is already sorted. All tj values are 1, and T(n) reduces to a linear function of the
form an + b.
Worst case:
It occurs when the array is sorted in reverse order. The while loop test then runs tj = j times for each j, and
T(n) is of the form an² + bn + c, a quadratic function. So in the worst case insertion sort grows as n².
MERGE SORT
Merge sort algorithm is a classic example of divide and conquer. To sort an array, recursively sort its left and
right halves separately and then merge them. The time complexity of merge sort in the best case, worst case,
and average case is O(n log n), and the number of comparisons used is nearly optimal.
T(n) = Θ(n lg n)
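A minimal C sketch of this divide-and-conquer scheme (illustrative, not from the original notes; the names
merge and mergeSort are my own):

#include <stdio.h>
#include <string.h>

/* Merge the two sorted halves a[lo..mid] and a[mid+1..hi]. */
static void merge(int a[], int lo, int mid, int hi) {
    int tmp[hi - lo + 1];
    int i = lo, j = mid + 1, k = 0;
    while (i <= mid && j <= hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) tmp[k++] = a[i++];
    while (j <= hi)  tmp[k++] = a[j++];
    memcpy(&a[lo], tmp, sizeof tmp);
}

/* Recursively sort a[lo..hi]: split, sort the halves, merge them. */
void mergeSort(int a[], int lo, int hi) {
    if (lo >= hi) return;              /* 0 or 1 element: already sorted */
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);
    mergeSort(a, mid + 1, hi);
    merge(a, lo, mid, hi);
}

int main(void) {
    int a[] = {5, 2, 4, 7, 1, 3, 2, 6};
    mergeSort(a, 0, 7);
    for (int i = 0; i < 8; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}

The merge step does Θ(n) work, which yields the recurrence T(n) = 2T(n/2) + Θ(n) and hence Θ(n lg n).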
QUICKSORT
Description of quicksort
Quicksort sorts a subarray A[p . . r] using the three-step divide-and-conquer paradigm:
Divide: Partition the array A[p . . r] into two (possibly empty) subarrays A[p . . q − 1] and A[q + 1 . . r]
such that each element of A[p . . q − 1] is ≤ A[q], which in turn is ≤ each element of A[q + 1 . . r].
Conquer: Sort the two subarrays A[p . . q − 1] and A[q + 1 . . r] by recursive calls to quicksort.
Combine: No work is needed to combine the subarrays, because they are sorted in place.
• Perform the divide step by a procedure PARTITION, which returns the index q that marks the position
separating the subarrays (see the C sketch below).
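A minimal C sketch of quicksort with a Lomuto-style PARTITION that picks the last element as pivot
(illustrative; the names are my own):

#include <stdio.h>

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Rearrange a[p..r] around the pivot a[r]; return the pivot's final index q. */
static int partition(int a[], int p, int r) {
    int pivot = a[r];
    int i = p - 1;
    for (int j = p; j < r; j++)
        if (a[j] <= pivot)
            swap(&a[++i], &a[j]);
    swap(&a[i + 1], &a[r]);
    return i + 1;
}

void quickSort(int a[], int p, int r) {
    if (p < r) {
        int q = partition(a, p, r);   /* divide */
        quickSort(a, p, q - 1);       /* conquer: left of pivot */
        quickSort(a, q + 1, r);       /* conquer: right of pivot */
    }                                 /* combine: nothing to do */
}

int main(void) {
    int a[] = {2, 8, 7, 1, 3, 5, 6, 4};
    quickSort(a, 0, 7);
    for (int i = 0; i < 8; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}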
Performance of quicksort
Worst-case partitioning
The worst-case behavior for quicksort occurs when the partitioning routine produces one subproblem with
n − 1 elements and one with 0 elements. The running time then satisfies the recurrence
T(n) = T(n − 1) + Θ(n), which evaluates to Θ(n²).
Best-case partitioning
In the most even possible split, PARTITION produces two subproblems, each of size no more than n/2. The
running time then satisfies T(n) ≤ 2T(n/2) + Θ(n), which gives T(n) = O(n lg n).
HEAP SORT:
A heap sort algorithm works by first organizing the data to be sorted into a special type of binary tree called a
heap. Any kind of data can be sorted either in ascending order or in descending order using a heap tree. It does
this with the following steps:
1. Build a max-heap from the input array.
2. Swap the root (the maximum element) with the last element of the heap and reduce the heap size by one.
3. Restore the max-heap property at the root (heapify), and repeat step 2 until the heap is empty.
Time Complexity: Building the heap takes O(n) time, and each of the n − 1 extractions takes O(log n) time,
so heap sort runs in O(n log n) time; a C sketch is given below.
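A minimal C sketch of heap sort using a 0-based array heap (illustrative, not from the original notes):

#include <stdio.h>

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Sift a[i] down so the subtree rooted at i obeys the max-heap property. */
static void maxHeapify(int a[], int n, int i) {
    int largest = i, l = 2 * i + 1, r = 2 * i + 2;  /* 0-based children */
    if (l < n && a[l] > a[largest]) largest = l;
    if (r < n && a[r] > a[largest]) largest = r;
    if (largest != i) {
        swap(&a[i], &a[largest]);
        maxHeapify(a, n, largest);
    }
}

void heapSort(int a[], int n) {
    /* Build a max-heap: O(n). */
    for (int i = n / 2 - 1; i >= 0; i--)
        maxHeapify(a, n, i);
    /* Repeatedly move the maximum to the end and restore the heap: O(n log n). */
    for (int i = n - 1; i > 0; i--) {
        swap(&a[0], &a[i]);
        maxHeapify(a, i, 0);
    }
}

int main(void) {
    int a[] = {4, 10, 3, 5, 1};
    heapSort(a, 5);
    for (int i = 0; i < 5; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}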
Shell Sort
Designed by Donald Shell, who named the sorting algorithm after himself, in 1959.
Shell sort works by comparing elements that are distant from each other, rather than only adjacent
elements as in an ordinary insertion sort of an array or list.
Shell sort is also known as diminishing increment sort.
Shell sort improves on the efficiency of insertion sort by quickly shifting values toward their destination.
The algorithm decreases the distance between compared elements (the gap) as the sort runs, until its
last phase, in which only adjacent elements are compared (see the sketch below).
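A minimal C sketch using Shell's original gap sequence n/2, n/4, . . . , 1 (illustrative; other gap sequences
exist and this one is an assumption, not mandated by the notes):

#include <stdio.h>

/* Each pass is a gap-insertion sort; the final pass (gap = 1) is a
   plain insertion sort running on nearly-sorted data. */
void shellSort(int a[], int n) {
    for (int gap = n / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < n; i++) {
            int key = a[i], j = i;
            while (j >= gap && a[j - gap] > key) {
                a[j] = a[j - gap];   /* shift distant elements toward place */
                j -= gap;
            }
            a[j] = key;
        }
    }
}

int main(void) {
    int a[] = {35, 33, 42, 10, 14, 19, 27, 44};
    shellSort(a, 8);
    for (int i = 0; i < 8; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}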
COUNTING SORT
Counting sort assumes that each of the n input elements is an integer in the range 0 to k, for some integer k.
When k = O(n), the sort runs in Θ(n) time.
In the code for counting sort, we assume that the input is an array A[1 . . n], and thus length[A] = n. We
require two other arrays: the array B[1 . . n] holds the sorted output, and the array C[0 . . k] provides
temporary working storage.
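A minimal C sketch (illustrative, not from the original notes; it uses 0-based arrays a[0 . . n − 1] and
b[0 . . n − 1] rather than the notes' 1-based A and B):

#include <stdio.h>
#include <string.h>

/* Counting sort of a[0..n-1] into b[0..n-1]; keys lie in 0..k. */
void countingSort(const int a[], int b[], int n, int k) {
    int count[k + 1];
    memset(count, 0, sizeof count);
    for (int i = 0; i < n; i++)        /* count[v] = number of elements equal to v */
        count[a[i]]++;
    for (int v = 1; v <= k; v++)       /* count[v] = number of elements <= v */
        count[v] += count[v - 1];
    for (int i = n - 1; i >= 0; i--)   /* place right-to-left to keep the sort stable */
        b[--count[a[i]]] = a[i];
}

int main(void) {
    int a[] = {2, 5, 3, 0, 2, 3, 0, 3}, b[8];
    countingSort(a, b, 8, 5);
    for (int i = 0; i < 8; i++) printf("%d ", b[i]);
    printf("\n");
    return 0;
}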
RADIX SORT
Radix sort is the algorithm used by the card-sorting machines you now find only in computer museums.
In a typical computer, which is a sequential random-access machine, radix sort is sometimes used to sort
records of information that are keyed by multiple fields. For example, we might wish to sort dates by three
keys: year, month, and day. We could run a sorting algorithm with a comparison function that, given two
dates, compares years, and if there is a tie, compares months, and if another tie occurs, compares days.
Alternatively, we could sort the information three times with a stable sort: first on day, next on month, and
finally on year.
The code for radix sort is straightforward. The following procedure assumes that each element in the n-
element array A has d digits, where digit 1 is the lowest-order digit and digit d is the highest-order digit.
RADIX-SORT(A, d)
1 for i ← 1 to d
2 do use a stable sort to sort array A on digit i
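For comparison, a runnable C sketch of the same LSD scheme on decimal digits (illustrative; the helper
sortByDigit, which plays the role of the stable per-digit sort, is my own):

#include <stdio.h>
#include <string.h>

/* Stable counting sort of a[0..n-1] on one decimal digit (exp = 1, 10, 100, ...). */
static void sortByDigit(int a[], int n, int exp) {
    int out[n];
    int count[10] = {0};
    for (int i = 0; i < n; i++)
        count[(a[i] / exp) % 10]++;
    for (int d = 1; d < 10; d++)
        count[d] += count[d - 1];
    for (int i = n - 1; i >= 0; i--)   /* right-to-left keeps it stable */
        out[--count[(a[i] / exp) % 10]] = a[i];
    memcpy(a, out, n * sizeof a[0]);
}

/* LSD radix sort of non-negative d-digit decimal integers. */
void radixSort(int a[], int n, int d) {
    int exp = 1;
    for (int i = 1; i <= d; i++, exp *= 10)  /* digit 1 (lowest) .. digit d */
        sortByDigit(a, n, exp);
}

int main(void) {
    int a[] = {329, 457, 657, 839, 436, 720, 355};
    radixSort(a, 7, 3);
    for (int i = 0; i < 7; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}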
BUCKET SORT
Bucket sort runs in linear time when the input is drawn from a uniform distribution. Like counting sort, bucket
sort is fast because it assumes something about the input. Whereas counting sort assumes that the input
consists of integers in a small range, bucket sort assumes that the input is generated by a random process that
distributes elements uniformly over the interval [0, 1).
Our code for bucket sort assumes that the input is an n-element array A and that each element A[i ] in the
array satisfies 0 ≤ A[i ] < 1. The code requires an auxiliary array B[0 . . n − 1] of linked lists (buckets) and
assumes that there is a mechanism for maintaining such lists.
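A minimal C sketch with n buckets of sorted linked lists (illustrative, not from the original notes; the node
type and the choice to keep each bucket sorted on insertion are my own):

#include <stdio.h>
#include <stdlib.h>

struct node { double key; struct node *next; };

/* Bucket sort of a[0..n-1], each 0 <= a[i] < 1, assuming uniform input.
   Bucket j holds keys in [j/n, (j+1)/n); each list is kept sorted on insert. */
void bucketSort(double a[], int n) {
    struct node **b = calloc(n, sizeof *b);   /* n empty buckets */
    for (int i = 0; i < n; i++) {
        int j = (int)(n * a[i]);              /* bucket index for a[i] */
        struct node **p = &b[j];
        while (*p && (*p)->key < a[i])        /* sorted insertion within bucket */
            p = &(*p)->next;
        struct node *nd = malloc(sizeof *nd);
        nd->key = a[i];
        nd->next = *p;
        *p = nd;
    }
    int k = 0;                                /* concatenate buckets 0..n-1 */
    for (int j = 0; j < n; j++)
        for (struct node *nd = b[j]; nd; nd = nd->next)
            a[k++] = nd->key;
    free(b);  /* freeing the individual nodes is omitted for brevity */
}

int main(void) {
    double a[] = {0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68};
    bucketSort(a, 10);
    for (int i = 0; i < 10; i++) printf("%.2f ", a[i]);
    printf("\n");
    return 0;
}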